<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/">
	<channel>
		<title>Author at Infinum</title>
		<atom:link href="https://infinum.com/blog/author/petar-curkovic/feed/" rel="self" type="application/rss+xml" />
		<link></link>
		<description>Building digital products</description>
		<lastBuildDate>Fri, 03 Apr 2026 12:58:20 +0000</lastBuildDate>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>

					<item>
				<image>
					<url>https://infinum.com/uploads/2017/09/superfast-csv-imports-using-postgresqls-copy-0.webp</url>
				</image>
				<title>Superfast CSV Imports Using PostgreSQL&#039;s COPY Command</title>
				<link>https://infinum.com/blog/superfast-csv-imports-using-postgresqls-copy/</link>
				<pubDate>Tue, 05 Sep 2017 13:48:00 +0000</pubDate>
				<dc:creator>Petar Ćurković</dc:creator>
				<guid isPermaLink="false">https://infinum.com/the-capsized-eight/superfast-csv-imports-using-postgresqls-copy/</guid>
				<description>
					<![CDATA[<p>Dealing with various sources of data in web applications requires us to create services that will extract information from CSV, Excel, and other file types. </p>
<p>The post <a href="https://infinum.com/blog/superfast-csv-imports-using-postgresqls-copy/">Superfast CSV Imports Using PostgreSQL&#039;s COPY Command</a> appeared first on <a href="https://infinum.com">Infinum</a>.</p>
]]>
				</description>
				<content:encoded>
					<![CDATA[<div
	class="wrapper"
	data-id="es-233"
	 data-animation-target='inner-items'>
		
			<div class="wrapper__inner">
			<div class="block-blog-content js-block-blog-content">
	
<div class="block-blog-content-sidebar" data-id="es-92">
	</div>

<div class="block-blog-content-main">
	
<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-95"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-93">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-94'
	>
	Dealing with various sources of data in web applications requires us to create services that extract information from CSV, Excel, and other file types. For that, it’s best to rely on existing libraries or, if your backend runs on Rails, on gems. Many gems offer very cool features, such as <a href="https://github.com/pcreux/csv-importer"><code>CSVImporter</code></a> and <a href="https://github.com/roo-rb/roo"><code>Roo</code></a>, but you can also use plain <a href="http://ruby-doc.org/stdlib-2.0.0/libdoc/csv/rdoc/CSV.html"><code>Ruby CSV</code></a>.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-98"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-96">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-97'
	>
	Either way, if those are small CSV files, you will get your job done easily. But what if you need to import large CSV files (~100MB / ~1M rows)?</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-101"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-heading" data-id="es-99">
	<h2	class='typography typography--size-52-default js-typography block-heading__heading'
	data-id='es-100'
	>
	The Problem</h2></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-104"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-102">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-103'
	>
	For a recent project I worked on, an external system would send a CSV file containing 200k rows every 15 minutes. The aforementioned solutions were simply not good enough: they were slow and ate up a lot of RAM.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-107"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-105">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-106'
	>
	I knew I had to find a more efficient solution.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-110"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-heading" data-id="es-108">
	<h3	class='typography typography--size-36-text js-typography block-heading__heading'
	data-id='es-109'
	>
	First Try: CSVImporter gem</h3></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-113"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-111">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-112'
	>
	Because of its simplicity, I first decided to use the <a href="https://github.com/pcreux/csv-importer"><code>CSVImporter</code></a> gem. After installing the gem, the <code>CSVImporter</code> class was defined like this:</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-115"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-code">
	<pre class="phiki language-ruby github-light" data-language="ruby" style="background-color: #fff;color: #24292e;"><code><span class="line"><span class="token" style="color: #d73a49;">class</span><span class="token"> </span><span class="token" style="color: #6f42c1;">ForecastCSVImporter</span><span class="token">
</span></span><span class="line"><span class="token">  </span><span class="token" style="color: #d73a49;">include</span><span class="token"> </span><span class="token" style="color: #005cc5;">CSVImporter</span><span class="token">
</span></span><span class="line"><span class="token">
</span></span><span class="line"><span class="token">  model </span><span class="token" style="color: #005cc5;">Forecast</span><span class="token">
</span></span><span class="line"><span class="token">
</span></span><span class="line"><span class="token">  column </span><span class="token" style="color: #005cc5;">:</span><span class="token" style="color: #005cc5;">property</span><span class="token">
</span></span><span class="line"><span class="token">  column </span><span class="token" style="color: #005cc5;">:</span><span class="token" style="color: #005cc5;">date</span><span class="token">
</span></span><span class="line"><span class="token">  column </span><span class="token" style="color: #005cc5;">:</span><span class="token" style="color: #005cc5;">value</span><span class="token">
</span></span><span class="line"><span class="token">  column </span><span class="token" style="color: #005cc5;">:</span><span class="token" style="color: #005cc5;">type</span><span class="token">
</span></span><span class="line"><span class="token">  column </span><span class="token" style="color: #005cc5;">:</span><span class="token" style="color: #005cc5;">location</span><span class="token">
</span></span><span class="line"><span class="token">  column </span><span class="token" style="color: #005cc5;">:</span><span class="token" style="color: #005cc5;">created_at</span><span class="token">
</span></span><span class="line"><span class="token" style="color: #d73a49;">end</span><span class="token">
</span></span><span class="line"><span class="token">
</span></span></code></pre></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-118"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-116">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-117'
	>
	Executing an import for a specific file was done like this:</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-120"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-code">
	<pre class="phiki language-ruby github-light" data-language="ruby" style="background-color: #fff;color: #24292e;"><code><span class="line"><span class="token" style="color: #005cc5;">ForecastCSVImporter</span><span class="token">.</span><span class="token" style="color: #d73a49;">new</span><span class="token">(</span><span class="token" style="color: #005cc5;">path</span><span class="token" style="color: #005cc5;">:</span><span class="token"> file_path</span><span class="token">)</span><span class="token">.</span><span class="token" style="color: #6f42c1;">run!</span><span class="token">
</span></span><span class="line"><span class="token">
</span></span></code></pre></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-123"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-121">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-122'
	>
	The implementation was simple, and it worked really well on a test CSV file. When I tried it in production, however, it was too slow: the import took <strong>more than 2 hours</strong>. The main problem was that each CSV row had to be converted into an <code>ActiveRecord</code> model and saved with its own <code>#create</code> call.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-126"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-124">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-125'
	>
	I tried many different gems and also wrote a custom CSV importer using plain <code>Ruby CSV</code> with batch insert commands. The improvement was noticeable, with the import time dropping to around 45 minutes, but that was still too slow. I felt I still wasn’t on the right track.</p></div>	</div>
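
<div
	class="wrapper wrapper__use-simple--true">
			<div class="block-paragraph">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'>
	A batching importer of this kind might look roughly like the sketch below. It is not the importer from the project: the <code>each_batch</code> helper, the sample data, and the tiny batch size are assumptions made for illustration. It reads the CSV lazily with plain <code>Ruby CSV</code> and groups the rows so that each group could be written with one multi-row insert instead of one <code>#create</code> per row.</p></div>	</div>

```ruby
require 'csv'

# Hypothetical batch size, kept tiny so the example output stays readable;
# a real import would more likely use thousands of rows per batch.
BATCH_SIZE = 2

# Reads CSV rows lazily from a String or IO and yields them in groups of
# batch_size, each row converted to a Hash keyed by column header.
def each_batch(csv_source, batch_size = BATCH_SIZE)
  return enum_for(:each_batch, csv_source, batch_size) unless block_given?

  CSV.new(csv_source, headers: true).each_slice(batch_size) do |rows|
    yield rows.map(&:to_h)
  end
end

# Made-up sample data; the column names are borrowed from ForecastCSVImporter.
sample = [
  'property,date,value',
  'temperature,2017-09-05,21.5',
  'humidity,2017-09-05,60',
  'wind,2017-09-05,12'
].join("\n")

each_batch(sample) do |batch|
  # In a Rails app, each batch would be handed to one bulk insert here
  # (for instance a single multi-row INSERT statement).
  puts "#{batch.size} row(s): #{batch.map { |row| row['property'] }.join(', ')}"
end
```

<div
	class="wrapper wrapper__use-simple--true">
			<div class="block-paragraph">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'>
	Batching cuts the number of round trips to the database, but every row still passes through Ruby's CSV parser and object allocation, which may be one reason the gain stays limited.</p></div>	</div>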

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-129"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-media">
	<div	class="media block-media__media media__border--none media__align--center-center"
	data-id="es-127"
	 data-media-type='image'>

	<figure class="image block-media__image-figure image--size-stretch" data-id="es-128">
	<picture class="image__picture block-media__image-picture">
								
			<source
				srcset=https://infinum.com/uploads/2017/09/superfast-csv-imports-using-postgresqls-copy-1-1400x840.webp				media='(max-width: 699px)'
				type=image/webp								height="840"
												width="1400"
				 />
												<img
					src="https://infinum.com/uploads/2017/09/superfast-csv-imports-using-postgresqls-copy-1.webp"
					class="image__img block-media__image-img"
					alt=""
										height="846"
															width="1410"
										loading="lazy"
					 />
					</picture>

	</figure></div></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-132"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-heading" data-id="es-130">
	<h2	class='typography typography--size-52-default js-typography block-heading__heading'
	data-id='es-131'
	>
	Solution – Use COPY</h2></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-135"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-133">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-134'
	>
	After all of these attempts, I finally gave up on Ruby solutions and looked for help from my database.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-138"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-136">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-137'
	>
	I found that PostgreSQL has a really powerful yet very simple command called <a href="https://www.postgresql.org/docs/9.6/static/sql-copy.html"><code>COPY</code></a>, which copies data between a file and a database table.<br>It works in both directions:</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-141"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="lists" data-id="es-139">
	<ul	class='typography typography--size-16-text-roman js-typography lists__typography'
	data-id='es-140'
	>
	<li><strong>to import data from a CSV file into a database table</strong></li><li><strong>to export data from a database table to a CSV file</strong></li></ul></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-144"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-142">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-143'
	>
	Example of usage:</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-146"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-code">
	<pre class="phiki language-sql github-light" data-language="sql" style="background-color: #fff;color: #24292e;"><code><span class="line"><span class="token" style="color: #d73a49;">COPY</span><span class="token"> forecasts
</span></span><span class="line"><span class="token" style="color: #d73a49;">FROM</span><span class="token"> </span><span class="token" style="color: #032f62;">&#039;/tmp/forecast.csv&#039;</span><span class="token">
</span></span><span class="line"><span class="token">CSV HEADER;
</span></span><span class="line"><span class="token">
</span></span></code></pre></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-149"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-147">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-148'
	>
	This SQL statement imports the contents of a CSV file into our <code>forecasts</code> table. Note one thing: it assumes that <strong>the number and order of columns in the table are the same as in the CSV file</strong>.</p></div>	</div>
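
<div
	class="wrapper wrapper__use-simple--true">
			<div class="block-paragraph">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'>
	Since this assumption is easy to break, it can be worth checking the CSV header before running <code>COPY</code>. Here is a minimal sketch, assuming the table's column order is known up front. <code>TABLE_COLUMNS</code> and <code>header_matches_table?</code> are hypothetical names, and the column list is borrowed from the importer example above rather than from a real schema.</p></div>	</div>

```ruby
require 'csv'

# Hypothetical column order of the forecasts table. In a Rails app this
# list could be derived from the model instead of being hard-coded.
TABLE_COLUMNS = %w[property date value type location created_at].freeze

# Returns true when the first line of the CSV lists exactly the table's
# columns, in the same order.
def header_matches_table?(csv_text, table_columns = TABLE_COLUMNS)
  header = CSV.parse_line(csv_text) || []
  header.map { |name| name.to_s.strip } == table_columns
end

# Made-up sample files: the second one has its first two columns swapped.
good_csv = "property,date,value,type,location,created_at\n" \
           "temperature,2017-09-05,21.5,hourly,Zagreb,2017-09-05 13:48:00\n"
bad_csv  = "date,property,value,type,location,created_at\n" \
           "2017-09-05,temperature,21.5,hourly,Zagreb,2017-09-05 13:48:00\n"

puts header_matches_table?(good_csv)
puts header_matches_table?(bad_csv)
```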

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-152"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-heading" data-id="es-150">
	<h3	class='typography typography--size-36-text js-typography block-heading__heading'
	data-id='es-151'
	>
	Results</h3></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-155"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-153">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-154'
	>
	Importing a CSV file with ~1M rows now takes <strong>under 4 seconds</strong>, which is blazing fast compared to the previous solutions!</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-158"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-heading" data-id="es-156">
	<h3	class='typography typography--size-36-text js-typography block-heading__heading'
	data-id='es-157'
	>
	Library Support</h3></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-161"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-159">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-160'
	>
	Usually, when writing application code, it’s good to avoid writing raw SQL. By default, <code>ActiveRecord</code> doesn’t support the <code>COPY</code> command but there is a gem which takes care of that. It’s called <a href="https://github.com/diogob/postgres-copy"><code>postgres-copy</code></a>. The gem provides a simple interface for copying data between a database table and a CSV file.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-164"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-162">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-163'
	>
	Let’s see an example:</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-166"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-code">
	<pre class="phiki language-ruby github-light" data-language="ruby" style="background-color: #fff;color: #24292e;"><code><span class="line"><span class="token" style="color: #6a737d;">#</span><span class="token" style="color: #6a737d;"> Enable COPY command on Forecast model</span><span class="token" style="color: #6a737d;">
</span></span><span class="line"><span class="token" style="color: #d73a49;">class</span><span class="token"> </span><span class="token" style="color: #6f42c1;">Forecast</span><span class="token"> </span><span class="token">&lt;</span><span class="token"> </span><span class="token" style="color: #6f42c1;">ActiveRecord</span><span class="token" style="color: #6f42c1;">::</span><span class="token" style="color: #6f42c1;">Base</span><span class="token">
</span></span><span class="line"><span class="token">  acts_as_copy_target
</span></span><span class="line"><span class="token" style="color: #d73a49;">end</span><span class="token">
</span></span><span class="line"><span class="token">
</span></span><span class="line"><span class="token" style="color: #6a737d;">#</span><span class="token" style="color: #6a737d;"> Run export of table data to a file</span><span class="token" style="color: #6a737d;">
</span></span><span class="line"><span class="token" style="color: #005cc5;">Forecast</span><span class="token">.</span><span class="token" style="color: #6f42c1;">copy_to</span><span class="token"> </span><span class="token" style="color: #032f62;">&#039;/tmp/forecast.csv&#039;</span><span class="token">
</span></span><span class="line"><span class="token">
</span></span><span class="line"><span class="token" style="color: #6a737d;">#</span><span class="token" style="color: #6a737d;"> Run import from a CSV file to database</span><span class="token" style="color: #6a737d;">
</span></span><span class="line"><span class="token" style="color: #005cc5;">Forecast</span><span class="token">.</span><span class="token" style="color: #6f42c1;">copy_from</span><span class="token"> </span><span class="token" style="color: #032f62;">&#039;/tmp/forecast.csv&#039;</span><span class="token">
</span></span><span class="line"><span class="token">
</span></span></code></pre></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-169"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-167">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-168'
	>
	The previous calls correspond to the following SQL commands:</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-171"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-code">
	<pre class="phiki language-sql github-light" data-language="sql" style="background-color: #fff;color: #24292e;"><code><span class="line"><span class="token" style="color: #6a737d;">--</span><span class="token" style="color: #6a737d;"> Forecast.copy_to &#039;/tmp/forecast.csv&#039;</span><span class="token" style="color: #6a737d;">
</span></span><span class="line"><span class="token" style="color: #d73a49;">COPY</span><span class="token"> (</span><span class="token" style="color: #d73a49;">SELECT</span><span class="token"> </span><span class="token" style="color: #032f62;">&quot;</span><span class="token" style="color: #032f62;">forecasts&quot;</span><span class="token">.</span><span class="token" style="color: #d73a49;">*</span><span class="token"> </span><span class="token" style="color: #d73a49;">FROM</span><span class="token"> </span><span class="token" style="color: #032f62;">&quot;</span><span class="token" style="color: #032f62;">forecasts&quot;</span><span class="token">)
</span></span><span class="line"><span class="token" style="color: #d73a49;">TO</span><span class="token"> </span><span class="token" style="color: #032f62;">&#039;/tmp/forecast.csv&#039;</span><span class="token">
</span></span><span class="line"><span class="token" style="color: #d73a49;">WITH</span><span class="token"> DELIMITER </span><span class="token" style="color: #032f62;">&#039;;&#039;</span><span class="token"> CSV HEADER
</span></span><span class="line"><span class="token">
</span></span><span class="line"><span class="token" style="color: #6a737d;">--</span><span class="token" style="color: #6a737d;"> Forecast.copy_from &#039;/tmp/forecast.csv&#039;</span><span class="token" style="color: #6a737d;">
</span></span><span class="line"><span class="token" style="color: #d73a49;">COPY</span><span class="token"> forecasts
</span></span><span class="line"><span class="token" style="color: #d73a49;">FROM</span><span class="token"> </span><span class="token" style="color: #032f62;">&#039;/tmp/forecast.csv&#039;</span><span class="token">
</span></span><span class="line"><span class="token">
</span></span></code></pre></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-174"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-heading" data-id="es-172">
	<h3	class='typography typography--size-36-text js-typography block-heading__heading'
	data-id='es-173'
	>
	Data Manipulation</h3></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-177"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-175">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-176'
	>
	The <code>COPY</code> command is simple and super fast. However, it has restrictions in some advanced scenarios when importing from CSV:</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-180"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="lists" data-id="es-178">
	<ul	class='typography typography--size-16-text-roman js-typography lists__typography'
	data-id='es-179'
	>
	<li>you must use all of the columns from a CSV file</li><li>problems arise if you want to manipulate the data before it is inserted into the database table.</li></ul></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-183"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-181">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-182'
	>
	You can specify <em>mappings</em> between CSV columns and table columns. That means you can have different orders of attributes in a CSV file and the database table, but the table must use all of the columns from a CSV file.</p></div>	</div>
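
<div
	class="wrapper wrapper__use-simple--true">
			<div class="block-paragraph">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'>
	In <code>COPY</code>, such a mapping is written as a column list after the table name. The sketch below derives that list from the CSV header line, so the statement follows the file's column order. The <code>copy_statement_for</code> helper is hypothetical, and it assumes that the header names are exactly the table's column names and that the table name comes from trusted code.</p></div>	</div>

```ruby
require 'csv'

# Quotes a column name as a SQL identifier, doubling embedded double quotes.
def quote_identifier(name)
  '"' + name.to_s.strip.gsub('"', '""') + '"'
end

# Builds a COPY statement whose column list follows the CSV header order.
# Assumption: every header name is also a column name of the target table.
def copy_statement_for(table, csv_path, csv_header_line)
  columns = CSV.parse_line(csv_header_line).map { |name| quote_identifier(name) }
  quoted_path = "'" + csv_path.gsub("'", "''") + "'"
  "COPY #{table} (#{columns.join(', ')}) FROM #{quoted_path} WITH (FORMAT csv, HEADER true)"
end

# Made-up header whose order differs from the importer example above.
puts copy_statement_for('forecasts', '/tmp/forecast.csv', 'date,value,location')
```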

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-186"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-184">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-185'
	>
	These problems are common, but as always, there are workarounds.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-189"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-187">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-188'
	>
	The first one is to <strong>create a temporary table</strong> where you would import the original data from a CSV file. After the <code>COPY</code> command inserts all the data from the CSV file, you can perform a custom <code>INSERT</code> command to transfer data from the temporary table to your original table. Within the <code>INSERT</code> command, you can easily perform data manipulation.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-192"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-190">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-191'
	>
	Let’s see an example:</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-194"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-code">
	<pre class="phiki language-sql github-light" data-language="sql" style="background-color: #fff;color: #24292e;"><code><span class="line"><span class="token" style="color: #d73a49;">COPY</span><span class="token"> forecasts_import
</span></span><span class="line"><span class="token" style="color: #d73a49;">FROM</span><span class="token"> </span><span class="token" style="color: #032f62;">&#039;/tmp/forecast.csv&#039;</span><span class="token">;
</span></span><span class="line"><span class="token">
</span></span><span class="line"><span class="token" style="color: #d73a49;">INSERT INTO</span><span class="token"> forecasts
</span></span><span class="line"><span class="token" style="color: #d73a49;">SELECT</span><span class="token"> location_id::</span><span class="token" style="color: #d73a49;">int</span><span class="token">, </span><span class="token" style="color: #d73a49;">value</span><span class="token">, forecast_type, </span><span class="token" style="color: #d73a49;">DATE</span><span class="token">(created_at)
</span></span><span class="line"><span class="token" style="color: #d73a49;">FROM</span><span class="token"> forecasts_import;
</span></span><span class="line"><span class="token">
</span></span><span class="line"><span class="token" style="color: #d73a49;">DELETE</span><span class="token"> </span><span class="token" style="color: #d73a49;">FROM</span><span class="token"> forecasts_import;
</span></span><span class="line"><span class="token">
</span></span></code></pre></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-197"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-195">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-196'
	>
	A <strong>second approach</strong> for cases where the attributes in the CSV file and in our table don’t match is to read the data from standard input and manipulate it in our application:</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-199"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-code">
	<pre class="phiki language-sql github-light" data-language="sql" style="background-color: #fff;color: #24292e;"><code><span class="line"><span class="token" style="color: #d73a49;">COPY</span><span class="token"> forecasts
</span></span><span class="line"><span class="token" style="color: #d73a49;">FROM</span><span class="token"> STDIN;
</span></span><span class="line"><span class="token">
</span></span></code></pre></div>	</div>
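
<div
	class="wrapper wrapper__use-simple--true">
			<div class="block-paragraph">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'>
	With <code>FROM STDIN</code> and no format options, PostgreSQL expects <code>COPY</code>'s default text format: one row per line, fields separated by tabs, <code>\N</code> for <code>NULL</code>, and backslash escapes for special characters. The following sketch encodes a row by hand purely to illustrate that format. <code>copy_text_field</code> and <code>copy_text_row</code> are hypothetical helpers, and in practice the <code>pg</code> gem's <code>PG::TextEncoder::CopyRow</code> can do this encoding for you.</p></div>	</div>

```ruby
# Backslash escapes used by COPY's text format for characters that would
# otherwise be read as field or row separators.
COPY_ESCAPES = {
  '\\' => '\\\\', # backslash becomes two backslashes
  "\t" => '\t',   # tab becomes backslash + t
  "\n" => '\n',   # newline becomes backslash + n
  "\r" => '\r'    # carriage return becomes backslash + r
}.freeze

# Encodes a single value; nil is written as \N, COPY's default NULL marker.
def copy_text_field(value)
  return '\N' if value.nil?

  value.to_s.gsub(/[\\\t\n\r]/) { |char| COPY_ESCAPES.fetch(char) }
end

# Encodes one row: tab-separated fields terminated by a newline.
def copy_text_row(fields)
  fields.map { |field| copy_text_field(field) }.join("\t") + "\n"
end

# Made-up row containing a nil and a value with an embedded tab.
print copy_text_row(['Zagreb', nil, "hourly\tforecast", 21.5])
```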

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-202"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-200">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-201'
	>
	In that case, you need to manually read the data, row by row, from a CSV file (e.g. by using plain <code>Ruby CSV</code>) and send the data to <code>STDIN</code> of your database connection.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-204"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-code">
	<pre class="phiki language-ruby github-light" data-language="ruby" style="background-color: #fff;color: #24292e;"><code><span class="line"><span class="token" style="color: #e36209;">db_conn</span><span class="token"> </span><span class="token" style="color: #d73a49;">=</span><span class="token"> </span><span class="token" style="color: #005cc5;">ActiveRecord</span><span class="token">::</span><span class="token" style="color: #005cc5;">Base</span><span class="token">.</span><span class="token" style="color: #6f42c1;">connection</span><span class="token">.</span><span class="token" style="color: #6f42c1;">raw_connection</span><span class="token">
</span></span><span class="line"><span class="token" style="color: #e36209;">copy_statement</span><span class="token"> </span><span class="token" style="color: #d73a49;">=</span><span class="token"> </span><span class="token" style="color: #032f62;">&#039;COPY forecasts FROM STDIN&#039;</span><span class="token">
</span></span><span class="line"><span class="token" style="color: #e36209;">file_path</span><span class="token"> </span><span class="token" style="color: #d73a49;">=</span><span class="token"> </span><span class="token" style="color: #032f62;">&#039;/tmp/forecast.csv&#039;</span><span class="token">
</span></span><span class="line"><span class="token">
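</span></span><span class="line"><span class="token" style="color: #6a737d;"># The CopyRow encoder turns each array into a text-format COPY row; without an encoder, put_copy_data expects a String</span><span class="token">
</span></span><span class="line"><span class="token">db_conn</span><span class="token">.</span><span class="token" style="color: #6f42c1;">copy_data</span><span class="token">(</span><span class="token">copy_statement</span><span class="token">,</span><span class="token"> </span><span class="token" style="color: #005cc5;">PG</span><span class="token">::</span><span class="token" style="color: #005cc5;">TextEncoder</span><span class="token">::</span><span class="token" style="color: #005cc5;">CopyRow</span><span class="token">.</span><span class="token" style="color: #6f42c1;">new</span><span class="token">)</span><span class="token"> </span><span class="token" style="color: #d73a49;">do</span><span class="token">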
</span></span><span class="line"><span class="token">  </span><span class="token" style="color: #005cc5;">CSV</span><span class="token">.</span><span class="token" style="color: #6f42c1;">foreach</span><span class="token">(</span><span class="token">file_path</span><span class="token">,</span><span class="token"> </span><span class="token" style="color: #005cc5;">headers</span><span class="token" style="color: #005cc5;">:</span><span class="token"> </span><span class="token" style="color: #005cc5;">true</span><span class="token">)</span><span class="token"> </span><span class="token" style="color: #d73a49;">do</span><span class="token"> </span><span class="token" style="color: #d73a49;">|</span><span class="token">row</span><span class="token" style="color: #d73a49;">|</span><span class="token">
</span></span><span class="line"><span class="token">    db_conn</span><span class="token">.</span><span class="token" style="color: #6f42c1;">put_copy_data</span><span class="token">(</span><span class="token">row</span><span class="token">.</span><span class="token" style="color: #6f42c1;">fields</span><span class="token"> </span><span class="token" style="color: #d73a49;">+</span><span class="token"> </span><span class="token">[</span><span class="token" style="color: #005cc5;">Time</span><span class="token">.</span><span class="token" style="color: #6f42c1;">zone</span><span class="token">.</span><span class="token" style="color: #6f42c1;">now</span><span class="token">]</span><span class="token">)</span><span class="token">
</span></span><span class="line"><span class="token">  </span><span class="token" style="color: #d73a49;">end</span><span class="token">
</span></span><span class="line"><span class="token" style="color: #d73a49;">end</span><span class="token">
</span></span><span class="line"><span class="token">
</span></span></code></pre></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-207"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-205">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-206'
	>
	<strong>Although the import process goes through Ruby, there is no overhead of instantiating <code>ActiveRecord</code> objects and performing validations.</strong> This is a bit slower than letting the database read the file directly, but it’s still very fast.</p></div>	</div>
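
<div
	class="wrapper wrapper__use-simple--true"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	>
	To make “sending the data to <code>STDIN</code>” more concrete, here is a minimal sketch of what a single row looks like in the default text format of <code>COPY ... FROM STDIN</code>: tab-separated fields, <code>\N</code> for <code>NULL</code>, and a trailing newline. The <code>copy_text_line</code> helper, the sample CSV, and the fixed timestamp are assumptions made up for illustration; in a real import, the <code>pg</code> gem’s <code>PG::TextEncoder::CopyRow</code> can do this encoding for you.</p></div>	</div>

```ruby
require 'csv'

# Hypothetical helper (not part of the pg gem): formats one row as a line in
# PostgreSQL's COPY text format, i.e. tab-separated values ending in a newline.
def copy_text_line(fields)
  escaped = fields.map do |field|
    next '\N' if field.nil? # COPY's default marker for NULL

    # Backslash-escape the characters that are special in the COPY text format.
    field.to_s.gsub(/[\\\t\n\r]/, "\\" => '\\\\', "\t" => '\t', "\n" => '\n', "\r" => '\r')
  end
  escaped.join("\t") + "\n"
end

# Made-up sample data standing in for the CSV file; the second row has an empty value.
sample_csv = "city,temperature\nZagreb,21\nSplit,\n"
imported_at = '2017-09-05 13:48:00' # fixed timestamp, assumed for the example

lines = CSV.parse(sample_csv, headers: true).map do |row|
  copy_text_line(row.fields + [imported_at])
end

lines.each { |line| print line }
```

<div
	class="wrapper wrapper__use-simple--true"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	>
	Each resulting string is the kind of value that could be passed to <code>put_copy_data</code> inside the <code>copy_data</code> block.</p></div>	</div>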

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-210"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-heading" data-id="es-208">
	<h2	class='typography typography--size-52-default js-typography block-heading__heading'
	data-id='es-209'
	>
	Can I use this if I’m not on PostgreSQL?</h2></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-213"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-211">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-212'
	>
	<em>PostgreSQL</em> is not the only option: <strong>other databases also support</strong> native CSV importing.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-216"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-214">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-215'
	>
	For example, <em>Microsoft SQL Server</em> has the <a href="https://docs.microsoft.com/en-us/sql/t-sql/statements/bulk-insert-transact-sql"><code>BULK INSERT</code></a> SQL command, which is pretty similar to PostgreSQL’s <code>COPY</code> command, while <em>Oracle</em> ships a command-line tool called SQL*Loader (<a href="https://docs.oracle.com/cd/B19306_01/server.102/b14215/ldr_params.htm"><code>sqlldr</code></a>). <em>MySQL</em> also supports CSV file imports, either with the <a href="https://dev.mysql.com/doc/refman/5.7/en/load-data.html"><code>LOAD DATA INFILE</code></a> command or with the <strong>mysqlimport</strong> utility.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-219"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-media">
	<div	class="media block-media__media media__border--none media__align--center-center"
	data-id="es-217"
	 data-media-type='image'>

	<figure class="image block-media__image-figure image--size-stretch" data-id="es-218">
	<picture class="image__picture block-media__image-picture">
								
			<source
				srcset=https://infinum.com/uploads/2017/09/superfast-csv-imports-using-postgresqls-copy-2-1400x840.webp				media='(max-width: 699px)'
				type=image/webp								height="840"
												width="1400"
				 />
												<img
					src="https://infinum.com/uploads/2017/09/superfast-csv-imports-using-postgresqls-copy-2.webp"
					class="image__img block-media__image-img"
					alt=""
										height="846"
															width="1410"
										loading="lazy"
					 />
					</picture>

	</figure></div></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-222"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-heading" data-id="es-220">
	<h2	class='typography typography--size-52-default js-typography block-heading__heading'
	data-id='es-221'
	>
	Conclusion</h2></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-225"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-223">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-224'
	>
	The database’s native CSV import will give you better performance than a plain CSV parser combined with many <code>INSERT</code> commands. Keep in mind that you are skipping application validations, so data integrity has to be enforced by the database itself, e.g. with <code>NOT NULL</code>, unique, and foreign key constraints.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-228"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-226">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-227'
	>
	If you’re working with small CSV files, existing libraries are usually the best solution because of their simplicity, especially if you need to do data manipulation before persisting it to the database.</p></div>	</div>

<div
	class="wrapper wrapper__use-simple--true"
	data-id="es-231"
	 data-animation='slideFade' data-animation-target='inner-items'>
		
			<div class="block-paragraph" data-id="es-229">
	<p	class='typography typography--size-16-text-roman js-typography block-paragraph__paragraph'
	data-id='es-230'
	>
	In all other cases, go ahead and use the database tools because they are really powerful and easy to use.</p></div>	</div>
</div>
</div>		</div>
	</div>
]]>
				</content:encoded>
			</item>
		
	</channel>
</rss>