Jory Stone's Place
Entries tagged as software
Quicksearch
Archives
Links
Entries tagged as software
Related tags
postgresqlPostgreSQL - Out of Memory with large INSERT
I’ve been working with moving large databases in PostgreSQL and have found that PostgreSQL is a true memory hog when inserting millions of rows in one statement. If the target table has a number of foreign key references PostgreSQL seems to choke horribly and use insane amounts of memory, 1GB per million rows in my original case.
We have a rather simple table that is composed of six integer columns, two timestamps, and one boolean. I first ran into this issue when inserting with a select statement that had a simple join and where clause. Initially I thought the select statement was the one causing the out of memory issues. But I found running the select alone worked and changing the statement to a select into piped the data into a new table without any issues.
I created a simple test case to reproduce this with the following schema,
Running this INSERT / SELECT below can cause the PostgreSQL to keep expanding in memory usage. On a 32-bit machine it aborts due to an out of memory error around 2GB, but on a 64-bit Linux machine it keeps using memory until all of the main memory and swap is full and then the oom-killer process is spawned and starts killing processes.
I've tried this on PostgreSQL 8.3.7 under Linux and I found the same behavior in PostgreSQL 8.4 Beta for Windows.
A simple workaround I found is to simply drop the foreign keys before the insert, and re-add the foreign keys after the data is loaded into the table.
We have a rather simple table that is composed of six integer columns, two timestamps, and one boolean. I first ran into this issue when inserting with a select statement that had a simple join and where clause. Initially I thought the select statement was the one causing the out of memory issues. But I found running the select alone worked and changing the statement to a select into piped the data into a new table without any issues.
I created a simple test case to reproduce this with the following schema,
CREATE TABLE table1 (
table1_id integer NOT NULL,
name character varying NOT NULL,
PRIMARY KEY(table1_id)
);
CREATE TABLE table2 (
table2_id integer NOT NULL,
name character varying NOT NULL,
PRIMARY KEY(table2_id)
);
CREATE TABLE table3 (
table3_id integer NOT NULL,
name character varying NOT NULL,
PRIMARY KEY(table3_id)
);
CREATE TABLE table4 (
table4_id integer NOT NULL,
name character varying NOT NULL,
PRIMARY KEY(table4_id)
);
CREATE TABLE test_target (
table1_id integer NOT NULL,
table2_id integer NOT NULL,
visible boolean NOT NULL,
date_added timestamp with time zone NOT NULL,
date_updated timestamp with time zone NOT NULL,
table3_id1 integer NOT NULL,
table3_id2 integer NOT NULL,
table4_id1 integer NOT NULL,
table4_id2 integer NOT NULL,
FOREIGN KEY (table1_id) REFERENCES table1(table1_id),
FOREIGN KEY (table2_id) REFERENCES table2(table2_id),
FOREIGN KEY (table3_id1) REFERENCES table3(table3_id),
FOREIGN KEY (table3_id2) REFERENCES table3(table3_id),
FOREIGN KEY (table4_id1) REFERENCES table4(table4_id),
FOREIGN KEY (table4_id2) REFERENCES table4(table4_id)
);
Running this INSERT / SELECT below can cause the PostgreSQL to keep expanding in memory usage. On a 32-bit machine it aborts due to an out of memory error around 2GB, but on a 64-bit Linux machine it keeps using memory until all of the main memory and swap is full and then the oom-killer process is spawned and starts killing processes.
INSERT INTO test_target
(
table1_id, table2_id, visible,
date_added, date_updated,
table3_id1, table3_id2,
table4_id1, table4_id2
)
SELECT
0 as table1_id, 0 as table2_id,
TRUE as visible,
now() as date_added,
now() as date_updated,
0 as table3_id1, 0 as table3_id2,
0 as table4_id1, 0 as table4_id2
FROM generate_series(1, 13000000);
I've tried this on PostgreSQL 8.3.7 under Linux and I found the same behavior in PostgreSQL 8.4 Beta for Windows.
A simple workaround I found is to simply drop the foreign keys before the insert, and re-add the foreign keys after the data is loaded into the table.
Posted by Jory Stone on Saturday, May 30. 2009 at 18:34 in Software
no comments yet, be the first! Trackbacks (0)
no comments yet, be the first! Trackbacks (0)
Defined tags for this entry: postgresql, software
XChat Welcome Back Plugin 0.5 Win32
I compiled a version of the Welcome Back Plugin for XChat for Windows users.
Welcome Back Plugin 0.5 Win32 Build (07/25/2005, 4KB)
Homepage: http://tuxserve.sourceforge.net/wb.php
Welcome Back Plugin 0.5 Win32 Build (07/25/2005, 4KB)
Homepage: http://tuxserve.sourceforge.net/wb.php
Posted by Jory Stone on Monday, July 25. 2005 at 23:08 in Software
no comments yet, be the first! Trackbacks (0)
no comments yet, be the first! Trackbacks (0)
Defined tags for this entry: Software
Google Desktop Search
It sure has been a long time since I wrote an entry. A lot has happened, going to college, upgraded to a A64 2800+ system.
Google Desktop Search is one of the coolest things I've seen.
Simply install the tiny GDS program and let it index your hard drive. Then you can search for anything instantly.
The only downside someone might have is the index size, on my system with a 120GB and 80GB it totals about 1GB, which isn't bad IMHO.
I had written a plugin that allowed GDS to scan and index Matroska files a few days ago. It uses Haali's Matroska Parser library with HTML formated content for indexing. Today I packaged it up and submitted it to the official Google Desktop Search Plugin page.
GDCMatroska 1.0
Installer Package
Bare DLL
Google Desktop Search is one of the coolest things I've seen.
Simply install the tiny GDS program and let it index your hard drive. Then you can search for anything instantly.
The only downside someone might have is the index size, on my system with a 120GB and 80GB it totals about 1GB, which isn't bad IMHO.
I had written a plugin that allowed GDS to scan and index Matroska files a few days ago. It uses Haali's Matroska Parser library with HTML formated content for indexing. Today I packaged it up and submitted it to the official Google Desktop Search Plugin page.
GDCMatroska 1.0
Installer Package
Bare DLL
Posted by Jory Stone on Saturday, April 9. 2005 at 00:00 in Software
no comments yet, be the first! Trackbacks (0)
no comments yet, be the first! Trackbacks (0)
Defined tags for this entry: Software
MatroskaProp 2.8 Released
An updated version of the Matroska Shell Extension for Windows Explorer.
MatroskaProp 2.8
Installer
Source Code
MatroskaProp 2.8
Installer
Source Code
Posted by Jory Stone on Saturday, April 2. 2005 at 00:00 in Software
1 Comment Trackbacks (0)
1 Comment Trackbacks (0)
Defined tags for this entry: Software
foo_wmaenc v1.2 for Foobar2000
Here is a diskwriter plugin I wrote for foobar2000 v0.8.2 that writes WMA (Windows Media Audio) files.
foo_wmaenc v1.2
Binary
Source Code
foo_wmaenc v1.2
Binary
Source Code
Posted by Jory Stone on Wednesday, September 29. 2004 at 00:00 in Software
no comments yet, be the first! Trackbacks (0)
no comments yet, be the first! Trackbacks (0)
Defined tags for this entry: Software
FLAC ACM
CoreFLAC ACM v0.1
An ACM codec for FLAC. Using libflac 1.1.0. Compression level is set to the default -5 level. This codec is more or less a hack as the ACM API doesn't provide good support for VBR audio codecs.
flac2wav2flac v0.1
A command-line program to convert .wav files compression with the FLAC ACM to .flac and convert .flac to FLAC .wav's.
An ACM codec for FLAC. Using libflac 1.1.0. Compression level is set to the default -5 level. This codec is more or less a hack as the ACM API doesn't provide good support for VBR audio codecs.
flac2wav2flac v0.1
A command-line program to convert .wav files compression with the FLAC ACM to .flac and convert .flac to FLAC .wav's.
Posted by Jory Stone on Saturday, September 11. 2004 at 00:00 in Software
no comments yet, be the first! Trackbacks (0)
no comments yet, be the first! Trackbacks (0)
Defined tags for this entry: Software
Normalize2 v0.1 for AviSynth 2.5
Normalize2 : Audio Normalizing Plugin for AviSynth 2.5
The main difference between this plugin and the built-in normalizing plugin Normalize() is that this plugin can store the peak level value in an external file and uses a lookup table to do the actual normalizing, which is faster.
Normalize2 v0.1
Binary and Source Code
The main difference between this plugin and the built-in normalizing plugin Normalize() is that this plugin can store the peak level value in an external file and uses a lookup table to do the actual normalizing, which is faster.
Normalize2 v0.1
Binary and Source Code
Posted by Jory Stone on Monday, December 15. 2003 at 00:00 in Software
no comments yet, be the first! Trackbacks (0)
no comments yet, be the first! Trackbacks (0)
Defined tags for this entry: Software
(Page 1 of 1, totaling 7 entries)