PostgreSQL COPY and CSV data with double quotes


Example CSV line:

"2012","Test User","ABC","First","71.0","","","0","0","3","3","0","0","","0","","","","","0.1","","4.0","0.1","4.2","80.8","847"

All values after "First" go into numeric columns. Lots of them are NULL, and the file just writes those as quoted empty strings ("").

Attempt at COPY:

copy mytable from 'myfile.csv' with csv header quote '"';

NOPE: ERROR: invalid input syntax for type numeric: ""

Well, yeah. It's a null value. Attempt 2 at COPY:

copy mytable from 'myfile.csv' with csv header quote '"' null '""';

NOPE: ERROR: CSV quote character must not appear in the NULL specification

What's a fella to do? Strip out all double quotes from the file before running COPY? Can do that, but I figured there's a proper solution to what must be an incredibly common problem.

Best Solution

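While some database products treat an empty string as a NULL value, the SQL standard says that the two are distinct, and PostgreSQL treats them as distinct. That is why a quoted empty string ("") in the CSV is read as an empty string rather than NULL, and why it cannot be converted to numeric.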
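The distinction can be checked directly in psql. This is a minimal sketch using only literal values, so no tables are assumed:

```sql
-- An empty string is a real value, not NULL.
SELECT '' IS NULL AS empty_is_null;   -- false in PostgreSQL

-- Casting an empty string to numeric fails, which is the error COPY reported.
-- Uncomment to reproduce:
-- SELECT ''::numeric;   -- ERROR: invalid input syntax for type numeric: ""

-- A real NULL casts without complaint.
SELECT NULL::numeric IS NULL AS null_is_null;   -- true
```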

It would be best if you could generate your CSV file with an unambiguous representation of NULL. You could use sed or a similar tool to filter the file into a suitable format; the other option is to COPY the data into a staging table whose columns are all text, so they accept the empty strings, and then populate the target table from it. The NULLIF function may help with that: it returns NULL if both arguments match and the first argument if they don't. So something like NULLIF(txtcol, '')::numeric might work for you.
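The staging-table route might look roughly like the sketch below. The real table layout isn't shown in the question, so the four-column layout, the column names (yr, name, score, bonus), the staging table name and the file path are all hypothetical stand-ins; the real file has many more columns, and each would need the same treatment.

```sql
-- Hypothetical stand-in for the existing target table (cut down to four columns).
CREATE TABLE mytable (
    yr     integer,
    name   text,
    score  numeric,
    bonus  numeric
);

-- Staging table: same columns, all text, so "" loads as an empty string.
CREATE TEMP TABLE mytable_staging (
    yr     text,
    name   text,
    score  text,
    bonus  text
);

-- Server-side COPY needs a path the server can read (placeholder path here);
-- in psql, \copy reads a client-side file instead.
COPY mytable_staging FROM '/path/to/myfile.csv' WITH (FORMAT csv, HEADER true);

-- Convert empty strings to NULL, then cast to the real column types.
INSERT INTO mytable (yr, name, score, bonus)
SELECT NULLIF(yr, '')::integer,
       name,
       NULLIF(score, '')::numeric,
       NULLIF(bonus, '')::numeric
FROM mytable_staging;
```

A temporary table is used here so the staging data disappears at the end of the session. If the server is PostgreSQL 9.4 or later, COPY's FORCE_NULL (column, ...) option is another route worth checking in the documentation: it matches quoted values against the null string for the listed columns, so a quoted empty string in those columns should load as NULL without a staging table.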