Sed: Convert CSV to PostgreSQL

From FVue
Jump to: navigation, search

Problem

I want to convert a CSV (Comma Separated Values) file to a database input file, e.g. PostgreSQL.

Solution 1. Sed script

This sed-script `csv2psql.sed' should do the job:

#!/usr/bin/sed -f
#--- csv2psql.sed ------------------------------------------------------
# Convert a CSV (Comma Separated Values) file to a PostgreSQL input file
# Usage:  sed -f csv2psql.sed myfile.csv | psql -d mydb
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2, or (at your option)
# any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# The latest version of this software can be obtained here:
# http://fvue.nl/Sed:_Convert_CSV_to_PostgreSQL

    # Convert end-of-line CR/LF (MS-DOS) to LF (Unix)
s/\r//g
    # On first line...
1 {
        # Output commands to initialize database
    i\DELETE FROM mytable;
        # Lowercase column names
    y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
        # Replace whitepace with comma-space ', '
    s/[[:space:]]\+/, /g
        # Wrap column names within database command
    s/^.*$/COPY mytable(&) FROM STDIN;/
}
    # After last line, output end-of-data marker: backslash-period '\.'
$ a\\\.

Solution 2. Sed script embedded within bash

The sed-script can also be embedded in bash using the here-document syntax. The only difference is, you need to take care of escaping backslashes `\' if you want them to reach sed.

NOTE: Escaping the carriage-return `\r' isn't necessary because bash translates `\r' into a real carriage-return which sed understands as well.

#!/bin/bash
# Convert csv file to PostgreSQL file and import it
function import_csv() {
    # Generate sql file
    cat <<SED | sed --file=- myfile.csv > myfile.sql
    # Convert end-of-line CR/LF (MS-DOS) to LF (Unix)
    s/\r//g
    # On first line...
    1 {
        # Start transation
        i\BEGIN;
        # Output commands to initialize database
        i\DELETE FROM mytable;
        # Lowercase column names
        y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
        # Replace whitespace with comma-space ', '
        s/[[:space:]]\+/, /g
        # Wrap column names within database command
        s/^.*$/COPY mytable(&) FROM STDIN;/
    }
    # On last line...
    $ {
        # Output end-of-data marker: backslash-period '\.'
        a\\\\\.
        # Commit transaction
        a\COMMIT;
    }
SED
    psql mydb -A -f myfile.sql -q -t -v ON_ERROR_STOP=
} # import_csv()

Comments

blog comments powered by Disqus