CSV Processing Using DatabasePlugin

This page describes my experience of getting DBPlug to process CSV (comma-separated values) files.

 $db='DBI:CSV:f_dir=/research/kjdon/home/gsdl/collect/csvtest;csv_quote_char=\";csv_sep_char=,';

f_dir is the directory containing the CSV file. Attributes in the connection string are themselves separated by ;, so if you want to use ; as the field separator you need to escape it, e.g. csv_sep_char=\;

 $sql_query = 'SELECT * FROM demo.txt';
 %db_to_greenstone_fields=(
     "name" => "Title",
     "data" => "text",
     "language" => "Language",
     "filename" => "Filename"
 );

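The table name in the query (demo.txt) is the name of the CSV file inside f_dir. %db_to_greenstone_fields is a mapping between field names in the CSV file and metadata names in the Greenstone archive files. A sample CSV file with these fields looks like this: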

 filename,name,language,data
 b17mie/b17mie.htm,Microlivestock - Little-Known Small Animals with a Promising Economic Future (b17mie),English,"Animal Husbandry and Animal Product Processing|Other animals (micro-livestock, little known animals, silkworms, reptiles, frogs, snails, game, etc.)"
 b18ase/b18ase.htm,Little Known Asian Animals With a Promising Economic Future (b18ase),English,"Animal Husbandry and Animal Product Processing|Other animals (micro-livestock, little known animals, silkworms, reptiles, frogs, snails, game, etc.)"
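
Since the keys of %db_to_greenstone_fields are meant to match the field names on the first line of the CSV file, it can help to compare the two before importing. The following is only a sketch in shell, not part of Greenstone: the function names csv_header_fields and missing_fields are made up for this example, and the comma split is naive (it assumes the header line contains no quoted commas).

```shell
# csv_header_fields FILE
# Print the field names from the first line of FILE, one per line.
# Naive: splits on every comma, so quoted commas in the header are not handled.
csv_header_fields() {
    head -n 1 "$1" | tr -d '\r' | tr ',' '\n'
}

# missing_fields FILE FIELD...
# Print each FIELD that does not appear as a column name in FILE's header.
missing_fields() {
    file=$1
    shift
    for f in "$@"; do
        # -x: match the whole line, -F: treat the field name as a fixed string
        csv_header_fields "$file" | grep -qxF -- "$f" || printf '%s\n' "$f"
    done
}
```

For the sample file above, `missing_fields demo.txt name data language filename` should print nothing, while a misspelt key would be printed back.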

Notes about Perl modules.

My Linux distribution had the DBI module installed, but other modules that were needed were missing. I discovered which ones by running the import. If a module is missing, you get an error like:

 install_driver(CSV) failed: Can't locate DBI/SQL/Nano.pm in @INC (@INC contains:.....) at /research/kjdon/home/gsdl/perllib/cpan/perl-5.8/DBD/File.pm line 25.
 Compilation failed in require at /research/kjdon/home/gsdl/perllib/cpan/perl-5.8/DBD/CSV.pm line 26.
 Compilation failed in require at (eval 42) line 3.
 Perhaps a module that DBD::CSV requires hasn't been fully installed
 at /research/kjdon/home/gsdl/perllib/plugins/DBPlug.pm line 210

This message tells us that the DBI/SQL/Nano.pm module is needed. I had to download and install each of the missing modules.

To install them, I downloaded each one from CPAN (all as tar files) and put them in $GSDLHOME/packages/cpan. Make sure you have run 'setup.bat' or 'source setup.bash' in your Greenstone directory first. I then untarred each file (tar xzvf file.tar.gz) and ran the following in each unpacked directory:

 perl Makefile.PL INSTALLSITELIB="$GSDLHOME/perllib/cpan/perl-5.8" PREFIX="$GSDLHOME/perllib/cpan/XXX" SITEPREFIX="$GSDLHOME/perllib/cpan"
 make
 make test
 make install

(XXX in the perl line should be set to the first component of the module name, e.g. DBD, DBI, SQL, etc.)
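
The XXX substitution can be sketched in shell. This is an illustration rather than a tested installer. It assumes unpacked directory names of the form Name-Part-version (DBD-CSV-0.22 is a hypothetical example, not a version taken from this page), the function names are made up, and it only prints the commands instead of running them.

```shell
# first_component DIR
# Print the text before the first "-" in the directory name,
# e.g. DBD-CSV-0.22 -> DBD (assumes Name-Part-version naming).
first_component() {
    name=$(basename "$1")
    printf '%s\n' "${name%%-*}"
}

# print_build_commands DIR
# Print (do not run) the build commands for one unpacked module directory,
# with XXX replaced by the first component of the directory name.
print_build_commands() {
    xxx=$(first_component "$1")
    printf 'cd %s\n' "$1"
    printf 'perl Makefile.PL INSTALLSITELIB="$GSDLHOME/perllib/cpan/perl-5.8" PREFIX="$GSDLHOME/perllib/cpan/%s" SITEPREFIX="$GSDLHOME/perllib/cpan"\n' "$xxx"
    printf 'make\nmake test\nmake install\n'
}
```

Calling `print_build_commands DBD-CSV-0.22` prints the cd line followed by the four commands with PREFIX ending in /DBD, which can be reviewed before being run by hand.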

The commands above install the modules into $GSDLHOME/perllib/cpan/perl-5.8