Excel Processing using DBPlug

This describes my experience of getting DBPlug to process MS excel files.

$db='DBI:Excel:file=/research/shaoqun/testing/gsdl/collect/testexce/phonebook.xls';
 * My test collection is called testexce installed in /research/shaoqun/testing/gsdl(on Linux).
 * My excel file was called phonebook.xls and was put in my collection's directory (i.e. /research/shaoqun/testing/gsdl/collect/testexce). This location doesn't matter. If it goes in the import directory then you will either get warnings about how no plugin could process the file, or it will be inappropriately processed by another plugin as well as DBPlug.
 * I copied /research/shaoqun/testing/gsdl/etc/packages/example.dbi into the import directory of my collection.
 * This file was modified by setting the following variables:

$sql_query = 'SELECT * FROM Sheet1'; Important: the perl excel driver module used in this testing assumes TABLE = Worksheet and the contents of first row of each worksheet as column name. (Sheet1 is the name of the first worksheet of phonebook.xls). Be aware that the worksheet name is case-sensitive. %db_to_greenstone_fields=(    "name" => "Title",     "address" => "text",     "id" => "Identifier", ); This is a mapping between column names in the Sheet1 worksheet, and metadata names in the Greenstone archive files.


 * Then I built the collection.

Notes about Perl modules. perl Makefile.PL INSTALLSITELIB="$GSDLHOME/perllib/cpan/perl-5.8" PREFIX="$GSDLHOME/perllib/cpan/XXX" SITEPREFIX="$GSDLHOME/perllib/cpan" make make test make install This installs the modules into $GSDLHOME/perllib/cpan/perl-5.8 (XXX in the perl line should be set to the first component of the module name, e.g. DBI,DBD,SQL,etc) Note:for more information about using DBD::Excel, please refer to the perl files in the sample directory included in the Module.
 * download following modules from CPAN DBI DBD::Excel Spreadsheet::ParseExcel Spreadsheet::WriteExcel SQL::Statement OLE::Store_Lite
 * install these modules: make sure you have run 'setup.bat' or 'source setup.bash' in your greenstone directory first)
 * put them in $GSDLHOME/packages/cpan.
 * untarred them (tar xzvf file.tar.gz)
 * run