Using Database Plugin

DatabasePlugin uses Perl's DBI module to import records from a database. DBI has back ends (DBD driver modules) for MySQL, PostgreSQL, comma-separated values (CSV), MS Excel, ODBC, Sybase, etc. You will need to have the DBI module installed, as well as the appropriate back-end module(s).

A .dbi configuration file is needed, which specifies how to connect to the database and how to get records out of it.

Greenstone3

See <GSDL3HOME>/gs2build/etc/packages/example.dbi for an example config file.

Assuming you have all the necessary modules installed, the basic way to use DatabasePlugin is:

  • Add DatabasePlugin to the list of plugins for your collection.
  • Copy <GSDL3HOME>/gs2build/etc/packages/example.dbi into the import directory of your collection.
  • Modify this file to suit your database: set the connection string, the SQL query and the field mapping.
  • You may want to have more than one copy of the file, for different database connections/queries. The name does not matter, but the file extension should be .dbi.
  • Import and build the collection.

Greenstone2

See <GSDLHOME>/etc/packages/example.dbi for an example config file.

Assuming you have all the necessary modules installed, the basic way to use DatabasePlugin is:

  • Add DatabasePlugin to the list of plugins for your collection.
  • Copy <GSDLHOME>/etc/packages/example.dbi into the import directory of your collection.
  • Modify this file to suit your database: set the connection string, the SQL query and the field mapping.
  • You may want to have more than one copy of the file, for different database connections/queries. The name does not matter, but the file extension should be .dbi.
  • Import and build the collection.
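
For either Greenstone version, it can help to check a modified .dbi file before running a full import. The following is a minimal sketch, not how DatabasePlugin itself reads the file: it assumes the file is plain Perl that assigns $db, $sql_query and %db_to_greenstone_fields (the names used in the examples on this page), and the helper name missing_dbi_settings is made up for illustration.

```perl
#!/usr/bin/perl
# Sketch: load a .dbi file and report which expected settings are missing.
# Assumption: the .dbi file is plain Perl assigning the variables below.
use strict;
use warnings;
use File::Spec;

# Variable names taken from the example configurations on this page.
our ($db, $sql_query, %db_to_greenstone_fields);

# Hypothetical helper: returns the names of settings the file did not define.
sub missing_dbi_settings {
    my ($file) = @_;

    # Clear values left over from a previous call.
    ($db, $sql_query) = (undef, undef);
    %db_to_greenstone_fields = ();

    # Use an absolute path so that 'do' does not search @INC for the file.
    my $path = File::Spec->rel2abs($file);
    die "Cannot read $path\n" unless -r $path;

    # 'do' runs the file as Perl code; its plain assignments set the
    # package variables declared above (this script runs in package main).
    do $path;
    die "Error in $path: $@" if $@;

    my @missing;
    push @missing, '$db'        unless defined $db        && length $db;
    push @missing, '$sql_query' unless defined $sql_query && length $sql_query;
    push @missing, '%db_to_greenstone_fields' unless %db_to_greenstone_fields;
    return @missing;
}

# Command-line use: perl check_dbi.pl import/mydata.dbi
if (@ARGV) {
    my @problems = missing_dbi_settings($ARGV[0]);
    print @problems ? "Not set: @problems\n" : "All expected settings are present\n";
}
```

This only checks that the settings exist; it does not try to connect to the database.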

Access Processing

This describes my experience of getting DatabasePlugin to get records out of an MS Access database.

 $db='DBI:ADO:Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\shaoqun\Greenstone\collect\testacc\TestAccess.mdb';
 $sql_query = 'SELECT * FROM Students';
 %db_to_greenstone_fields=(
 "Name" => "Title",
 "Address" => "text",
 "StudentsID" => "Identifier",
 );

This is a mapping between column names in the Students table, and metadata names in the Greenstone archive files.
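
To make the mapping concrete, here is a small sketch of how a database row could be renamed using it. It is an illustration only, not DatabasePlugin's own code: the helper map_row and the sample row are made up, and the sketch treats every target name alike (any special handling Greenstone gives to particular names is not modelled).

```perl
use strict;
use warnings;

# Mapping copied from the Access example above.
my %db_to_greenstone_fields = (
    "Name"       => "Title",
    "Address"    => "text",
    "StudentsID" => "Identifier",
);

# Hypothetical helper: rename the columns of one database row.
# Columns that are not in the mapping are dropped (an assumption of this sketch).
sub map_row {
    my ($row, $mapping) = @_;
    my %record;
    for my $column (sort keys %$mapping) {
        next unless defined $row->{$column};
        $record{ $mapping->{$column} } = $row->{$column};
    }
    return \%record;
}

# A made-up row, shaped like the hash reference DBI's fetchrow_hashref returns.
my $row = {
    Name       => 'Example Student',
    Address    => '1 Example Road',
    StudentsID => 42,
};

my $record = map_row($row, \%db_to_greenstone_fields);
print "$_: $record->{$_}\n" for sort keys %$record;
```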

Notes about Perl modules.

Note: I couldn't find the Win32::OLE module by searching for Win32::OLE at http://search.cpan.org, so I downloaded libwin32-0.26.tar.gz, which includes it.

 perl Makefile.PL INSTALLSITELIB="%GSDLHOME%/perllib/cpan/perl-5.8" PREFIX="%GSDLHOME%/perllib/cpan/XXX" SITEPREFIX="%GSDLHOME%/perllib/cpan"
 nmake 
 nmake test
 nmake install

This installs the modules into %GSDLHOME%/perllib/cpan/perl-5.8. (XXX in the perl command should be set to the first component of the module name, e.g. DBI, DBD, Win32.)

Note: I added

 unshift(@INC, "$ENV{'GSDLHOME'}/perllib/cpan/perl-5.8");

to the BEGIN block of Makefile.PL when installing DBD::ADO, because it complained that it could not find the DBI module, which was installed in %GSDLHOME%/perllib/cpan/perl-5.8.

I also added that line to the BEGIN block of DBPlug.pm so that the modules could be found when building the collection.
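
For context, this is roughly what such a BEGIN block looks like in a standalone script. It is a sketch only: the fallback path is a placeholder so that the example runs even when GSDLHOME is not set, and it is not part of the original change.

```perl
use strict;
use warnings;

BEGIN {
    # Assumption: GSDLHOME is normally set by Greenstone's setup script.
    # The fallback below is a placeholder, used only so this sketch can run.
    my $gsdlhome = defined $ENV{'GSDLHOME'} ? $ENV{'GSDLHOME'} : '/path/to/greenstone';

    # Put the private module directory at the front of the search path so
    # that later 'use' and 'require' statements look there first.
    unshift(@INC, "$gsdlhome/perllib/cpan/perl-5.8");
}

# Show where Perl will now look first.
print "First entry in \@INC: $INC[0]\n";
```

Because BEGIN blocks run at compile time, the directory is already on the search path when any following use statements are processed.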

MySQL Processing

This describes my experience of getting DBPlug to get records from a MySQL database.

 $db='DBI:mysql:gswikidb:wesson.cs.waikato.ac.nz';
 
 $username='root';
 # I used root without a password to log in. You may need to set $password if authentication is required.
  
 $sql_query = 'SELECT * FROM gw_user';
  
 %db_to_greenstone_fields=(
 "user_name" => "Title",
 "user_real_name" => "text",
 "user_id" => "Identifier",
 );

This is a mapping between field names in the gw_user table, and metadata names in the Greenstone archive files.

Notes about Perl modules.

 perl Makefile.PL INSTALLSITELIB="$GSDLHOME/perllib/cpan/perl-5.8"  PREFIX="$GSDLHOME/perllib/cpan/XXX" SITEPREFIX="$GSDLHOME/perllib/cpan" 
 make 
 make test
 make install

For DBD::mysql, run:

 perl Makefile.PL --testdb=gswikidb --testhost=wesson.cs.waikato.ac.nz --testuser=root INSTALLSITELIB="$GSDLHOME/perllib/cpan/perl-5.8" PREFIX="$GSDLHOME/perllib/cpan/XXX" SITEPREFIX="$GSDLHOME/perllib/cpan"
 make 
 make test
 make install

This installs the modules into $GSDLHOME/perllib/cpan/perl-5.8

(XXX in the perl line should be set to the first component of the module name, e.g. DBI,DBD,Data)

CSV Processing

This describes my experience of getting DBPlug to process CSV (comma-separated values) files.

 $db='DBI:CSV:f_dir=/research/kjdon/home/gsdl/collect/csvtest;csv_quote_char=\";csv_sep_char=,';

f_dir is the directory containing the CSV file. If you want to use ; as the separator, you need to escape it, e.g. csv_sep_char=\;

 $sql_query = 'SELECT * FROM demo.txt';
 %db_to_greenstone_fields=(
     "name" => "Title",
     "data" => "text",
     "language" => "Language",
     "filename" => "Filename"
 );

This is a mapping between field names in the CSV file, and metadata names in the Greenstone archive files. The field names come from the header line of the CSV file, for example:

 filename,name,language,data
 b17mie/b17mie.htm,Microlivestock - Little-Known Small Animals with a Promising Economic Future (b17mie),English,"Animal Husbandry and Animal Product Processing|Other animals (micro-livestock, little known animals, silkworms, reptiles, frogs, snails, game, etc.)"
 b18ase/b18ase.htm,Little Known Asian Animals With a Promising Economic Future (b18ase),English,"Animal Husbandry and Animal Product Processing|Other animals (micro-livestock, little known animals, silkworms, reptiles, frogs, snails, game, etc.)"
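
To test a connection string and query outside Greenstone, a short DBI script can be used. The sketch below is not taken from DatabasePlugin; the helper name fetch_rows is made up, and it assumes DBI and the relevant driver (DBD::CSV here) can be found in @INC. It uses only standard DBI calls (connect, prepare, execute, fetchrow_hashref, disconnect).

```perl
use strict;
use warnings;

# Hypothetical helper: run a query and return every row as a hash
# reference (column name => value).
sub fetch_rows {
    my ($db, $username, $password, $sql_query) = @_;

    # Loaded at run time so that a missing DBI is reported here.
    require DBI;

    # RaiseError makes DBI die on failure instead of returning undef.
    my $dbh = DBI->connect($db, $username, $password, { RaiseError => 1 });
    my $sth = $dbh->prepare($sql_query);
    $sth->execute();

    my @rows;
    while (my $row = $sth->fetchrow_hashref()) {
        push @rows, { %$row };    # copy, in case DBI reuses the hash
    }
    $dbh->disconnect();
    return @rows;
}

# Command-line use: pass the $db string and the $sql_query from the
# .dbi file as two arguments.
if (@ARGV == 2) {
    my ($db, $sql_query) = @ARGV;
    for my $row (fetch_rows($db, undef, undef, $sql_query)) {
        print join(', ',
            map { "$_=" . (defined $row->{$_} ? $row->{$_} : '') } sort keys %$row), "\n";
    }
}
```

If the script prints one line per record with the expected column names, the same $db and $sql_query values should be usable in the .dbi file.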

Notes about Perl modules.

My Linux distribution had the DBI module installed, but other needed modules were missing. I discovered which ones were needed by running the import. If a module is missing, you get an error like:

 install_driver(CSV) failed: Can't locate DBI/SQL/Nano.pm in @INC (@INC contains:.....) at 
 /research/kjdon/home/gsdl/perllib/cpan/perl-5.8/DBD/File.pm line 25.
 Compilation failed in require at /research/kjdon/home/gsdl/perllib/cpan/perl-5.8/DBD/CSV.pm line 26.
 Compilation failed in require at (eval 42) line 3.
 Perhaps a module that DBD::CSV requires hasn't been fully installed
 at /research/kjdon/home/gsdl/perllib/plugins/DBPlug.pm line 210

This message tells us that the DBI/SQL/Nano.pm module is needed. I had to download and install several modules.

To install these, I downloaded each one from CPAN (all tar files) and put them in $GSDLHOME/packages/cpan. I untarred them (tar xzvf file.tar.gz) and ran the following for each one (make sure you have run 'setup.bat' or 'source setup.bash' in your Greenstone directory first):

 perl Makefile.PL INSTALLSITELIB="$GSDLHOME/perllib/cpan/perl-5.8" PREFIX="$GSDLHOME/perllib/cpan/XXX" SITEPREFIX="$GSDLHOME/perllib/cpan"
 make
 make test
 make install

(XXX in the perl line should be set to the first component of the module name, e.g. DBD, DBI, SQL etc)

This installs the modules into $GSDLHOME/perllib/cpan/perl-5.8
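
Instead of running the import repeatedly to find the next missing module, the modules can be probed directly. This is a minimal sketch: the helper missing_modules is made up, and the module list contains only the names that appear in the error message earlier in this section, so it may not be complete for your setup.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names from the given list that cannot
# be loaded with the current @INC.
sub missing_modules {
    my @modules = @_;
    my @missing;
    for my $module (@modules) {
        # Turn Foo::Bar into Foo/Bar.pm, the path form 'require' accepts.
        (my $file = "$module.pm") =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

# Names taken from the error message above; extend the list as needed.
my @wanted  = ('DBI', 'DBD::File', 'DBD::CSV', 'DBI::SQL::Nano');
my @missing = missing_modules(@wanted);
print @missing ? "Missing: @missing\n" : "All modules found\n";
```

Run it from a shell where the Greenstone setup script has been sourced and where $GSDLHOME/perllib/cpan/perl-5.8 is on Perl's search path (for example via the PERL5LIB environment variable), otherwise modules installed there will be reported as missing.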

Excel Processing

This describes my experience of getting DatabasePlugin to process MS Excel files.

 $db='DBI:Excel:file=/research/shaoqun/testing/gsdl/collect/testexce/phonebook.xls';
 $sql_query = 'SELECT * FROM Sheet1';

Important: the Perl Excel driver module (DBD::Excel) used in this testing treats each worksheet as a table, and the contents of the first row of each worksheet as the column names. (Sheet1 is the name of the first worksheet of phonebook.xls.) Be aware that the worksheet name is case-sensitive.

 %db_to_greenstone_fields=(
     "name" => "Title",
     "address" => "text",
     "id" => "Identifier",
 );

This is a mapping between column names in the Sheet1 worksheet, and metadata names in the Greenstone archive files.

Notes about Perl modules.

 perl Makefile.PL INSTALLSITELIB="$GSDLHOME/perllib/cpan/perl-5.8" PREFIX="$GSDLHOME/perllib/cpan/XXX" SITEPREFIX="$GSDLHOME/perllib/cpan"
 make 
 make test
 make install

This installs the modules into $GSDLHOME/perllib/cpan/perl-5.8

(XXX in the perl line should be set to the first component of the module name, e.g. DBI,DBD,SQL,etc)

Note: for more information about using DBD::Excel, please refer to the Perl files in the sample directory included with the module.