Table of Contents

Editing archives metadata and/or fulltext using a MySQL database

Like the GreenstoneMETSPlugin, the GreenstoneSQLPlugin is another alternative to using the GreenstoneXMLPlugin.

You can choose to use the GreenstoneSQLPlugout and GreenstoneSQLPlugin (they work together) in place of the GreenstoneXMLPlugin and the default plugout GreenstoneXMLPlugout. In that case, instead of metadata or fulltext being written out to doc.xml files in the archives directory, your collection's metadata and/or fulltext will be stored in your MySQL database. Whatever you didn't specify as going into the SQL database (fulltext or metadata) will stay in doc.xml files in the archives folder, now called docsql.xml files.

It has the advantage that you have the power of SQL statements to mass-edit your collection metadata for instance.

Important Notes:

You can check you have the DBI and DBD::mysql packages installed as part of your perl in whatever OS you're working on. To test you have DBI and DBD::mysql installed and also find their versions, run the following command in perl:

perl -MDBI -e "DBI->installed_versions";
  Perl            : 5.022001	(x86_64-linux-gnu-thread-multi)
  OS              : linux	(3.16.0)
  DBI             : 1.634  <-------------- IMPORTANT VERSION NUMBER
  DBD::mysql      : 4.033  <-------------- IMPORTANT VERSION NUMBER
  ...

The short version: Usage instructions

Notes on GreenstoneSQLPlugin/out configuration options:

Detailed version: Using the GreenstoneSQLPlugs

Prerequisites

Notes

Configuring your collection to use the GreenstoneSQLPlugs

When you create a collection for which you want to use the GreenstoneSQLPlugs (GreenstoneSQLPlugin and GreenstoneSQLPlugout), you need to set up these plugins and configure them.

Take note that the GreenstoneSQLPlugin is to replace the GreenstoneXMLPlugin, and that the GreenstoneSQLPlugout is to replace the silent default of GreenstoneXMLPlugout.

As with all Greenstone plugins and plugouts, if you prefer to use the default values for certain options (all but db_client_pwd and db_port have default values), you can leave out specifying those options and their values.

If you're configuring the plugs by editing collectionConfig.xml, remember the file can now be edited through GLI's Edit menu too, besides the usual way of using a text editor.

Configuration options

The configuration options for both the GreenstoneSQLPlugin and GreenstoneSQLPlugout are the same and need to be provided the same values for both.

GreenstoneSQLPlugin

GreenstoneSQLPlugout

There are multiple ways in which you can set up your GreenstoneSQLPlugout configuration:

a. Possibly the best way is to edit the collection's collectionConfig.xml file to add the <plugout> element along with its options.

Note that you need to set the name of the plugout to the full name, GreenstoneSQLPlugout.

    <import>    
	<plugout name="GreenstoneSQLPlugout">
	    <option name="-process_mode" value="all"/>
	    <option name="-rollback_on_cancel" value="false"/>
	    <option name="-db_driver" value="mysql"/>
	    <option name="-db_client_user" value="root"/>
	    <option name="-db_client_pwd" value="TYPE_YOUR_MYSQL_PASSWORD_HERE"/>
	    <option name="-db_host" value="127.0.0.1"/>
	    [<option name="-db_port" value="TYPE_MYSQL_SERVER_PORT_NUMBER"/>] <!-- optional. 
						Leave out if using 'default' DB port -->
	</plugout>
	<pluginList>
	  <plugin name="ZIPPlugin"/>
	  <plugin name="GreenstoneSQLPlugin">
		<option name="-process_mode" value="all"/>
		<option name="-rollback_on_cancel" value="false"/>
		<option name="-db_driver" value="mysql"/>
		<option name="-db_client_user" value="root"/>
		<option name="-db_client_pwd" value="TYPE_YOUR_MYSQL_PASSWORD_HERE"/>
		<option name="-db_host" value="127.0.0.1"/>
		[<option name="-db_port" value="TYPE_MYSQL_SERVER_PORT_NUMBER"/>] <!-- optional. 
						Leave out if using 'default' DB port -->
	  </plugin>
	   ...
	<pluginList>
    </import>
    ...

Then run (incremental) import and buildcol normally, like:

(incremental-)import.pl -site <SITENAME> <COLNAME>
(incremental-)buildcol.pl -activate -site <SITENAME> <COLNAME>

Leave out -site <SITENAME> for Greenstone 2, whereas for Greenstone 3, the default sitename is localsite.

You can now preview your Greenstone collection, run the MySQL client (but after logging in, immediately set the client's database connection to use the utf8mb4 character set to access the database: set names utf8mb4;), then use SQL statements to modify the database contents such as a collection's metadata and finally rebuild and preview your Greenstone collection once more.

b. GLI-SPECIFIC WAY.

In GLI, go to Create > Import Options

Note that the saveas option's value in GLI's import options is GreenstoneSQL.

c. Full command line version:

import.pl -site localsite -verbosity 3 -saveas GreenstoneSQL -saveas_options "-process_mode all -db_driver mysql -db_host 127.0.0.1 -db_client_user root -db_client_pwd TYPE_THE_PASSWORD_HERE" COLNAME

Note that the saveas option's value in the full command line version is GreenstoneSQL too.

Background: How it works

In general, building a Greenstone collection consists of two phases: the import phase and the buildcol phase. You would have noticed this if you ran the building process manually on the commandline. When using the GreenstoneSQLPlugs, the following happens when building a Greenstone collection:

You can now preview your built collection as usual, and you can search, browse and view your documents as usual. If you were working on the command line, then you'd either restart your Greenstone server before previewing, or if your collection was re-activated by passing in -activate when running buildcol.pl on the command line, you can just refresh your digital library pages in the browser.

Getting and running MySQL

Newer versions of Mac OS (we tested on MacOS v 10.13/High Sierra) require Building the DBD::mysql package first, so start by going through those instructions. Then proceed from step 2 below.

For Windows and Linux, get the OS appropriate binary zip/tarball of mysql-5.7.23 from https://dev.mysql.com/downloads/mirrors/ (e.g. http://mysql.inspire.net.nz/Downloads/MySQL-5.7/).

We have not tested the Greenstone SQL Plugs against versions of MySQL other than 5.7.23. These instructions for installing and running the MySQL server and client are likewise based on our attempts at using MySQL v 5.7.23. For other versions, refer to MySQL's online documentation.

Installing / One time setup

0. Check no mysql already installed, else there may be conflicts.

1. Download and installation

2. Initialisation

a. ON UNIX (Linux and Mac tested)

b. On UNIX AND WINDOWS, create new EMPTY datadir folder:

mkdir /home/me/mysql-5.7.23/data

c. Initialise the server.

Username root is now created for the mysql server that will run on localhost and it prints an initial password. Make note of this password. For example, if the server was initialised on Linux:

  mymachine:[143]/home/me/mysql-5.7.23> \
    ./bin/mysqld --datadir /home/me/mysql-5.7.23/data --lc-messages-dir /home/me/mysql-5.7.23/errmsgs --initialize
  2018-10-03T05:06:14.425489Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. \ 
    Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
  2018-10-03T05:06:14.425542Z 0 [Warning] Using pre 5.5 semantics to load error messages from \
    /home/me/mysql-5.7.23/errmsgs/.
  2018-10-03T05:06:14.425545Z 0 [Warning] If this is not intended, refer to the documentation \
    for valid usage of --lc-messages-dir and --language parameters.
  2018-10-03T05:06:14.748762Z 0 [Warning] InnoDB: New log files created, LSN=45790
  2018-10-03T05:06:14.822112Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
  2018-10-03T05:06:14.889567Z 0 [Warning] No existing UUID has been found, so we assume that \
    this is the first time that this server has been started. \ 
    Generating a new UUID: 0d9d22c1-c6ca-11e8-8224-484d7ec57ff3.
  2018-10-03T05:06:14.891586Z 0 [Warning] Gtid table is not ready to be used. \
    Table 'mysql.gtid_executed' cannot be opened.
  2018-10-03T05:06:14.893456Z 1 [Note] A temporary password is generated for \
    root@localhost: ryfi_%BUl7Gz <---------- MAKE NOTE OF YOUR PASSWORD APPEARING AT THIS LOCATION

3. Run the MySQL server (passing in –console is necessary on Windows to see any error messages, passing in –lc-messages-dir errmsgs is necessary for Linux):

./bin/mysqld \
--datadir /home/me/mysql-5.7.23/data \
[--console] \
[--lc-messages-dir /home/me/mysql-5.7.23/errmsgs]

4. Change root password ( from here).

Assuming the mysql server is currently running with ./mysqld[_safe] --datadir=/home/me/mysql-5.7.23/data [--console],

Use a terminal to login to the MySQL client:

./mysql -u root -p

You'll be prompted for a password. Type the password from step 3c.

Then at the MySQL client prompt, type the following commands, replacing new_password below with the new password of your choice (but leave the single quotes around it intact):

ALTER USER 'root'@'localhost' IDENTIFIED BY 'new_password';
flush privileges;
quit;

5. Before using the GreenstoneSQLPlugs:
a. Quit the MySQL server as explained here, so you can relaunch it with the necessary character set as explained under Running the MySQL SERVER
b. When connecting the MySQL client next time, make sure to set the correct character encoding needed for Greenstone, as explained under Running the MySQL CLIENT.

Running

Running the MySQL server

When launching the MySQL server

On Windows:

	C:\Users\Me\mysql-5.7.24-winx64> \
	  bin\mysqld \
	    --datadir=C:\Users\Me\mysql-5.7.24-winx64\data \
	    --character_set_server=utf8mb4 \
	    --console

Note the 'd' at the end of "mysql".

On UNIX, run mysqld_safe rather than mysqld this (but again, note the 'd' at the end of "mysql"):

	mymachine:[115]/home/me/mysql-5.7.23/bin> \
	  ./mysqld_safe --datadir=/home/me/mysql-5.7.23/data \
	  --character_set_server=utf8mb4

Running the MySQL client

After running the MySQL server as above, can run the client with -u root as follows, to make it prompt for your password. Immediately after logging in with the MySQL client, turn on utf8mb4 character set support before doing anything with the database:

  ./home/me/mysql-5.7.23/bin/mysql -u root -p 
  
  Enter password: [Type your password]
  
  mysql> set names utf8mb4;

Note for Windows users: If you're running the command line client and want to view the contents in utf8, there's some support for unicode display in DOS but with provisos. To set up your DOS for this unicode support:

  1. Right click on the title bar of your DOS prompt, select Properties > Font tab. Then select "Lucida Console" as the font, which can display a range of unicode characters. Then click OK.
  2. Change the codepage of the DOS console to support unicode by typing the following at the DOS prompt before running the MySQL client: chcp 65001

For the provisos on the extent of unicode support at the Windows DOS prompt and further information, consult https://stackoverflow.com/questions/388490/how-to-use-unicode-characters-in-windows-command-line

Quitting

And then type root user's password when prompted.

If at any time that doesn't work, the next option is to try

Building the DBD::mysql package for Mac OS v 10.13 ("High Sierra")

Automated way

The automated way should suffice if you're on a MacOS v 10.13/High Sierra machine. The script needs internet access.

Open a terminal and

for GS3:

cd /your/GS3/gs2build/perllib/cpan

for GS2 (untested):

cd /your/GS2/perllib/cpan

Ensure the compile-dbd-mysql.sh script has execute permissions:

chmod u+x compile-dbd-mysql.sh

(The script has now been modified to work on Linux too, besides Mac. And it has further been updated to support compiling up DBI besides DBD. If you want DBI compiled up too, then use a text editor to open up the script compile-dbd-mysql.sh. Locate the line that says:

NEED_PERL_DBI=false # set to true if your machine doesn't have DBI

Change the value to true:

NEED_PERL_DBI=true # set to true if your machine doesn't have DBI

Save the file and close the text editor before proceeding.)

Now run it:

./compile-dbd-mysql.sh

The above script will: 1. download and install mysql v 5.7.23 inside your GSDLHOME (the GS3/gs2build folder for GS3 users) 2. download and compile DBD::mysql v 4.033 against the above mysql installation's libraries and header files

Once the script has finished running

Manual way

These instructions are for installing the perl package DBD::mysql version 4.033 on MacOS 10.13/High Sierra.

The automatic script above takes cares of all this. But if there are any issues with that, or if you want to attempt the same process of building the DBD::mysql package with different versions of either perl, DBD::mysql or mysql itself, then these steps may be useful.

1. Get the DBD::mysql source package. Since on Windows we worked with DBD::mysql version 4.033, and we've now successfully built and tested that version on a High Sierra Mac using the Greenstone Plugs, we recommend the same version:

2. The remaining instructions follow select steps from https://metacpan.org/pod/distribution/DBD-mysql/lib/DBD/mysql/INSTALL.pod#BINARY-INSTALLATION

First create a directory called mysql-static and copy all the *.a files in your MySQL installation's lib subfolder into it:

mkdir /tmp/mysql-static
cp /Users/me/mysql-5.7.23/lib/*.a mysql-static/.

Note that if your Mac came with a MySQL pre-installed, you'd be copying from the system installed mysql lib location, which would have been installed with a slightly different folder structure:

cp /usr/lib/mysql/*.a /tmp/mysql-static/.

3. Now you're finally ready to compile up the downloaded DBD::MySQL package. Here we continue to assume that your that MySQL binary is installed at /Users/me/mysql-5.7.23

a. Create a temporary folder where your DBD::mysql will temporarily end up installed.
I created it inside my Greenstone binary at $GSDLHOME/perllib/cpan/tmp_mysql

Wherever you created this folder, you will use the full path to it further below.

b. Go into your untarred DBD::mysql v 4.033 folder: cd /Users/me/DBD-mysql-4.033

c. Run the following 3 commands after first editing - PREFIX to be the full path to the temporary folder you created in step 3a, and - the full paths to your MySQL 5.7.23 installation used in --cflags and --libs below

perl Makefile.PL \
    PREFIX=$GSDLHOME/perllib/cpan/tmp_mysql \
    --cflags=-I/Users/me/mysql-5.7.23/include \
    --libs="-L/Users/me/mysql-5.7.23/mysql-static -lmysqlclient"

make

make install

4. Finish up installing.

a. You need to have a Greenstone installation for this step. Source the greenstone setup script (gs3-setup.sh for GS3 and setup.bash for GS2) to set up the Greenstone environment, including the GSDLHOME environment variable.

b. Step 3 above should have generated folders and files in the temporary folder you created in step 3a. Go into that temporary folder and move 2 particular subfolders into your $GSDLHOME/perllib/cpan location as follows:

cd $GSDLHOME/perllib/cpan/tmp_mysql

mv lib/perl5/site_perl/5.18.2/darwin-thread-multi-2level/DBD gs2build/cpan/perl-5.18/.
mv lib/perl5/site_perl/5.18.2/darwin-thread-multi-2level/auto/DBD gs2build/perl-5.18/auto/.

c. Optional: you can check that the compiled up DBD::MySQL package was statically linked against the mysql client with the following command:

otool -L  $GSDLHOME/perllib/cpan/auto/DBD/mysql/mysql.bundle

The following output indicates that the only dynamic library that the generated DBD::mysql bundle file references is a Mac system library. Since there's no mention of the libmysqlclient.dylib, we can feel confident that it was statically linked into our bundle file:

auto/DBD/mysql/mysql.bundle:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)

5. Test that DBD::MySQL is now installed. In a new terminal, first set up the PERL5LIB environment variable to point to the cpan folders where you installed the DBD::mysql products in step 4. Then you can check whether your DBD::mysql is now detected:

export PERL5LIB=$GSDLHOME/perllib/cpan:$GSDLHOME/perllib/cpan/perl-5.18:$PERL5LIB
perl -MDBI -e "DBI->installed_versions";

(Since High Sierra macs come with perl 5.18 pre-installed, the perl binary packages that we created for this went into cpan/perl-5.18)

You should now see DBD::mysql listed among the output. For example, in my case:

  Perl            : 5.018002    (darwin-thread-multi-2level)
  OS              : darwin      (17.0)
  DBI             : 1.631
  DBD::mysql      : 4.033 <---------------------- SUCCESS!
  DBD::Sponge     : 12.010003
  DBD::SQLite     : 1.40
  DBD::Proxy      : install_driver(Proxy) failed: Can't locate RPC/PlClient.pm in @INC (you may need to install the RPC::PlClient module)
  DBD::Gofer      : 0.015327
  DBD::File       : 0.42
  DBD::ExampleP   : 12.014311
  DBD::DBM        : 0.08

6. Clean up the temporary folders we no longer need:

rm -rf $GSDLHOME/perllib/cpan/tmp_mysql
rm -rf /tmp/mysql-static

7. Follow the MySQL installation instructions from step 2 onwards to finish setting up the MySQL server to work with Greenstone on MacOS High Sierra.

Once that's done, you can run the MySQL server and finally start using the Greenstone SQL Plugs in a new collection by editing its collection config file.

TODO: