Editing archives metadata and/or fulltext using a MySQL database

Like the GreenstoneMETSPlugin, the GreenstoneSQLPlugin is another alternative to using the GreenstoneXMLPlugin.

You can choose to use the GreenstoneSQLPlugout and GreenstoneSQLPlugin (they work together) in place of the GreenstoneXMLPlugin and the default plugout GreenstoneXMLPlugout. In that case, instead of metadata or fulltext being written out to doc.xml files in the archives directory, your collection's metadata and/or fulltext will be stored in your MySQL database. Whatever you didn't specify as going into the SQL database (fulltext or metadata) will stay in doc.xml files in the archives folder, now called docsql.xml files.

It has the advantage that you have the power of SQL statements to mass-edit your collection metadata for instance.

Important Notes:

  • None of this has been tested with Greenstone 2, only Greenstone 3. But it should in theory work with Greenstone 2 as well. Greenstone 2 users are invited to email the mailing list if there are any issues.
  • GreenstoneSQLPlugin and GreenstoneSQLPlugout rely on the DBI and DBD::mysql perl packages.
  • We've so far tested the GreenstoneSQLPlugs on Ubuntu Linux, Windows 7 64 bit and MacOS v.10.13/High Sierra with mysql version 5.7.23 and perl DBI version 1.634 and perl DBD::mysql version 4.033. We found DBD::mysql version 4.033 to have the necessary support for UTF8, whereas somewhat earlier versions didn't. (For this reason we're upgrading the version of Strawberry Perl from 5.18 to 5.22 which we include with Windows Greenstone binaries here onward, starting with GS3.09)
    • Compatible versions of DBI and DBD::mysql appear to come pre-installed on current releases of Ubuntu Linux (16.04).
    • For Windows binaries, we provide you with a Strawberry Perl that has the correct versions of DBI and DBD::mysql. If you're working with Greenstone source distributions or source code from svn on Windows, you can grab Strawberry Perl 5.22 from here, which includes versions of these packages that we tested the Greenstone SQL Plugs successfully against.
    • For newer versions of Mac, your pre-installed DBI package may be fine, but you may be missing DBD::mysql. Or perhaps you have an older version of DBD::mysql. In that case, refer to the Mac instructions at Getting and running MySQL on how to get DBD::mysql on Mac 10.13/High Sierra.

You can check you have the DBI and DBD::mysql packages installed as part of your perl in whatever OS you're working on. To test you have DBI and DBD::mysql installed and also find their versions, run the following command in perl:

perl -MDBI -e "DBI->installed_versions";
  Perl            : 5.022001	(x86_64-linux-gnu-thread-multi)
  OS              : linux	(3.16.0)
  DBI             : 1.634  <-------------- IMPORTANT VERSION NUMBER
  DBD::mysql      : 4.033  <-------------- IMPORTANT VERSION NUMBER
  ...

The short version: Usage instructions

  • At present there's only support for a MySQL database. There's instructions on obtaining and running MySQL if you need it.
  • Make sure you're running your MySQL server and have moreover launched it with the character set encoding set to utf8mb4:
    ./mysqld_safe --datadir=/Path/to/mysql/data --character_set_server=utf8mb4
  • After creating a new Greenstone collection, open up collectionConfig.xml for editing, as in the example snippet below
    • add in an element for the GreenstoneSQLPlugout
    • replace the GreenstoneXMLPlugin element with one for GreenstoneSQLPlugin. (It should notably take GreenstoneXMLPlugin's place near the top)
    • The configure options for both GreenstoneSQLPlugs are the same and need to be set consistently for both.
      Snippet of example configuration in collectionConfig.xml:
          <import>
              
      	<plugout name="GreenstoneSQLPlugout"> <!-- inserted outside pluginList element! -->
      	    <option name="-process_mode" value="all"/> <!-- or: meta_only / text_only -->
      	    <option name="-rollback_on_cancel" value="false"/>
      	    <option name="-db_driver" value="mysql"/>
      	    <option name="-db_client_user" value="root"/>
      	    <option name="-db_client_pwd" value="6reenstone3"/>
      	    <option name="-db_host" value="127.0.0.1"/>
      	    [<option name="-db_port" value="PORT_NUMBER"/>]
      	</plugout>
      
      	<pluginList>
      	  <plugin name="ZIPPlugin"/>
      
      	  <plugin name="GreenstoneSQLPlugin"> <!-- replaces GreenstoneXMLPlugin -->
      		<option name="-process_mode" value="all"/> <!-- or: meta_only / text_only -->
      		<option name="-rollback_on_cancel" value="false"/>
      		<option name="-db_driver" value="mysql"/>
      		<option name="-db_client_user" value="root"/>
      		<option name="-db_client_pwd" value="TYPE_THE_PASSWORD_HERE"/>
      		<option name="-db_host" value="127.0.0.1"/>
      		[<option name="-db_port" value="TYPE_MYSQL_SERVER_PORT_NUMBER"/>] <!--Optional-->
      	  </plugin>
      
      	   ...
      	<pluginList>
          </import>
          ...


      Optional options are in [square brackets].

  • Run (incremental-)import and (incremental-)build normally:
    (incremental-)import.pl -site <SITENAME> <COLNAME>
    (incremental-)buildcol.pl -activate -site <SITENAME> <COLNAME>


    IMPORTANT: On Windows, precede the import and buildcol commands with perl -S.
    The default sitename is localsite. Leave out the -site <SITENAME> for Greenstone 2.

  • Start your GS3 server if you haven't already and preview your collection.
  • You can now at any time run your MySQL client in utf8mb4 mode then use it to access modify the contents of the SQL database as you wish (such as using SQL statements to mass-edit metadata) and then rebuild the collection with the changed values in effect:
    • But once you log into MySQL client, always first set the connection to use the utf8mb4 character set before creating or loading databases and tables:
      mysql> set names utf8mb4;
    • The GreenstoneSQLPlugs create a database called <SITENAME> for GS3 and called greenstone2 for GS2.
    • Up to 2 tables are created for each collection: <COLNAME>_metadata and <COLNAME>_fulltxt (note spelling), where hyphens '-' in <COLNAME> are replaced by underscores '_'.

Notes on GreenstoneSQLPlugin/out configuration options:

  • Set process_mode to one of meta_only, text_only or all, depending on whether you want only metadata, only full text or both to be stored in your MySQL database for each document in your collection. The remainder will be stored in the docsql.xml file per document in your collection's archives subfolder.
  • The values for -db_driver, -db_client_user, -db_client_pwd, -db_host and the optional -db_port are the connection parameters you use when running the MySQL client against the running MySQL server.
    - Where there are default values for options, the defaults are set for the value attribute in the example below. Adjust the options' values as appropriate for you.
  • Experimental feature: If you set rollback_on_cancel to true, then if you cancel building during the import.pl or first phase of building, the database would remain unchanged since the start of that script. It has no real effect during the latter phase of building, which runs buildcol.pl, since buildcol.pl does not attempt to change the database, merely reading content back from MySQL. When running with the rollback_on_cancel option enabled, you will be reminded that to keep your file system in sync with database changes, you would need to manually backup the collection's archives and/or index folders.

Detailed version: Using the GreenstoneSQLPlugs

Prerequisites

  • You need to have MySQL 5.5 or greater installed (for utf8mb4 support). We tested with mysql version 5.7.23. See instructions on obtaining and running MySQL.
  • Your perl must have the DBI package, as discussed at the top of this page.
  • Before you build a collection using these plugins, your MySQL server needs to be running. Moreover, it needs to have been launched with the character set encoding set to utf8mb4:
    ./mysqld_safe --datadir=/Path/to/mysql/data --character_set_server=utf8mb4

Notes

  • utf8 (utf8mb4) encoding: Greenstone's doc.xml files store all metadata and fulltext in utf8 encoding. In keeping with that, the metadata and fulltext sent to the SQL database (replacing doc.xml files) are in utf8 too. That means that the database and tables that the GreenstoneSQLPlugs create and use are in utf8. In the case of MySQL, the encoding is specifically utf8mb4 and the GreenstoneSQLPlugs connect to the database with this encoding. This is also why you need to ensure that utf8mb4 is likewise the character encoding when you run your MySQL database server.
  • database names: Greenstone 3 creates a database for every GS3 site. The default site in GS3 is localsite. For GS3, the name of the database is the name of the site, so localsite in default cases. For GS2, there's only one database and it's called greenstone2.
  • table names: Depending on whether you chose to output just the metadata, just the fulltext or both to an SQL database, up to 2 tables will be created for each collection, named <collection_name>_metadata and <collection_name>_fulltxt (note the spelling).
  • Where a database name or collection name contains hyphens, these are replaced for a MySQL database with underscores '_' when constructing the database/table names, as MySQL doesn't allow hyphens in its names.

Configuring your collection to use the GreenstoneSQLPlugs

When you create a collection for which you want to use the GreenstoneSQLPlugs (GreenstoneSQLPlugin and GreenstoneSQLPlugout), you need to set up these plugins and configure them.

Take note that the GreenstoneSQLPlugin is to replace the GreenstoneXMLPlugin, and that the GreenstoneSQLPlugout is to replace the silent default of GreenstoneXMLPlugout.

As with all Greenstone plugins and plugouts, if you prefer to use the default values for certain options (all but db_client_pwd and db_port have default values), you can leave out specifying those options and their values.

If you're configuring the plugs by editing collectionConfig.xml, remember the file can now be edited through GLI's Edit menu too, besides the usual way of using a text editor.

Configuration options

The configuration options for both the GreenstoneSQLPlugin and GreenstoneSQLPlugout are the same and need to be provided the same values for both.

  • Provide the connection parameters that you would use when you manually connect your mysql client to your running MySQL server. The available choices for the db_driver option is for now only mysql. The other default connection parameters values are 127.0.0.1 for db_host and root username for db_client_user. If you connect to your MySQL server using a password for the specified username, provide that password in the db_client_pwd field. If your MySQL server runs on a specific port that is not the default, then provide that port number in the db_port field.
  • Set the process_mode option, choosing from meta_only, text_only, all (the default), depending on whether you want just the metadata, just the fulltext, or both the metadata and fulltext of each document stored in your SQL database. The remainder will then be stored in the docsql.xml file per document in your collection's archives subfolder.
  • Set the rollback_on_cancel option, which defaults to false, to true instead if you want your SQL database contents to be returned to pre-build state if you ever cancel during import.pl. If this option is set to false, then cancelling during collection building will have committed all meta/text contents up to that point to your SQL database.
    If you set rollback_on_cancel to true and want your filesystem (the archives and index directories) to remain in a state consistent with your SQL database, then you will be reminded to manually make a backup of your collection's archives and index folders before building, so that if you ever cancel during build, you can replace the cancelled archives and index folders with their backups.

GreenstoneSQLPlugin

  • If you're using GLI to configure the GreenstoneSQLPlugin, go to the Design panel's Document Plugins section and replace the GreenstoneXMLPlugin with the GreenstoneSQLPlugin at the same position. Configure the plugin options as required.
  • If you're configuring the GreenstoneSQLPlugin by editing collectionConfig.xml, then remove the <GreenstoneXMLPlugin> element and replace it with the <GreenstoneSQLPlugin> as follows, filling in your values for the plugin options:
    	  <plugin name="GreenstoneSQLPlugin">
    		<option name="-process_mode" value="all"/>
    		<option name="-rollback_on_cancel" value="false"/>
    		<option name="-db_driver" value="mysql"/>
    		<option name="-db_client_user" value="root"/>
    		<option name="-db_client_pwd" value="TYPE_PASSWORD_HERE"/>
    		<option name="-db_host" value="127.0.0.1"/>
    		[<option name="-db_port" value="PORT_NUMBER"/>] // don't need to provide this element if using 'default' DB port
    	  </plugin>

GreenstoneSQLPlugout

There are multiple ways in which you can set up your GreenstoneSQLPlugout configuration:

a. Possibly the best way is to edit the collection's collectionConfig.xml file to add the <plugout> element along with its options.

Note that you need to set the name of the plugout to the full name, GreenstoneSQLPlugout.

    <import>    
	<plugout name="GreenstoneSQLPlugout">
	    <option name="-process_mode" value="all"/>
	    <option name="-rollback_on_cancel" value="false"/>
	    <option name="-db_driver" value="mysql"/>
	    <option name="-db_client_user" value="root"/>
	    <option name="-db_client_pwd" value="6reenstone3"/>
	    <option name="-db_host" value="127.0.0.1"/>
	    [<option name="-db_port" value="TYPE_MYSQL_SERVER_PORT_NUMBER"/>] <!-- optional. 
						Leave out if using 'default' DB port -->
	</plugout>
	<pluginList>
	  <plugin name="ZIPPlugin"/>
	  <plugin name="GreenstoneSQLPlugin">
		<option name="-process_mode" value="all"/>
		<option name="-rollback_on_cancel" value="false"/>
		<option name="-db_driver" value="mysql"/>
		<option name="-db_client_user" value="root"/>
		<option name="-db_client_pwd" value="TYPE_THE_PASSWORD_HERE"/>
		<option name="-db_host" value="127.0.0.1"/>
		[<option name="-db_port" value="TYPE_MYSQL_SERVER_PORT_NUMBER"/>] <!-- optional. 
						Leave out if using 'default' DB port -->
	  </plugin>
	   ...
	<pluginList>
    </import>
    ...

Then run (incremental) import and buildcol normally, like:

(incremental-)import.pl -site <SITENAME> <COLNAME>
(incremental-)buildcol.pl -activate -site <SITENAME> <COLNAME>

Leave out -site <SITENAME> for Greenstone 2, whereas for Greenstone 3, the default sitename is localsite.

You can now preview your Greenstoen collection, run the MySQL client (but after logging in, immediately set the client's database connection to use the utf8mb4 character set to access the database: set names utf8mb4;), then use SQL statements to modify the database contents such as a collection's metadata and finally rebuild and preview your Greenstone collection once more.

b. GLI-SPECIFIC WAY.

In GLI, go to Create > Import Options

  • set the saveas dropdown to GreenstoneSQL,
  • for the saveas_options, copy and paste the following and edit as required:\\
    	-process_mode all -db_driver mysql -db_host 127.0.0.1 -db_client_user root -db_client_pwd TYPE_THE_PASSWORD_HERE

Note that the saveas option's value in GLI's import options is GreenstoneSQL.

c. Full command line version:

import.pl -site localsite -verbosity 3 -saveas GreenstoneSQL -saveas_options "-process_mode all -db_driver mysql -db_host 127.0.0.1 -db_client_user root -db_client_pwd TYPE_THE_PASSWORD_HERE" COLNAME

Note that the saveas option's value in the full command line version is GreenstoneSQL too.

Background: How it works

In general, building a Greenstone collection consists of two phases: the import phase and the buildcol phase. You would have noticed this if you ran the building process manually on the commandline. When using the GreenstoneSQLPlugs, the following happens when building a Greenstone collection:

  • On running import.pl (the import phase), your collection's documents are parsed and the GreenstoneSQLPlugout is used to write out your documents' metadata and/or fulltext into the SQL database instead of into doc.xml files in the collection's archives folder.
  • import.pl creates the contents of the archives folder. "doc.xml" files are still generated in a sense, but they're now called docsql.xml instead, and this time they do not contain metadata and/or fulltext (depending on which of these–metadata and/or fulltext–is sent to the SQL database instead). Instead, when importing is over, each docsql.xml file will contain simple readable text saying that metadata and/or fulltext for each section is to be found in the SQL database.
  • On running buildcol.pl of the buildcol phase, the GreenstoneSQLPlugin is used to
    • access the SQL database to read the metadata and/or fulltext of the stored documents back into memory
    • to read the remainder back in from docsql.xml into memory (the metadata and/or fulltext, whatever was not stored in the SQL database)
  • The GreenstoneSQLPlugin sends each document in memory to the indexer for processing
  • At the end of buildcol.pl, the collection's index folder would have been created and your documents would have been indexed as usual.

You can now preview your built collection as usual, and you can search, browse and view your documents as usual. If you were working on the command line, then you'd either restart your Greenstone server before previewing, or if your collection was re-activated by passing in -activate when running buildcol.pl on the command line, you can just refresh your digital library pages in the browser.

Getting and running MySQL

Newer versions of Mac OS (we tested on MacOS v 10.13/High Sierra) require Building the DBD::mysql package first, so start by going through those instructions. Then proceed from step 2 below.

For Windows and Linux, get the OS appropriate binary zip/tarball of mysql-5.7.23 from https://dev.mysql.com/downloads/mirrors/ (e.g. http://mysql.inspire.net.nz/Downloads/MySQL-5.7/).

We have not tested the Greenstone SQL Plugs against versions of MySQL other than 5.7.23. These instructions for installing and running the MySQL server and client are likewise based on our attempts at using MySQL v 5.7.23. For other versions, refer to MySQL's online documentation.

Installing / One time setup

0. Check no mysql already installed, else there may be conflicts.

  • On Unix (this step tested only on Linux, not Mac)
    ls /tmp/mysql.sock
    ls: cannot access '/tmp/mysql.sock': No such file or directory


    That's good, no MySQL installed and running in the usual system location.

  • On Windows, check the Services and Processes tabs of Task Manager (Ctrl+Shift+Esc).
    • If it's just running as a process, you can terminate the process from the Processes tab
    • If it's running as a service, switch to Administrator by pressing the "Show processes from all users" button in the Task Manager's Processes tab and then in the Services tab, scroll to any MySQL service running, rightclick it and select "Stop Service".

1. Download and installation

  • On Linux (tested on Ubuntu 14.04)
    • You will want the "MySQL Community Server". Download GENERIC linux 64 BIT tarball "mysql-5.7.23-linux-glibc2.12-x86_64.tar.gz" for example from Japan's Download Mirror at http://ftp.jaist.ac.jp/pub/mysql/Downloads/MySQL-5.7/
    • Untar and rename extracted folder to mysql-5.7.23. So for example, you end up with the binary MySQL installation at /home/me/mysql-5.7.23
  • On Windows
    • If you want the zipped up binary version, go to https://dev.mysql.com/downloads/mysql/ (If you want the installer, go to https://dev.mysql.com/downloads/installer/. We've tried the zipped up binary)
    • In either case, scroll down and click on the link "Looking for previous GA versions?". Then choose "Microsoft Windows" for OS and version "5.7.24", and the bitness of your OS architecture (e.g. 64 bit). For the zipped version, choose the regular "Zip Archive".
  • On Macs: Since we've only attempted and tested this with MacOS 10.13/High Sierra, these instructions are tailored to this version of MacOS.
    • Grab a mysql 5.7.23 binary tarball for MacOS 10.13 (High Sierra) from one of the https://dev.mysql.com/downloads/mirrors/
      This will be a *.tar.gz file containing both 5.7.23 and "macos10.13" in its name.
      For instance, if your chosen mirror were http://mysql.inspire.net.nz/Downloads/MySQL-5.7/, you'd visit that page and scroll down to the appropriate tarball,
      ultimately downloading it from mysql.inspire.net.nz/Downloads/MySQL-5.7/mysql-5.7.23-macos10.13-x86_64.tar.gz
    • Next untar the downloaded tarball: tar -xvzf mysql-5.7.23-macos10.13-x86_64.tar.gz
    • There will be a further 2 tarballs inside the extracted folder. Untar the one that doesn't have the word test in its filename:
      cd mysql-5.7.23-macos10.13-x86_64
      tar -xvzf mysql-5.7.23-macos10.13-x86_64.tar.gz
    • You may end up with a folder like /Users/me/mysql-5.7.23-macos10.13-x86_64/mysql-5.7.23-macos10.13-x86_64
      Rename this to /Users/me/mysql-5.7.23:
      mv /Users/me/mysql-5.7.23-macos10.13-x86_64/mysql-5.7.23-macos10.13-x86_64 /Users/me/mysql-5.7.23
    • Apple stopped distributing the MySQL database driver (DBD::mysql) perl package with perl in newer versions of MacOS. For instance, Mac High Sierra machines come with perl 5.18 pre-installed, but do not include the DBD::mysql pacakge to support MySQL. You'll consequently need to install and build this package yourself in such cases.
      First, test whether your MacOS has the DBD::mysql package, and whether it's an acceptable version, by typing the following in a terminal:
      perl -MDBI -e 'DBI->installed_versions';


      - If there's no mention of DBD::mysql in the response output, or if the response output does contain DBD::mysql but the version number displayed alongside it is lower than 4.033, you'll want to first build DBD::mysql before returning to proceed with the rest of these instructions.
      - If DBD::mysql 4.033 is already installed, proceed with the instructions below.

2. Initialisation

a. ON UNIX (Linux and Mac tested)

  • create a folder called errmsgs inside your mysql-5.7.23 installation folder
  • next copy the following file across:
    cp /home/me/mysql-5.7.23/share/english/errmsg.sys /home/me/mysql-5.7.23/errmsgs/.

b. On UNIX AND WINDOWS, create new EMPTY datadir folder:

mkdir /home/me/mysql-5.7.23/data

c. Initialise the server.

  • ON UNIX, initialise the server side by passing in the datadir and the errmsgs dir from step 2a (instructions from here):
    ./bin/mysqld \
       --datadir /home/me/mysql-5.7.23/data \
       --lc-messages-dir /home/me/mysql-5.7.23/errmsgs \
       --initialize


    On Linux, this will print out error messages to the console by default.

  • ON WINDOWS, you must pass in –console at this stage, in order to see the password or any error messages printed out:
    bin\mysqld.exe \
       --datadir C:\Users\me\mysql\data
       --console \
       --initialize

Username root is now created for the mysql server that will run on localhost and it prints an initial password. Make note of this password. For example, if the server was initialised on Linux:

  mymachine:[143]/home/me/mysql-5.7.23/bin> \
    ./mysqld --datadir /home/me/mysql-5.7.23/data --lc-messages-dir /home/me/mysql-5.7.23/errmsgs --initialize
  2018-10-03T05:06:14.425489Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. \ 
    Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
  2018-10-03T05:06:14.425542Z 0 [Warning] Using pre 5.5 semantics to load error messages from \
    /home/me/mysql-5.7.23/errmsgs/.
  2018-10-03T05:06:14.425545Z 0 [Warning] If this is not intended, refer to the documentation \
    for valid usage of --lc-messages-dir and --language parameters.
  2018-10-03T05:06:14.748762Z 0 [Warning] InnoDB: New log files created, LSN=45790
  2018-10-03T05:06:14.822112Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
  2018-10-03T05:06:14.889567Z 0 [Warning] No existing UUID has been found, so we assume that \
    this is the first time that this server has been started. \ 
    Generating a new UUID: 0d9d22c1-c6ca-11e8-8224-484d7ec57ff3.
  2018-10-03T05:06:14.891586Z 0 [Warning] Gtid table is not ready to be used. \
    Table 'mysql.gtid_executed' cannot be opened.
  2018-10-03T05:06:14.893456Z 1 [Note] A temporary password is generated for \
    root@localhost: ryfi_%BUl7Gz <---------- MAKE NOTE OF YOUR PASSWORD APPEARING AT THIS LOCATION

3. Run the MySQL server (passing in –console is necessary on Windows to see any error messages):

./mysqld --datadir /home/me/mysql-5.7.23/data [--console]

4. Change root password ( from here).

Assuming the mysql server is currently running with ./mysqld[_safe] --datadir=/home/me/mysql-5.7.23/data [--console],

Use a terminal to login to the MySQL client:

./mysql -u root -p

You'll be prompted for a password. Type the password from step 3c.

Then at the MySQL client prompt, type the following commands, replacing new_password below with the new password of your choice (but leave the single quotes around it intact):

ALTER USER 'root'@'localhost' IDENTIFIED BY 'new_password';
flush privileges;
quit;

5. Before using the GreenstoneSQLPlugs:
a. Quit the MySQL server as explained here, so you can relaunch it with the necessary character set as explained under Running the MySQL SERVER
b. When connecting the MySQL client next time, make sure to set the correct character encoding needed for Greenstone, as explained under Running the MySQL CLIENT.

Running

Running the MySQL server

When launching the MySQL server

  • Need to always pass in the custom datadir (used in step 2b and 2c above), since our datadir is not the one in the system location that's assumed by default.
  • Furthermore, need to configure the server for 4 byte utf8.

On Windows:

	C:\Users\Me\mysql-5.7.24-winx64> \
	  bin\mysqld \
	    --datadir=C:\Users\Me\mysql-5.7.24-winx64\data \
	    --character_set_server=utf8mb4 \
	    --console

Note the 'd' at the end of "mysql".

On UNIX, run mysqld_safe rather than mysqld this (but again, note the 'd' at the end of "mysql"):

	mymachine:[115]/home/me/mysql-5.7.23/bin> \
	  ./mysqld_safe --datadir=/home/me/mysql-5.7.23/data --character_set_server=utf8mb4

Running the MySQL client

After running the MySQL server as above, can run the client with -u root as follows, to make it prompt for your password. Immediately after logging in with the MySQL client, turn on utf8mb4 character set support before doing anything with the database:

  ./home/me/mysql-5.7.23/bin/mysql -u root -p 
  
  Enter password: [Type your password]
  
  mysql> set names utf8mb4;

Note for Windows users: If you're running the command line client and want to view the contents in utf8, there's some support for unicode display in DOS but with provisos. To set up your DOS for this unicode support:

  1. Right click on the title bar of your DOS prompt, select Properties > Font tab. Then select "Lucida Console" as the font, which can display a range of unicode characters. Then click OK.
  2. Change the codepage of the DOS console to support unicode by typing the following at the DOS prompt before running the MySQL client: chcp 65001

For the provisos on the extent of unicode support at the Windows DOS prompt and further information, consult https://stackoverflow.com/questions/388490/how-to-use-unicode-characters-in-windows-command-line

Quitting

  • Type "exit" or "quit" to quit the MySQL client.
  • To stop the MySQL server:
    • On Linux:
      ./home/me/mysql-5.7.23/bin/mysqladmin -u root -p shutdown


      (don't forget the ./ before the executable).

    • On Windows:
      C:\Users\Anupama\mysql-5.7.24-winx64>bin\mysqladmin.exe -u root -p shutdown

And then type root user's password when prompted.

If at any time that doesn't work, the next option is to try

  • On Linux: kill -TERM pid, or the more extreme kill -9 <pid>, to stop the mysqld server.
  • On Windows, type Ctrl+Shift+Esc to launch the Task Manager. (If you ran the MySQL server as Administrator, you will need to stop the MySQL server service as Administrator, for which you should go into the Processes tab, press the "Show processes from all users" button at the bottom.) Then go to the Services tab, scroll down to MySQL, and for each, right-click and choose "Stop Service".

Building the DBD::mysql package for Mac OS v 10.13 ("High Sierra")

Automated way

The automated way should suffice if you're on a MacOS v 10.13/High Sierra machine. The script need s internet access.

Open a terminal and

for GS3:

cd /your/GS3/gs2build/perllib/cpan

for GS2 (untested):

cd /your/GS2/perllib/cpan

Ensure the compile-dbd-mysql.sh script has execute permissions:

chmod u+x compile-dbd-mysql.sh

Now run it:

./compile-dbd-mysql.sh

The above script will: 1. download and install mysql v 5.7.23 inside your GSDLHOME (the GS3/gs2build folder for GS3 users) 2. download and compile DBD::mysql v 4.033 against the above mysql installation's libraries and header files

Once the script has finished running

  • if you want, you can move the $GSDLHOME/mysql-5.7.23 into another location on your machine.
  • Follow the MySQL installation instructions from step 2 onwards to finish setting up the MySQL server to work with Greenstone on MacOS High Sierra.
    Once that's done, you can run the MySQL server and finally start using the Greenstone SQL Plugs in a new collection by editing its collection config file.

Manual way

These instructions are for installing the perl package DBD::mysql version 4.033 on MacOS 10.13/High Sierra.

The automatic script above takes cares of all this. But if there are any issues with that, or if you want to attempt the same process of building the DBD::mysql package with different versions of either perl, DBD::mysql or mysql itself, then these steps may be useful.

1. Get the DBD::mysql source package. Since on Windows we worked with DBD::mysql version 4.033, and we've now successfully built and tested that version on a High Sierra Mac using the Greenstone Plugs, we recommend the same version:

2. The remaining instructions follow select steps from https://metacpan.org/pod/distribution/DBD-mysql/lib/DBD/mysql/INSTALL.pod#BINARY-INSTALLATION

First create a directory called mysql-static and copy all the *.a files in your MySQL installation's lib subfolder into it:

mkdir /tmp/mysql-static
cp /Users/me/mysql-5.7.23/lib/*.a mysql-static/.

Note that if your Mac came with a MySQL pre-installed, you'd be copying from the system installed mysql lib location, which would have been installed with a slightly different folder structure:

cp /usr/lib/mysql/*.a /tmp/mysql-static/.

3. Now you're finally ready to compile up the downloaded DBD::MySQL package. Here we continue to assume that your that MySQL binary is installed at /Users/me/mysql-5.7.23

a. Create a temporary folder where your DBD::mysql will temporarily end up installed.
I created it inside my Greenstone binary at $GSDLHOME/perllib/cpan/tmp_mysql

Wherever you created this folder, you will use the full path to it further below.

b. Go into your untarred DBD::mysql v 4.033 folder: cd /Users/me/DBD-mysql-4.033

c. Run the following 3 commands after first editing - PREFIX to be the full path to the temporary folder you created in step 3a, and - the full paths to your MySQL 5.7.23 installation used in --cflags and --libs below

perl Makefile.PL \
    PREFIX=$GSDLHOME/perllib/cpan/tmp_mysql \
    --cflags=-I/Users/me/mysql-5.7.23/include \
    --libs="-L/Users/me/mysql-5.7.23/mysql-static -lmysqlclient"

make

make install

4. Finish up installing.

a. You need to have a Greenstone installation for this step. Source the greenstone setup script (gs3-setup.sh for GS3 and setup.bash for GS2) to set up the Greenstone environment, including the GSDLHOME environment variable.

b. Step 3 above should have generated folders and files in the temporary folder you created in step 3a. Go into that temporary folder and move 2 particular subfolders into your $GSDLHOME/perllib/cpan location as follows:

cd $GSDLHOME/perllib/cpan/tmp_mysql

mv lib/perl5/site_perl/5.18.2/darwin-thread-multi-2level/DBD gs2build/cpan/perl-5.18/.
mv lib/perl5/site_perl/5.18.2/darwin-thread-multi-2level/auto/DBD gs2build/perl-5.18/auto/.

c. Optional: you can check that the compiled up DBD::MySQL package was statically linked against the mysql client with the following command:

otool -L  $GSDLHOME/perllib/cpan/auto/DBD/mysql/mysql.bundle

The following output indicates that the only dynamic library that the generated DBD::mysql bundle file references is a Mac system library. Since there's no mention of the libmysqlclient.dylib, we can feel confident that it was statically linked into our bundle file:

auto/DBD/mysql/mysql.bundle:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1226.10.1)

5. Test that DBD::MySQL is now installed. In a new terminal, first set up the PERL5LIB environment variable to point to the cpan folders where you installed the DBD::mysql products in step 4. Then you can check whether your DBD::mysql is now detected:

export PERL5LIB=$GSDLHOME/perllib/cpan:$GSDLHOME/perllib/cpan/perl-5.18:$PERL5LIB
perl -MDBI -e "DBI->installed_versions";

(Since High Sierra macs come with perl 5.18 pre-installed, the perl binary packages that we created for this went into cpan/perl-5.18)

You should now see DBD::mysql listed among the output. For example, in my case:

  Perl            : 5.018002    (darwin-thread-multi-2level)
  OS              : darwin      (17.0)
  DBI             : 1.631
  DBD::mysql      : 4.033 <---------------------- SUCCESS!
  DBD::Sponge     : 12.010003
  DBD::SQLite     : 1.40
  DBD::Proxy      : install_driver(Proxy) failed: Can't locate RPC/PlClient.pm in @INC (you may need to install the RPC::PlClient module)
  DBD::Gofer      : 0.015327
  DBD::File       : 0.42
  DBD::ExampleP   : 12.014311
  DBD::DBM        : 0.08

6. Clean up the temporary folders we no longer need:

rm -rf $GSDLHOME/perllib/cpan/tmp_mysql
rm -rf /tmp/mysql-static

7. Follow the MySQL installation instructions from step 2 onwards to finish setting up the MySQL server to work with Greenstone on MacOS High Sierra.

Once that's done, you can run the MySQL server and finally start using the Greenstone SQL Plugs in a new collection by editing its collection config file.

TODO:

  • Discuss table columns, column names and column types. Although users could probably see this from the table's schema once logged into the MySQL client.
  • Have not yet tested writing out just meta or just fulltxt to sql db and reading just that back in from the sql db while the remainder is to be read back in from the docsql .xml files.
  • Test on windows and mac.