en:user_advanced:greenstonesqlplugs

Differences

This shows you the differences between two versions of the page.

en:user_advanced:greenstonesqlplugs [2019/06/07 08:52] – [The short version: Usage instructions] anupamaen:user_advanced:greenstonesqlplugs [2023/03/13 01:46] (current) – external edit 127.0.0.1
Line 1: Line 1:
 +
 +
 +
 ====== Editing archives metadata and/or fulltext using a MySQL database ====== ====== Editing archives metadata and/or fulltext using a MySQL database ======
 Like the GreenstoneMETSPlugin, the GreenstoneSQLPlugin is another alternative to using the GreenstoneXMLPlugin. Like the GreenstoneMETSPlugin, the GreenstoneSQLPlugin is another alternative to using the GreenstoneXMLPlugin.
Line 7: Line 10:
  
 **Important Notes:** **Important Notes:**
-  * None of this has been tested with Greenstone 2, only Greenstone 3. But it should in theory work with Greenstone 2 as well. Greenstone 2 users are invited to email the mailing list if there are any issues. UPDATE: Some testing for GS2 performed after fixing issues, as identified on mailing list. User seemed to succeed in using the GS SQL Plugs with an GS2 nightly binary.+  * None of this has been tested with Greenstone 2, only Greenstone 3. But it should in theory work with Greenstone 2 as well. Greenstone 2 users are invited to email the mailing list if there are any issues. UPDATE: Some testing for GS2 performed after fixing issues, as identified on mailing list. User seemed to succeed in using the GS SQL Plugs with GS2 nightly binary.
   * GreenstoneSQLPlugin and GreenstoneSQLPlugout rely on the **DBI** and DBD::mysql perl packages.   * GreenstoneSQLPlugin and GreenstoneSQLPlugout rely on the **DBI** and DBD::mysql perl packages.
   * We've so far tested the GreenstoneSQLPlugs on Ubuntu Linux, Windows 7 64 bit and MacOS v.10.13/High Sierra with **mysql version 5.7.23 and perl DBI version 1.634 and perl DBD::mysql version 4.033.** We found DBD::mysql version 4.033 to have the necessary support for UTF8, whereas somewhat earlier versions didn't. (For this reason we're upgrading the version of Strawberry Perl from 5.18 to 5.22 which we include with Windows Greenstone binaries here onward, starting with GS3.09)   * We've so far tested the GreenstoneSQLPlugs on Ubuntu Linux, Windows 7 64 bit and MacOS v.10.13/High Sierra with **mysql version 5.7.23 and perl DBI version 1.634 and perl DBD::mysql version 4.033.** We found DBD::mysql version 4.033 to have the necessary support for UTF8, whereas somewhat earlier versions didn't. (For this reason we're upgrading the version of Strawberry Perl from 5.18 to 5.22 which we include with Windows Greenstone binaries here onward, starting with GS3.09)
Line 13: Line 16:
      * For Windows binaries, we provide you with a Strawberry Perl that has the correct versions of DBI and DBD::mysql. If you're working with Greenstone source distributions or source code from svn on Windows, you can grab Strawberry Perl 5.22 [[http://trac.greenstone.org/export/32658/main/trunk/release-kits/shared/windows/perl.zip|from here]], which includes versions of these packages that we tested the Greenstone SQL Plugs successfully against.      * For Windows binaries, we provide you with a Strawberry Perl that has the correct versions of DBI and DBD::mysql. If you're working with Greenstone source distributions or source code from svn on Windows, you can grab Strawberry Perl 5.22 [[http://trac.greenstone.org/export/32658/main/trunk/release-kits/shared/windows/perl.zip|from here]], which includes versions of these packages that we tested the Greenstone SQL Plugs successfully against.
      * For newer versions of Mac, your pre-installed DBI package may be fine, but you may be missing DBD::mysql. Or perhaps you have an older version of DBD::mysql. In that case, refer to the Mac instructions at [[http://wiki.greenstone.org/doku.php?id=en:user_advanced:greenstonesqlplugs#getting_and_running_mysql | Getting and running MySQL]] on how to get DBD::mysql on Mac 10.13/High Sierra.      * For newer versions of Mac, your pre-installed DBI package may be fine, but you may be missing DBD::mysql. Or perhaps you have an older version of DBD::mysql. In that case, refer to the Mac instructions at [[http://wiki.greenstone.org/doku.php?id=en:user_advanced:greenstonesqlplugs#getting_and_running_mysql | Getting and running MySQL]] on how to get DBD::mysql on Mac 10.13/High Sierra.
-     * **UPDATE:** If you're on Linux or on a Mac High Sierra machine, and you don't have DBI or DBD, also refer to the instructions at [[http://wiki.greenstone.org/doku.php?id=en:user_advanced:greenstonesqlplugs#building_the_dbdmysql_package_for_mac_os_v_1013_high_sierra | Building the DBD (and DBI) perl packages]]. Compiling DBD and the Greenstone SQL plugs' use of it were already tested for the Mac. The compiling process for //DBI// has now been tested successfully, but nothing more: not tested that the plugs run after compiling up DBI in this way on a machine that didn't used to have DBI.+     * **UPDATE:** If you're on Linux or on a Mac High Sierra machine, and you don't have DBI or DBD, also refer to the instructions at [[http://wiki.greenstone.org/doku.php?id=en:user_advanced:greenstonesqlplugs#building_the_dbdmysql_package_for_mac_os_v_1013_high_sierra | Building the DBD (and DBI) perl packages]]. Compiling DBD and the Greenstone SQL plugs' use of it were already tested for the Mac. The compiling process for //DBI// has now been tested successfully, but nothing more: not tested that the plugs run after compiling up DBI in this way on a machine that didn't use to have DBI.
  
 You can check you have the DBI and DBD::mysql packages installed as part of your perl in whatever OS you're working on. To test you have DBI and DBD::mysql installed and also find their versions, run the following command in perl: You can check you have the DBI and DBD::mysql packages installed as part of your perl in whatever OS you're working on. To test you have DBI and DBD::mysql installed and also find their versions, run the following command in perl:
Line 33: Line 36:
   * After creating a new Greenstone collection, open up ''collectionConfig.xml'' for editing, as in the example snippet below   * After creating a new Greenstone collection, open up ''collectionConfig.xml'' for editing, as in the example snippet below
      * add in an element for the **GreenstoneSQLPlugout**      * add in an element for the **GreenstoneSQLPlugout**
-     * //replace// the GreenstoneXMLPlugin element with one for **GreenstoneSQLPlugin**. (It should notably take GreenstoneXMLPlugin's place near the top)+     * **//replace// the GreenstoneXMLPlugin** element with one for **GreenstoneSQLPlugin**. (It should notably take GreenstoneXMLPlugin's place near the top)
      * **The configure options for both GreenstoneSQLPlugs are the same and need to be set consistently for both.**\\ Snippet of example configuration in ''collectionConfig.xml'':\\ <code>      * **The configure options for both GreenstoneSQLPlugs are the same and need to be set consistently for both.**\\ Snippet of example configuration in ''collectionConfig.xml'':\\ <code>
     <import>     <import>
Line 66: Line 69:
 </code>\\ //Optional// options are in [square brackets].  </code>\\ //Optional// options are in [square brackets]. 
  
-  * Run (incremental-)import and (incremental-)build normally:\\ <code>(incremental-)import.pl -site <SITENAME> <COLNAME> +  * Once the ''collectionConfig.xml'' file has been correctly set up, you can run GLI, open the hand-configured collection there, and then build it there.\\ Alternatively, you can use the command line and run (incremental-)import and (incremental-)build normally:\\ <code>import.pl -site <SITENAME> <COLNAME> 
-(incremental-)buildcol.pl -activate -site <SITENAME> <COLNAME></code>\\ **IMPORTANT: On Windows**, precede the import and buildcol commands with ''perl -S''.\\ The default sitename is ''localsite''. Leave out the ''-site <SITENAME>'' for Greenstone 2.+buildcol.pl -activate -site <SITENAME> <COLNAME></code>where COLNAME is the collection name.\\ Or if building incrementally:<code>incremental-import.pl -incremental -site <SITENAME> <COLNAME> 
 +incremental-buildcol.pl -activate -site <SITENAME> <COLNAME></code>Alternatively, you can run both import and buildcol steps in one go:<code>full-rebuild.pl -site <SITENAME> <COLNAME></code>And if incrementally building in one go:<code>incremental-rebuild.pl -site <SITENAME> <COLNAME></code>\\ **IMPORTANT: On Windows**, precede the import and buildcol or rebuild commands with ''perl -S''.\\ The default sitename is ''localsite''. Leave out the ''-site <SITENAME>'' for Greenstone 2.
   * Start your GS3 server if you haven't already and preview your collection.   * Start your GS3 server if you haven't already and preview your collection.
-  * You can now at any time [[#running|run your MySQL client]] **in utf8mb4 mode** then use it to access modify the contents of the SQL database as you wish (such as using SQL statements to mass-edit metadata) and then rebuild the collection with the changed values in effect:+  * You can now at any time [[#running|run your MySQL client]] **in utf8mb4 mode** then use it to access and modify the contents of the SQL database as you wish (such as using SQL statements to mass-edit metadata) and then rebuild the collection with the changed values in effect:
      * But **once you log into MySQL client, always first set the connection to use the ''utf8mb4'' character set** before creating or loading databases and tables:\\ <code>mysql> set names utf8mb4;</code>      * But **once you log into MySQL client, always first set the connection to use the ''utf8mb4'' character set** before creating or loading databases and tables:\\ <code>mysql> set names utf8mb4;</code>
-     * The GreenstoneSQLPlugs create a **database** called ''<SITENAME>'' for GS3 and called ''greenstone2'' for GS2.+     * The GreenstoneSQLPlugs create a **database** called ''<SITENAME>'' for GS3 (which defaults to ''localsite'') and called ''greenstone2'' for GS2.
      * Up to 2 **tables** are created for each collection: ''<COLNAME>_metadata'' and ''<COLNAME>_fulltxt'' (note spelling), where hyphens '-' in <COLNAME> are replaced by underscores '_'.      * Up to 2 **tables** are created for each collection: ''<COLNAME>_metadata'' and ''<COLNAME>_fulltxt'' (note spelling), where hyphens '-' in <COLNAME> are replaced by underscores '_'.
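The naming rules above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names ''gs_table_name'' and ''gs_inspect_sql'' and the collection name ''my-test-col'' are hypothetical and not part of Greenstone. The script does not connect to MySQL; it only prints standard SQL statements that you could paste into the MySQL client to look at the tables before editing them.

```shell
#!/bin/sh
# Hypothetical helper (not part of Greenstone): derive the table name
# described above from a collection name and a table kind.
gs_table_name() {
    colname=$1   # Greenstone collection name, e.g. my-test-col
    kind=$2      # "metadata" or "fulltxt" (note the spelling)
    # Hyphens in the collection name are replaced by underscores.
    printf '%s_%s\n' "$(printf '%s' "$colname" | tr '-' '_')" "$kind"
}

# Hypothetical helper: print SQL statements for inspecting a collection's
# tables. Nothing is executed against a database here.
gs_inspect_sql() {
    dbname=$1    # <SITENAME> for GS3 (default localsite), greenstone2 for GS2
    colname=$2
    printf 'set names utf8mb4;\n'   # always set the character set first
    printf 'use %s;\n' "$dbname"
    printf 'show tables;\n'
    printf 'describe %s;\n' "$(gs_table_name "$colname" metadata)"
    printf 'describe %s;\n' "$(gs_table_name "$colname" fulltxt)"
}

# Example: GS3 default site, collection named my-test-col (assumed name).
gs_inspect_sql localsite my-test-col
```

Since up to 2 tables are created per collection, one of the two ''describe'' statements may report a missing table, depending on how the plugs were configured.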
  
 **Notes on GreenstoneSQLPlugin/out configuration options:** **Notes on GreenstoneSQLPlugin/out configuration options:**
   * Set ''process_mode'' to one of ''meta_only'', ''text_only'' or ''all'', depending on whether you want only metadata, only full text or both to be stored in your MySQL database for each document in your collection. The remainder will be stored in the ''docsql.xml'' file per document in your collection's ''archives'' subfolder.   * Set ''process_mode'' to one of ''meta_only'', ''text_only'' or ''all'', depending on whether you want only metadata, only full text or both to be stored in your MySQL database for each document in your collection. The remainder will be stored in the ''docsql.xml'' file per document in your collection's ''archives'' subfolder.
-  * The values for ''-db_driver'', ''-db_client_user'', ''-db_client_pwd'', ''-db_host'' and the optional ''-db_port'' are the connection parameters you use when running the MySQL client against the running MySQL server.\\ - Where there are default values for options, the defaults are set for the value attribute in the example below. Adjust the options' values as appropriate for you.+  * The values for ''-db_driver'', ''-db_client_user'', ''-db_client_pwd'', ''-db_host'' and the optional ''-db_port'' are the connection parameters you use when running the MySQL client against the running MySQL server.\\ - Where there are default values for options, the defaults are set for the value attribute in the example collection configuration snippet above. Adjust the options' values as appropriate for you.
   * Experimental feature: If you set ''rollback_on_cancel'' to true, then if you cancel building during the import.pl or first phase of building, the database would remain unchanged since the start of //that// script. It has no real effect during the latter phase of building, which runs buildcol.pl, since buildcol.pl does not attempt to change the database, merely reading content back from MySQL. When running with the ''rollback_on_cancel'' option enabled, you will be reminded that to keep your file system in sync with database changes, you would need to manually back up the collection's ''archives'' and/or ''index'' folders.   * Experimental feature: If you set ''rollback_on_cancel'' to true, then if you cancel building during the import.pl or first phase of building, the database would remain unchanged since the start of //that// script. It has no real effect during the latter phase of building, which runs buildcol.pl, since buildcol.pl does not attempt to change the database, merely reading content back from MySQL. When running with the ''rollback_on_cancel'' option enabled, you will be reminded that to keep your file system in sync with database changes, you would need to manually back up the collection's ''archives'' and/or ''index'' folders.
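As a quick reference for the ''process_mode'' values described above, the following shell sketch maps each value to where the content ends up. The function name ''gs_process_mode_summary'' is hypothetical and the script is not part of Greenstone; it only restates the notes above and rejects any value other than the three listed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Greenstone): summarise where content
# is stored for each process_mode value listed in the notes above.
gs_process_mode_summary() {
    case $1 in
        meta_only)
            echo "MySQL: metadata; docsql.xml: the remainder, including full text" ;;
        text_only)
            echo "MySQL: full text; docsql.xml: the remainder, including metadata" ;;
        all)
            echo "MySQL: metadata and full text; docsql.xml: the remainder" ;;
        *)
            # Any other value is not one of the three documented modes.
            echo "unknown process_mode: $1" >&2
            return 1 ;;
    esac
}

gs_process_mode_summary meta_only
gs_process_mode_summary all
```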
  
Line 140: Line 144:
      <option name="-db_driver" value="mysql"/>      <option name="-db_driver" value="mysql"/>
      <option name="-db_client_user" value="root"/>      <option name="-db_client_user" value="root"/>
-     <option name="-db_client_pwd" value="6reenstone3"/>+     <option name="-db_client_pwd" value="TYPE_YOUR_MYSQL_PASSWORD_HERE"/>
      <option name="-db_host" value="127.0.0.1"/>      <option name="-db_host" value="127.0.0.1"/>
      [<option name="-db_port" value="TYPE_MYSQL_SERVER_PORT_NUMBER"/>] <!-- optional.       [<option name="-db_port" value="TYPE_MYSQL_SERVER_PORT_NUMBER"/>] <!-- optional. 
Line 152: Line 156:
  <option name="-db_driver" value="mysql"/>  <option name="-db_driver" value="mysql"/>
  <option name="-db_client_user" value="root"/>  <option name="-db_client_user" value="root"/>
- <option name="-db_client_pwd" value="TYPE_THE_PASSWORD_HERE"/>+ <option name="-db_client_pwd" value="TYPE_YOUR_MYSQL_PASSWORD_HERE"/>
  <option name="-db_host" value="127.0.0.1"/>  <option name="-db_host" value="127.0.0.1"/>
  [<option name="-db_port" value="TYPE_MYSQL_SERVER_PORT_NUMBER"/>] <!-- optional.   [<option name="-db_port" value="TYPE_MYSQL_SERVER_PORT_NUMBER"/>] <!-- optional. 
Line 167: Line 171:
 Leave out ''-site <SITENAME>'' for Greenstone 2, whereas for Greenstone 3, the default sitename is ''localsite''. Leave out ''-site <SITENAME>'' for Greenstone 2, whereas for Greenstone 3, the default sitename is ''localsite''.
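The command variants described on this page (with or without ''-site <SITENAME>'', with or without the ''perl -S'' prefix on Windows) can be sketched as a small shell function that only prints the commands to run. The function name ''gs_build_commands'', its argument order and the collection name ''my-test-col'' are assumptions made for illustration; only ''import.pl'', ''buildcol.pl -activate'', ''-site'' and ''perl -S'' come from the instructions above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Greenstone): print the import and
# buildcol command lines for a given Greenstone version and platform.
gs_build_commands() {
    gs_version=$1            # 2 or 3
    colname=$2               # collection name
    sitename=${3:-localsite} # GS3 site name; localsite is the default
    on_windows=${4:-no}      # "yes" to add the perl -S prefix

    prefix=""
    if [ "$on_windows" = "yes" ]; then
        prefix="perl -S "
    fi

    # Greenstone 2 has no sites, so -site <SITENAME> is left out for it.
    site_opt=""
    if [ "$gs_version" = "3" ]; then
        site_opt="-site $sitename "
    fi

    printf '%simport.pl %s%s\n' "$prefix" "$site_opt" "$colname"
    printf '%sbuildcol.pl -activate %s%s\n' "$prefix" "$site_opt" "$colname"
}

# Greenstone 3 on Linux or Mac, default site:
gs_build_commands 3 my-test-col
# Greenstone 2 on Windows:
gs_build_commands 2 my-test-col "" yes
```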
  
-You can now preview your Greenstoen collection, [[#running|run the MySQL client]] (but after logging in, immediately set the client's database connection to use the ''utf8mb4'' character set to access the database: ''set names utf8mb4;''), then use SQL statements to modify the database contents such as a collection's metadata and finally rebuild and preview your Greenstone collection once more.+You can now preview your Greenstone collection, [[#running|run the MySQL client]] (but after logging in, immediately set the client's database connection to use the ''utf8mb4'' character set to access the database: ''set names utf8mb4;''), then use SQL statements to modify the database contents such as a collection's metadata and finally rebuild and preview your Greenstone collection once more.
  
 b. GLI-SPECIFIC WAY. b. GLI-SPECIFIC WAY.
en/user_advanced/greenstonesqlplugs.1559897544.txt.gz · Last modified: 2019/06/07 08:52 (external edit)