
Release Name: 2.86

Release Date: 31 October 2013

Released:

  • Greenstone v2.86: The official Greenstone 2.86. 31 Oct 2013
    Binaries for Windows, GNU/Linux 32 and 64 bit, Mac Leopard and Mountain Lion. (Mac version generated on both 10.5.8 and 10.8.5 and different tests carried out on each machine)
    svn tag page trac tag page. Code revision up to 28576 (28574 for all except Lion). Tag revision: 28579.

Release Candidate History:

  • Greenstone v2.86rc3: Release Candidate 3. 14 Oct 2013
    Binaries for Windows, GNU/Linux 32 and 64 bit, Mac Leopard and Mountain Lion. (Mac version generated on both 10.5.8 and 10.8.5 and different tests carried out on each machine)
    svn tag page trac tag page. Code revision up to 28436 (28381). Tag revision: 28436.
  • Greenstone v2.86rc2: Release Candidate 2. 11/12 Dec 2012
    Binaries for Windows, GNU/Linux 32 and 64 bit, Mac Leopard.
    svn tag page trac tag page. Code revision 26569. Tag revision: 26575. (Mac version generated on 10.5.8 and briefly tested on 10.7.5)
  • Greenstone v2.86rc1: Release Candidate 1. 04 December 2012
    Binaries for Windows, GNU/Linux 32 and 64 bit.
    svn tag page trac tag page. Tag revision: 26551. Actual binary release revision: 26544. The revisions are slightly later for the 64 bit linux and later still for the Mac Leopard binary.

Installation Instructions

There's a choice between:

  • installing binaries, which are precompiled. Choose the one for your Operating System:
    • Windows 32 bit binary is for both 32 and 64 bit Windows machines
    • Linux 32 bit binary for 32 bit Linux architectures
    • Linux 64 bit binary for 64 bit Linux architectures
    • Mac OS binary for Leopard (10.5) and Snow Leopard (10.6)
    • Mac OS binary for Mountain Lion (10.8), which will probably work on Lion (10.7) as well
  • topping up your binary with the source code if you ever decide you want the source: download the source component,
  • going straight for the source code to compile Greenstone from scratch: download the source distribution

Binary distribution

After downloading the installer, run the executable. On Windows and Mac, double-click it to launch the installation dialog. On Linux, first set the downloaded file's permissions to executable, then run it from the terminal.

It may take some time for the Greenstone installation dialog to appear. Once the installation dialog displays, you generally need to keep pressing the Next button until it is finished. However, when it asks for the location to install Greenstone in, make sure to choose a location on your file system for which you have access privileges. If you want to install Greenstone into C:\Program Files on Windows 7 or Windows Vista you will need to run the installer with administrator permissions (this can be achieved by right clicking on the installer and choosing "Run as administrator"). And if you wish to use the Greenstone Administration pages (which will be needed if you want to create user accounts for a Remote Greenstone server), then now is a good time to set a sensible password for that.

  • The installer initially unpacks into a temporary directory (/tmp on Linux). Set the TMPDIR environment variable to change this.
  • The Windows version can be installed anywhere, including paths with spaces and brackets (these caused a problem in releases prior to 2.85).
  • The Linux and Mac versions must be installed into a path with *no* spaces.
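
The Linux and Mac notes above can be combined into a short pre-flight check. This is only a sketch: the helper name install_path_ok, the installer file name and the directories in the comments are placeholders rather than anything shipped with Greenstone.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the chosen installation path
# contains no spaces, as the Linux and Mac versions require.
install_path_ok() {
    case "$1" in
        *" "*) return 1 ;;   # a space somewhere in the path: reject
        *)     return 0 ;;
    esac
}

# Example session (file name and directories are placeholders):
#   chmod u+x ./<installer>                 # make the download executable
#   mkdir -p "$HOME/gs-tmp"
#   export TMPDIR="$HOME/gs-tmp"            # unpack here instead of /tmp
#   install_path_ok "$HOME/greenstone2" && ./<installer>
```

The check only looks at the path string, so it can be run before the target directory exists.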

When the installation process is finished, you can run the Greenstone Server or the Greenstone Librarian Interface (GLI):

1. On Windows, the included Greenstone Server can be launched from the shortcut in the Start menu. On Mac and Linux, use a terminal (in Macs this is found under Applications > Utilities > Terminal) to go into the Greenstone installation directory and run

./gs2-server.sh

The small Greenstone Server Interface (GSI) dialog will display. Pressing its Enter Library button will open a browser onto your Greenstone Digital Library home page.(*)

By default, the web servers restrict access to Greenstone pages to the local machine. To change this, go to File > Settings in the Greenstone Server Interface dialog, and tick "Allow external connections". Click OK to save the settings, then press the Restart Library button. (**)

Note: The Windows version of Greenstone includes two server applications: server.exe and an apache web server. (Linux and Mac versions of Greenstone include only the apache web server). By default, the server.exe application is launched when you use the Windows Start menu shortcut to launch the server. To use the apache web server included with the Windows version of Greenstone, do one of the following:

  • Rename the server.exe executable found in your Greenstone installation folder. Then the Start menu shortcut will run the Apache web server instead of server.exe. If you are using GLI, GLI will then also start up Apache instead of server.exe. Alternatively you can start the web server by running gs2-server.bat (located in your Greenstone installation folder).
  • Run gs2-web-server.bat - this will start the Apache web server, but won't affect the Start menu shortcut, or GLI's behaviour.

To change the language in which you view your Greenstone digital library pages, click on the Preferences link at the top left of your Greenstone digital library home page. On the Preferences page, select the interface language in the drop down box.

2. The Greenstone Librarian Interface (GLI) can be run from the Windows Start menu. On Mac and Linux, use a terminal to go into the Greenstone installation directory and run

./gli/gli.sh

First, as in (1) above, the Greenstone Server Interface (GSI) dialog will appear. Eventually the Greenstone Librarian Interface (GLI) dialog will appear. Refer to the Greenstone tutorials for examples of using the GLI to create collections of documents. Once you have finished creating a collection, you can preview it by pressing the Preview button from GLI's Create tab. It will open your Greenstone collection in the web browser.(*) (**)

(*) If the web page displays a "Forbidden" message instead, go back to the GSI dialog, and use its File > Settings menu to change the Address Resolution method to one of the other options there. Then press the Restart Library Button in the main GSI dialog and see whether the browser page it opens now is the Greenstone home page. Otherwise try another Address Resolution option from the GSI dialog's Settings menu and see whether the pages are visible now.

(**) If you have your own external web server that you wish to use, then in your Greenstone installation directory, rename the folder apache-httpd to something else. Alternatively, you can rename the file gs2-server.sh (if on Linux/Mac) or gs2-server.bat (if on Windows) to something else.

To change the GLI interface language, run GLI, go to the File > Preferences menu. Then in the General tab, set the Interface Language. If your script is not covered by the Latin 1 charset, then you may also need to set the Font to something that supports your script. In such a case, try setting the value for the Font field to Arial Unicode MS, BOLD, 12.

3. The Client-GLI is the version of the Greenstone Librarian Interface that can be run on a machine different to the one that is running the Greenstone server. To be able to run the Client-GLI application, you will need Sun Java 1.5.0 or greater installed and you will need to have:

  • Java's bin folder on your PATH
  • JAVA_HOME set to point to your Java installation folder

If you follow Java's installation instructions, they will direct you on how to add the Java installation's bin folder to your system's PATH environment variable and how to set the JAVA_HOME environment variable.
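
On Linux and Mac, the two settings above can be made for the current terminal session along the following lines. This is a minimal sketch: the helper name use_java and the example path are assumptions, and your Java installation folder will differ.

```shell
#!/bin/sh
# Hypothetical helper: point JAVA_HOME at a Java installation folder
# and put that installation's bin folder first on PATH.
use_java() {
    if [ ! -d "$1" ]; then
        echo "Java folder not found: $1" >&2
        return 1
    fi
    JAVA_HOME=$1
    PATH=$JAVA_HOME/bin:$PATH
    export JAVA_HOME PATH
}

# Example (placeholder path), run from the Greenstone installation folder:
#   use_java /usr/lib/jvm/default-java && ./gli/client-gli.sh
```

The settings last only for the current terminal session; to make them permanent, follow Java's own installation instructions.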

If on Windows, you can run client-GLI from its shortcut in the Start Menu. On Linux and Mac systems, you would use a terminal to go into your Greenstone installation folder and then run

./gli/client-gli.sh

When the client-GLI starts up, a small dialog appears asking you to enter the URL of the remote Greenstone server's gliserver.pl file. This URL generally has the form: http://<host>:<port>/greenstone/cgi-bin/gliserver.pl, where you have to fill in the host and port values for the remote Greenstone server. After clicking OK, the client-GLI application window will appear. Client-GLI looks and works just like the GLI, except that most of the document processing takes place on the remote machine where the Greenstone server is running.

  • If you wish to work with password protected collections, here's a workaround for the bug that makes the client-GLI ask you to authenticate again and again.
  • To get your Greenstone installation set up as a remote server so that other GLI clients can connect to it, refer to the section Working with Remote Greenstone and the GLI-Client.
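
As a small illustration of the gliserver.pl URL form described above, the address can be assembled from the remote server's host and port. The helper name gliserver_url and the example host are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: fill the host and port into the general
# form http://<host>:<port>/greenstone/cgi-bin/gliserver.pl
gliserver_url() {
    printf 'http://%s:%s/greenstone/cgi-bin/gliserver.pl\n' "$1" "$2"
}

gliserver_url gs.example.org 8080
# prints: http://gs.example.org:8080/greenstone/cgi-bin/gliserver.pl
```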

Source Components and Source Distributions

There are two ways to get Greenstone 2.86's source code in a compressed format (zip or tar.gz file):

1. If you didn't install a Greenstone binary version, you would get the Greenstone Source Distribution which contains the (uncompiled) source code.

2. If you have downloaded and installed the Greenstone binary version already, you would only need to top up your installation with the source code by getting the Source Component. You would then extract this in your Greenstone directory.

To compile the Greenstone source code, you need an appropriate compiler:

  • XCode for Macs,
  • the GNU compiler for Linux
  • Visual C++ (Visual Studio) and Microsoft/Windows Platform SDK for Windows machines.

If you're working with the source distribution, you will further also need

  • Java SDK 1.6 or later
  • Perl 5.8 or later. For Windows, you can get ActivePerl.

<TABAREA tabs="Source comp Linux/Mac, Source comp Win, Source dist Linux/Mac, Source dist Win"> <TAB>

  1. Download the Source Component tar.gz file that matches with your Greenstone binary version, and put it in your Greenstone installation folder.
  2. Use a terminal to extract the downloaded file's contents into your Greenstone installation folder:
    cd <your greenstone folder>
    tar -xvzf <source-componentfile>
  3. Move the ext/gnome-lib-minimal out of the way or rename it to something else.
  4. Make sure JAVA_HOME is set and the bin folders for Java and Perl are on your PATH.
  5. If you want to compile up gnome-lib yourself, skip this step. If you want to use a pre-compiled gnome-lib binary (to save on all the time of compiling gnome-lib), download the gnome-lib-minimal package for your OS by visiting http://trac.greenstone.org/browser/gs2-extensions/gnome-lib/trunk
    Then unzip the downloaded gnome-lib minimal package into your greenstone2-home/ext folder.
  6. Run the following, which will get gnome-lib and compile it up as it's compiling your Greenstone:
    ./makegs2.sh gnome-lib
    cd gli
    ./makegli.sh
    ./makejar.sh


    Note that some Linux machines don't need gnome-lib at all, in which case, the first compilation step above would just be ./makegs2.sh. To tell whether your Linux machine needs gnome-lib, try compiling it without it first. If compilation fails during wvware, then you need gnome-lib. The Mac Mountain Lion and Leopard machines we tested it on required gnome-lib.

</TAB> <TAB> It's handy to create a batch script to set the environment. Create a file containing the following, replacing the paths with your own, and save it as setupenv.bat

@echo off

:: Script to set up Java, perl and Visual Studio for compiling C/C++ in GS2

set JAVA_HOME=C:\Program Files\Java\jdk1.7.0_25
if not exist "%JAVA_HOME%" (
echo %JAVA_HOME% not found. Exiting...
goto done
)

:: Add the bin folders of Perl and Java to your PATH
set PATH=C:\Users\Me\perl\bin;%JAVA_HOME%\bin;%PATH%

:: If you want to compile GS2 with debugging on, you also need MS SDK and the following line:
:: call "C:\Program Files\Microsoft SDKs\Windows\v6.1\Bin\SetEnv.cmd"

:: Set up Visual studio environment. vcvars<num>.bat may be called vsvars<num>.bat
call "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\vcvars32.bat"

:done
  1. Get the source component zip file from the downloads page.
  2. Unzip it in your Greenstone installation. If Windows prompts you about whether you want existing folders merged (and existing files replaced), tick the box to confirm for all and click in the affirmative.
  3. Open a DOS prompt
  4. Set up the environment for compiling Greenstone by running the setupenv.bat script described above: setupenv.bat
  5. Go into your Greenstone installation: cd /path/greenstone2
  6. Run the makegs2 script: makegs2.bat for 32 bit windows or makegs2x64.bat for 64 bit windows
  7. It will prompt you whether to extract certain important files. Type Y to do so.
  8. It will next present you with various compilation options. You want to type 4 ("All") to tell it to compile everything.
  9. It will take some minutes to compile after which, if there are no errors, you can start running GLI or the gs2-server.
  10. If you want to re-compile GLI go into your Greenstone's gli subfolder: cd gli. Next, type: makegli.bat. To re-compile the GLI jar files, such as used for Remote Greenstone situations, type: makejar.bat.

</TAB> <TAB>

  1. Download Source Distribution.
  2. If you want to compile up gnome-lib yourself, skip this step. If you want to use a pre-compiled gnome-lib binary (to save on all the time of compiling gnome-lib), download the gnome-lib-minimal package for your OS by visiting http://trac.greenstone.org/browser/gs2-extensions/gnome-lib/trunk
    Then unzip the downloaded gnome-lib minimal package into your greenstone2-home/ext
  3. Make sure JAVA_HOME is set and the bin folders for Java and Perl are on your PATH.
  4. Run the following, which will get gnome-lib and compile it up as it's compiling your Greenstone:
    ./makegs2.sh gnome-lib
    cd gli
    ./makegli.sh
    ./makejar.sh


    Note that some Linux machines don't need gnome-lib at all, in which case, the first compilation step above would just be ./makegs2.sh. To tell whether your Linux machine needs gnome-lib, try compiling it without it first. If compilation fails during wvware, then you need gnome-lib. The Mac Mountain Lion and Leopard machines we tested it on required gnome-lib.

  5. You will need to enable the Administration pages if you want access to them. Do so by editing your Greenstone installation's etc/main.cfg file. Change the status field value from disabled to enabled. In that case, you may also want to change the admin password for the Administration pages. Use a terminal to run: ./gsicontrol.sh configure-admin which will allow you to (re)set the password for username admin (the default admin password is the same as the username).

</TAB> <TAB> It's handy to create a batch script to set the environment. Create a file containing the following, replacing the paths with your own, and save it as setupenv.bat

@echo off

:: Script to set up Java, perl and Visual Studio for compiling C/C++ in GS2

set JAVA_HOME=C:\Program Files\Java\jdk1.7.0_25
if not exist "%JAVA_HOME%" (
echo %JAVA_HOME% not found. Exiting...
goto done
)

:: Add the bin folders of Perl and Java to your PATH
set PATH=C:\Users\Me\perl\bin;%JAVA_HOME%\bin;%PATH%

:: If you want to compile GS2 with debugging on, you also need MS SDK and the following line:
:: call "C:\Program Files\Microsoft SDKs\Windows\v6.1\Bin\SetEnv.cmd"

:: Set up Visual studio environment. vcvars<num>.bat may be called vsvars<num>.bat
call "C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin\vcvars32.bat"

:done
  1. Get the source distribution zip file from the downloads page.
  2. Unzip it
  3. Open a DOS prompt
  4. Set up the environment for compiling Greenstone by running the setupenv.bat script described above: setupenv.bat
  5. Go into your Greenstone installation: cd /path/greenstone2
  6. Run the makegs2 script: makegs2.bat for 32 bit windows or makegs2x64.bat for 64 bit windows
  7. It will prompt you whether to extract certain important files. Type Y to do so.
  8. It will next present you with various compilation options. You want to type 4 ("All") to tell it to compile everything.
  9. It will take some minutes to compile after which, if there are no errors, you can start running the gs2-server.
  10. If you want to run GLI as well, this will need to be compiled. To compile it, go into your Greenstone's gli subfolder: cd gli. Next, type: makegli.bat. Once it's done, you can run GLI from the command line with gli.bat from inside the gli folder, or with gli\gli.bat from the top-level Greenstone installation folder.
  11. If you wish to compile up the GLI jar files, such as for Remote Greenstone situations, run the following from within the gli folder: makejar.bat.
  12. You will need to enable the Administration pages if you want access to them. Do so by editing your Greenstone installation's etc/main.cfg file. Change the status field value from disabled to enabled. In that case, you may also want to change the admin password for the Administration pages. Use a DOS prompt to run: gsicontrol.bat configure-admin which will allow you to (re)set the password for username admin (the default admin password is the same as the username).

</TAB> </TABAREA>

Running the installer in text-only mode

  1. Give the binary of the installer execute permissions
  2. Then run it by passing in the text-only flag.
  3. Follow the instructions on the screen thereafter. If you mistype at any stage, press ctrl-C to start again.
> ./Greenstone-2.86rc2-linux text-only
----------------------------
Extracting java installer...
----------------------------

Extraction Complete
You can now run "java -jar greenstone.jar text" to run the installer from the command line
>

Setting the Preview Command in GLI

If you've installed Greenstone and are running GLI (the Greenstone Librarian Interface application) for the first time, and have just finished building your first collection with it, GLI may not know what to do when you press the Preview Button. If it complains or does nothing when you press the Preview Button, you will need to tell it how to launch your default browser (and tell it to open on the collection page) upon pressing Preview.

The following specifies the commands you are likely to need. Paste the applicable one into GLI's File > Preferences menu > Connection tab > Preview Command field.

  • On Windows:
cmd.exe /c start "" "%1"
  • On Mac:
open %1

Put %1 in quotes if your Greenstone installation path contains spaces.

  • On Linux systems:
firefox %1

If you work with another browser, then type the command you'd use to launch that from the terminal, suffixed with %1 once again. (Embed %1 in quotes if you've installed Greenstone in a path containing spaces.) NOTE: If GLI's Preview Button does not succeed in launching the browser with the collection URL, consult this page for a suggested solution.

Uninstallation

On Windows, the uninstaller is accessible from the Start menu.

Under Linux, a Greenstone installation can be removed with the usual rm command, but doing so also deletes any collections you've created. If you're on Linux or Mac and wish to uninstall Greenstone, the recommended way is to use the Uninstaller, as this gives you the option to retain your collections. To launch the Uninstaller, either run "bash Uninstall.sh" from the uninstall folder, or first give execute permissions to the uninstall/Uninstall.sh file in your Greenstone installation and then run it:

cd uninstall
chmod u+rx Uninstall.sh
./Uninstall.sh
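
If you do remove an installation with rm, one way to avoid losing collections is to copy them elsewhere first. The following is a minimal sketch, assuming your collections live in the installation's collect folder; the helper name backup_collections and the paths in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy the collect folder of a Greenstone
# installation to a backup location before the installation is removed.
backup_collections() {
    gs_home=$1
    backup_dir=$2
    if [ ! -d "$gs_home/collect" ]; then
        echo "No collect folder in $gs_home" >&2
        return 1
    fi
    # Creates <backup_dir>/collect/... with the same layout as the original.
    mkdir -p "$backup_dir" && cp -R "$gs_home/collect" "$backup_dir/"
}

# Example (placeholder paths): only remove the installation if the copy worked.
#   backup_collections "$HOME/greenstone2" "$HOME/gs-backup" && rm -r "$HOME/greenstone2"
```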

Important Changes since 2.85

  • Deals with several security vulnerabilities that were present in previous versions of Greenstone 2 and which were recently brought to our notice.
  • RSS support (see below)
  • DSpace-inspired Depositor. It also allows multiple values for various metadata fields such as author and subject keywords. See this page for how to enable it.
  • Ascending/descending sort option added for lucene collections. Search sort fields are now defined separately rather than just using the indexes list. In collect.cfg:
indexes dc.Title dc.Subject 
sortfields dc.Date 
sections_sort_on_document_metadata unless_section_metadata_exists # (a buildcol option) 
  • New option for List classifier: standardize_capitalization. Metadata values are lowercased for sorting into bookshelves. With this option set, the lowercase value is used as the bookshelf Title; it can then be changed to title case, upper case etc. using CSS. Without the option set, the majority case variant is used. For example, with three values (snail, Snail, Snail), Snail is used.
  • Improvements to authentication when super collections (cross collection searching) are used. When the entered collection needs no authentication, any collections that have collection-level authentication will not be searched. When the entered collection was authenticated, each collection's groups are checked against the user's list of groups; if there is a match, the collection is searched, otherwise it is not. Note that document-level authenticated collections will always be searched, as the authentication happens when you click on a document link. This only works with the link to the Greenstone document, and cannot be used to restrict access to, e.g., PDF files.
  • Year only dates in a DateList used to crash the local library server. Fixed thanks to DL Consulting. The DateList reverse_sort option now works properly when years are grouped together. [srclink] now works for DateList.
  • Bug fix for remote GLI where moving files would fail with long filenames.
  • GLI format editing box now provides coloured highlighting and Ctrl-Z for undo.
  • bugfix - top level metadata.xml not being assigned to files under Windows
  • Greenstone now works on 64 bit Linux. (Whether this also holds for 64 bit Windows was left as an open question in these notes.)
  • OAI downloading - will now download all records unless max_records option is set.
  • support added for translation using Google Translation Toolkit
  • new script activate.pl - deactivates the collection in a running server, moves building to index, then reactivates. Can be called by itself, is included at end of full-rebuild.pl, or can be run after building by adding -activate to the buildcol.pl command.
  • Lucene upgraded to 3.3.0 (was 2.3.2)
  • mico upgraded to 2.3.13 and modified (was 2.3.5) - used for CORBA interface
  • wget upgraded to version 1.13.4 (was 1.11.4)
  • new options for List classifier: -filter_metadata and -filter_regex. Only include documents in the list if they contain the specified filtering metadata, and if a filter_regex is specified, that metadata value must match the regular expression.
  • DateList reverse sort bug fixed
  • Code updated to allow cross compilation, e.g. compiling on Linux to produce Windows binaries. Support for compiling on Android NDK.
  • MARCPlugin - will now extract all occurrences of a subfield, not just the first one.
  • Ability to add user comments to document pages (see below).
  • OAI supercollections - makes sets containing one or more collections
  • Bug fix to PowerPoint processing when windows_scripting is turned on and a pagedimg type is chosen as the output
  • New XSL transformation file (in gli/classes/xml/xsd-to-mds.xsl) that can be used as a helpful template when needing to convert a metadata schema that is in XSD format to Greenstone's metadata set format.
  • New CDWA Lite metadata set for GLI, generated by applying xsd-to-mds.xsl to CDWALite's XSD (http://www.getty.edu/CDWA/CDWALite/CDWALite-xsd-public-v1-1.xsd).
  • There are now 2 further settings for the -OIDtype option to import.pl:
         
hash_on_full_filename:     Hash on the full filename to the
                           document within the 'import' folder (and not its
                           contents).  Helps make document identifiers more
                           stable across upgrades of the software, although it
                           means that duplicate documents contained in the
                           collection are no longer detected automatically.

full_filename:             Use the full file name within the
                           'import' folder as the identifier for the document
                           (with _ and - substitutions made for symbols such as
                           directory separators and the fullstop in a filename
                           extension)

The first one is particularly important, because it ensures that you will have stable urls no matter how many times you rebuild the collection. It allows you to share a link and not have to worry about hash changes. (Many thanks to Diego Spano for remembering and documenting this feature)
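
To see why hashing on the filename gives stable identifiers, compare it with hashing on the contents. The sketch below is an illustration only: it uses the standard cksum utility as a stand-in, which is an assumption and not the hash Greenstone itself computes, and the function names and file paths are hypothetical.

```shell
#!/bin/sh
# Illustration only: cksum stands in for Greenstone's own hashing.
id_from_filename() { printf '%s' "$1" | cksum | cut -d' ' -f1; }
id_from_contents() { cksum < "$1" | cut -d' ' -f1; }

work=$(mktemp -d)
mkdir -p "$work/import"
doc=$work/import/report.txt

printf 'draft 1' > "$doc"
name_before=$(id_from_filename "import/report.txt")
body_before=$(id_from_contents "$doc")

# Edit the document, as happens between rebuilds of a collection.
printf 'draft 2' > "$doc"
name_after=$(id_from_filename "import/report.txt")
body_after=$(id_from_contents "$doc")

echo "filename-based id: $name_before -> $name_after (unchanged)"
echo "content-based id:  $body_before -> $body_after (changed)"
rm -rf "$work"
```

The same property explains the trade-off noted in the option's description: two copies of one document stored under different names get different filename-based identifiers, so duplicates are no longer detected automatically.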

(gli and greenstone notes to rev 28160)

To turn on RSS support:

For GS2.86, go to GLI's Format panel's Collection Specific Macros, and paste the following:

package Global
_optrsslink_ {
_rsslink_
}

Alternatively, you can just paste the above directly into your collect/<collection-name>/macros/extra.dm file
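
For those who prefer the terminal on Linux or Mac, the macro can be appended to extra.dm with a few lines of shell. This is a sketch under assumptions: the helper name enable_rss is hypothetical, and it takes the path to the collection folder (collect/<collection-name>) as its argument.

```shell
#!/bin/sh
# Hypothetical helper: append the RSS macro to a collection's
# macros/extra.dm, unless the macro is already defined there.
enable_rss() {
    macros_dir=$1/macros
    mkdir -p "$macros_dir" || return 1
    if grep -q '_optrsslink_' "$macros_dir/extra.dm" 2>/dev/null; then
        return 0    # already present, nothing to do
    fi
    cat >> "$macros_dir/extra.dm" <<'EOF'

package Global
_optrsslink_ {
_rsslink_
}
EOF
}

# Example, run from the Greenstone installation folder:
#   enable_rss "collect/<collection-name>"
```

Running the helper a second time leaves the file unchanged, so it is safe to call from a setup script.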

There is a further change to be made if you are working with collections with underscores in their names. Pascal D. Angst on the mailing list noticed that such collections caused issues with the RSS links produced for their documents. He came up with a helpful fix to the Apache web server's Rewrite rules. His fix, which is shown in lines 4 and 5 in the two snippets below, may be more generally applicable. To apply his fix to all relevant instances, in a text-editor open the Greenstone installation's apache-httpd/<your-operating-system>/conf/httpd.conf.in file.

Change:

     RewriteEngine On
     RewriteRule ^([A-Za-z0-9-]+)/about/?$         /greenstone/cgi-bin/library.cgi?c=$1&a=p&p=about [L]
     RewriteRule ^([A-Za-z0-9-]+)/query/?$         /greenstone/cgi-bin/library.cgi?c=$1&a=q [L]
     RewriteRule ^([A-Za-z0-9-]+)/document/([^/]+)$  /greenstone/cgi-bin/library.cgi?c=$1&a=d&d=$2 [L]
     RewriteRule ^([A-Za-z0-9-]+)/document/(.*?)/(.*)$  /greenstone/cgi-bin/library.cgi?c=$1&a=d&d=$2&$3 [L]
     RewriteRule ^([A-Za-z0-9-]+)/$                /greenstone/cgi-bin/library.cgi?c=$1&a=p&p=about [L]
     # RewriteRule ^([A-Za-z0-9-]+)/(.*?)$         /greenstone/cgi-bin/library.cgi?c=$1&$2 [L]

To the following (there is an underscore at the end of each A-Za-z0-9-):

     RewriteEngine On
     RewriteRule ^([A-Za-z0-9-_]+)/about/?$         /greenstone/cgi-bin/library.cgi?c=$1&a=p&p=about [L]
     RewriteRule ^([A-Za-z0-9-_]+)/query/?$         /greenstone/cgi-bin/library.cgi?c=$1&a=q [L]
     RewriteRule ^([A-Za-z0-9-_]+)/document/([^/]+)$  /greenstone/cgi-bin/library.cgi?c=$1&a=d&d=$2 [L]
     RewriteRule ^([A-Za-z0-9-_]+)/document/(.*?)/(.*)$  /greenstone/cgi-bin/library.cgi?c=$1&a=d&d=$2&$3 [L]
     RewriteRule ^([A-Za-z0-9-_]+)/$                /greenstone/cgi-bin/library.cgi?c=$1&a=p&p=about [L]
     # RewriteRule ^([A-Za-z0-9-_]+)/(.*?)$         /greenstone/cgi-bin/library.cgi?c=$1&$2 [L]
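
Instead of editing each rule by hand, the same change can be made with sed. This is a sketch rather than an official script: the function name add_underscore_to_rules is hypothetical, and it works as a filter so that the original file is left untouched until you have checked the result.

```shell
#!/bin/sh
# Hypothetical filter: rewrite every literal [A-Za-z0-9-] in the input
# to [A-Za-z0-9-_], leaving all other text as it is.
add_underscore_to_rules() {
    sed 's/\[A-Za-z0-9-]/[A-Za-z0-9-_]/g'
}

# Example, run from the conf folder that holds httpd.conf.in:
#   add_underscore_to_rules < httpd.conf.in > httpd.conf.in.new
#   diff httpd.conf.in httpd.conf.in.new     # review before replacing the file
```

Rules that already contain the underscore are not matched by the pattern, so running the filter twice gives the same output as running it once.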

Enabling user comments

For each collection, you can decide whether to allow user comments to be added to it or not. By default this ability is turned off. To turn it on for any collection:

  • Open the collection in GLI and go to Format panel > Format Features
  • Under Choose Feature, select Allow User Comments
  • Press the Add Format button
  • Tick the Enabled button
  • If you had already built the collection, you can Preview it straight away. Any document page in the collection should provide a link in green at the bottom labelled Add Comment. If you're not already logged in, you will be taken to a log in screen and, once authenticated, you can proceed to add a comment. Once you're done leaving comments, you can press the green Logout link.

Users need to have an account in your digital library in order to add comments.

For those migrating from earlier versions

This section contributed by John Rose.

Some of the new Greenstone features which facilitate the creation of institutional repositories and other open access collections:

1. OAI server

Your collections can easily be made available for remote harvesting using OAI-PMH protocol, which works silently in parallel with normal web access to the collections. All that you have to do is to add a bit of configuration data in the oai.cfg text file in the etc subdirectory under the Greenstone home directory. The data to specify is explained in comment lines in the above file. If the collections to be made available through OAI-PMH do not all use Dublin Core metadata or one of the two other standard OAI metadata sets, the oai.cfg file will need to contain mapping data to translate your metadata into one of the Greenstone OAI-PMH metadata sets (also explained in the comments to the oai.cfg file).

OAI-PMH support has been provided for some time by Greenstone, but there have previously been a few functional gaps, as well as a bug in version 2.84. From version 2.85 onwards, all official OAI-PMH validation criteria have been tested and satisfied; you will be able to validate your own OAI-PMH server using instructions given in the release notes. If you don't specify the urls for the associated documents in the metadata, the system can automatically generate internal urls so that users can access the full documents from the harvested OAI records. You will also now be able to harvest OAI-PMH records and the associated documents residing in external Greenstone collections (in 2.84, harvesting worked to access information in non-Greenstone collections, but there was a problem in harvesting from other Greenstone collections).
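
A quick manual check of an OAI server is to send it a few standard OAI-PMH requests, which are ordinary URLs with a verb parameter. The sketch below only builds such URLs; the helper name oai_request_url and the base address are placeholders, so substitute the base URL of your own OAI server.

```shell
#!/bin/sh
# Hypothetical helper: build an OAI-PMH request URL from a base URL
# and one of the protocol's verbs.
oai_request_url() {
    printf '%s?verb=%s\n' "$1" "$2"
}

# Placeholder base address; use your own server's OAI base URL.
base=http://gs.example.org/oai

for verb in Identify ListMetadataFormats ListSets; do
    oai_request_url "$base" "$verb"
done
```

Each printed URL can be opened in a browser or fetched with a tool such as curl; a working server answers with an XML response for each verb.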

Much information is put up on the web without clear specification of the concerned intellectual property rights. Although this is not good practice in general, when activating the OAI server special care should be taken to ensure that your documents are really available under open access conditions (in the public domain or freely distributable and re-distributable under an open access license such as Creative Commons). Greenstone can only take care of the technical access - for legal and organisational considerations, prospective open access providers may consult, for example, the resource links of the EIFL Open Access programme (http://www.eifl.net/eifl-oa-resources).

Once your OAI server is operational, to provide maximal international visibility for your open access collections you should register them in at least one (and ideally all) of the following: the ROAR directory (http://roar.eprints.org/), the OAI directory (http://www.openarchives.org/Register/BrowseSites) and the OpenDOAR directory (http://www.opendoar.org/). It would also be very nice if you could confirm to this list that your server is operational, providing the url base address.

2. PDF metadata

Prior to version 2.83, reliable import of, and metadata extraction from, pdf files was limited to PDF versions 1.4 and earlier. Starting with 2.84 a new "PDF Box extension" has been available as a separate download to handle all PDF versions. This extension file need only be placed in the ext subdirectory of Greenstone for the improved PDF handling facilities to be operational (see the release notes). The PDF Box extension has been further improved from version 2.85 onwards, so be sure to download, unzip and insert an up-to-date PDF Box extension for this version, replacing the version of the file which you may have downloaded for version 2.84.

By using the PDF Box extension, you can extract any metadata entered in standard manner in a pdf file, i.e. the traditional pdf metadata (Author, Title, Subject, Keywords) and/or the newer XMP format metadata (including user defined fields). In general, we recommend that for users interested in extracting PDF metadata, it is better to use the PDF Box extension, even for pdf files in version 1.4 or earlier.

Using the PDF metadata extraction facility means that for PDF files generated by the users with metadata included (either directly with a tool like Acrobat, or by generating a PDF file from a package like Word which can transfer Word metadata to the generated PDF file), these metadata can be automatically incorporated into a Greenstone collection (without having to enter them in GLI or compile a metadata.xml file). This could clearly be of interest to open access applications, particularly when decentralized input is being submitted.

There is a catch: the metadata extraction procedure may not work flawlessly on recent version PDF files which are not "linearised" (called Fast Web View in Acrobat). So linearised PDF files should be used; the open source QPDF program (http://qpdf.sourceforge.net/) claims to be able to linearise non-linearised PDF files, but this remains to be confirmed as far as processing by Greenstone is concerned. Feedback from users on the PDF metadata extraction facility is most welcome.
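As a rough first check of whether a PDF is linearised before importing it, the following Python sketch looks for the /Linearized entry that linearised files carry in a dictionary near the start of the file. The function names and the 1024-byte window are choices made for this illustration. It is only a heuristic, it is not part of Greenstone, and it does not confirm that a file is correctly linearised.

```python
def has_linearisation_marker(head):
    """Return True if the leading bytes of a PDF look linearised.

    Heuristic only: checks for the PDF header and a /Linearized key
    in the bytes supplied (normally the first 1024 bytes of the file).
    """
    return head.startswith(b"%PDF-") and b"/Linearized" in head


def looks_linearised(pdf_path, window=1024):
    """Read the start of the file at pdf_path and apply the heuristic."""
    with open(pdf_path, "rb") as handle:
        return has_linearisation_marker(handle.read(window))
```

Files for which looks_linearised returns False are candidates for conversion with a tool such as QPDF before they are added to a collection.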

3. Section handling for PDF files

For several years Greenstone has offered a facility to automatically generate internal section (chapter) information from a Microsoft Office (e.g. Word), OpenOffice/LibreOffice or HTML document, but probably not for those PDF files which do not have a standard way of designating chapter/section headings. This automatic generation of internal section (chapter) information allows for table of contents display of the document and finer chapter-based searching.

Word files can be treated in this way if a compatible version of Word is installed on the computer on which a collection is built (see the tutorial at http://wiki.greenstone.org/wiki/gsdoc/tutorial/en/enhanced_word.htm). Word, Office Open XML or OpenDocument format files can also be treated without proprietary software if OpenOffice or LibreOffice is installed, by downloading the Greenstone OpenOffice extension into the ext subdirectory of the Greenstone installation (see the release notes), and activating the OpenOffice option in the Word (or PowerPoint, or Excel) plugin of Greenstone (similar to activating the Windows scripting option as in the above-mentioned tutorial).

An example collection has now been prepared to show how this can be extended to PDF files (see http://www.nzdl.org/gsdlmod?a=p&p=about&c=assocext-e). Included is an explanation of how to build the collection through the following steps: a. develop a Word version and a PDF version of the document (conversion of the Word version to PDF or vice-versa); b. make sure that the heading formats in Word are consistent with what you want for sections and subsections; c. import the Word file into Greenstone specifying the PDF file as an associated file; d. use the format statement guidance in the worked example to be able to search on the document subsections and also display the hit terms in the original PDF file (Word or OpenOffice/LibreOffice no longer needed after building - the collection could for example in the meantime have been transferred to a Linux server).

An alternative, more controllable but more labour intensive, method without recourse to word processing software would be to import the pdf file into Greenstone, right click in the Gather view and convert it to html, call an html editor and ensure that the section information is correctly introduced, add the pdf again but as an associated file (by setting the assoc-files parameter in HTMLPlugin), then build and display as per the worked example.

More complete documentation is being developed for all of the above techniques, and we will keep you informed on its progress.

4. Migrating to 2.86

To switch to version 2.86 from an earlier Greenstone version with minimal risks, you could i) back up your collections, ii) install 2.86 in a new home directory (to be specified to the installer), and iii) copy the collect sub-directory from the old to the new version. If you are presently using a recent previous version of Greenstone (2.8x), the collections should be immediately available for use; if not, particularly for collections built under older versions of Greenstone, it should suffice to rebuild the collections under the new version. Any problems can be addressed to this list or the main Greenstone users list (http://list.waikato.ac.nz/mailman/listinfo/greenstone-users).

If you want to transfer information on users and user groups, the corresponding databases (users.gdb, key.gdb) should be copied from the etc sub-directory in the old installation to the new one. Of course if you have customised your previous version (main.cfg, style.css, macros, etc.), the old versions should also be copied to the new installation. When all is working perfectly, the old installation can be deleted.
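The copying steps of a migration can be sketched in Python as below. This is an illustration under stated assumptions rather than an official migration tool: the function name is hypothetical, users.gdb and key.gdb are assumed to sit in the etc folder at the top of each installation, and collections that already exist in the new installation (such as the bundled demo collection) are deliberately left alone. Back up your collections before trying anything like this.

```python
import shutil
from pathlib import Path


def migrate_greenstone(old_home, new_home, user_dbs=("users.gdb", "key.gdb")):
    """Copy collections and user databases from old_home to new_home.

    Returns the list of relative paths that were copied.
    """
    old_home, new_home = Path(old_home), Path(new_home)
    copied = []

    # Copy each collection folder that the new installation lacks.
    for coll in sorted((old_home / "collect").iterdir()):
        target = new_home / "collect" / coll.name
        if coll.is_dir() and not target.exists():
            shutil.copytree(coll, target)
            copied.append(f"collect/{coll.name}")

    # Copy the user and key databases if the old installation has them.
    for name in user_dbs:
        source = old_home / "etc" / name
        if source.is_file():
            (new_home / "etc").mkdir(parents=True, exist_ok=True)
            shutil.copy2(source, new_home / "etc" / name)
            copied.append(f"etc/{name}")

    return copied
```

Customised files such as main.cfg, style.css and macros are not handled here, since which of them to carry over depends on what was changed.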

Further Notes on Installation and Running

Apache HTTPD Notes

Greenstone binary releases come with the Apache HTTPD web server precompiled and installed by default into Greenstone/apache-httpd.

  • To uninstall it, delete the Greenstone/apache-httpd folder. To disable the use of it, rename the apache-httpd folder to something else (then you can rename it back if you change your mind later).
  • If you have an existing Apache web server installed and you want to set it up to serve your Greenstone, copy the appropriate bits out of Greenstone's Apache httpd.conf file into your existing Apache's httpd.conf file. Then disable (or uninstall) Greenstone's Apache as described above.
  • If you want to use an alternative webserver, then set it up appropriately, and disable the Greenstone Apache server.
  • If you had installed Apache Httpd previously for the sole purpose of serving Greenstone, then you may like to uninstall it and use the one installed by Greenstone.

Additional notes to compiling manually

On Windows, use a DOS prompt to go into your Greenstone installation folder. You will need Visual C++ (either from Visual Studio or the Express version) and you may also need the Windows/Microsoft Platform SDK installed. FIRST run the Platform SDK's SetEnv.Cmd (if you have it). THEN run Visual C++'s vcvars32.bat (or vsvars32.bat). Now you can compile manually:

  • To compile up server.exe, run the following commands (each takes several minutes)
nmake /f win32.mak
nmake /f win32.mak LOCAL_LIBRARY=1
  • If you only want to compile up the apache web server, type:
nmake /f win32.mak APACHE_HTTPD=1
  • If you wish to clean the files generated during compilation (both intermediate files and binaries), type:
nmake /f win32.mak clean
  • Note that if you wish to compile things up (or clean) for debugging, then in all the above commands you would append
DEBUG=1

On Linux and Mac, configuring and compiling generally takes the form:

./configure
make
make install
  • By default, Greenstone is compiled with accent folding turned on. To disable it, you would run the configure step as follows:
./configure --disable-accentfold

As stated in the installation instructions, to compile the included apache web server, the configure step needs to be:

./configure --enable-apache-httpd
  • You can get rid of the files generated by compilation by using a Terminal to go into your Greenstone installation folder and running:
make clean

To clean all the files generated during both compilation AND configuration (all config files, other intermediate files and binaries), you would run the following instead:

make distclean

Additional notes to running Greenstone on Windows

On Windows, running the Greenstone Librarian Interface (GLI) or the Greenstone Server Interface (GSI) manually from a DOS prompt could be useful in diagnosing anything that goes wrong, since it keeps any messages that were displayed during program execution visible in the DOS window.

To run GLI or GSI from the DOS prompt, first go into your Greenstone installation directory and then

  • to run the GSI, type:
gs2-server.bat
  • to run GLI, type:
gli\gli.bat
  • If you have trouble running gs2-server.bat (For example, getting the error "Could not find the main class: org.greenstone.server.Server2. Program will exit."), then you can run gsicontrol.bat instead. (See further below.)

Notes on using a Terminal or DOS prompt

On Macs, the Terminal is an application that can be found under Applications > Utilities > Terminal.

On Windows, you can start up a DOS prompt by going to Start > Run and then typing cmd.

To go to your Greenstone installation directory using your terminal, you would type:

cd <here you'd type the full path to your Greenstone installation folder>

On Windows you would use backslashes (\) in file paths, and on Linux and Mac you would use forward slashes (/).

On Linux and Mac, to run a shell script (Greenstone's shell scripts are files that end in *.sh or *.bash), you would precede the script name with a ./

On Windows, to run a batch script (files that end in *.bat), just type its name out in full.

E.g. on Windows:

cd C:\Greenstone
gs2-server.bat

E.g. on Linux or Mac:

cd /home/me/greenstone
./gs2-server.sh

Using the gsicontrol script

The gsicontrol.sh/bat script is used by gs2-server.sh/bat, and provides much functionality: you can use it to change port number, start and stop the Apache web server, etc. It accepts many parameters like:

 web-start 
 web-stop
 web-restart 
 
 configure-admin
 configure-web
 configure-apache 
 configure-cgi 
 reset-gsdlhome 
 
 set-port 
 
 test-gsdlhome 
 web-stop-tested

You can use it as in the following example

  • In a command window, go to your Greenstone installation folder and run setup.bat (if on Windows) or 'source setup.bash' (for Linux/Mac)
  • Then run "gsicontrol.sh/bat set-port". It will ask for a port number. Use e.g. 8282 (just to avoid conflicts with standard ports).
  • Now run "gsicontrol.sh/bat web-start". (You would use the ".sh" extension on Linux and Mac machines and ".bat" on Windows machines.) Doing so will run the Apache web server.
  • In a browser, enter "http://localhost:8282". It should show the message "It works" indicating that Apache is running.
  • Then type "http://localhost:8282/greenstone/cgi-bin/library.cgi". It should show the Greenstone home page.
  • To stop the webserver at any point, from your command window run "./gsicontrol.sh web-stop" on Linux/Mac and "gsicontrol.bat web-stop" on Windows.
  • If you move your Greenstone 2.86 installation folder to another location at any point, then (with the server still stopped), you would need to run "./gsicontrol.sh reset-gsdlhome" on Linux and "gsicontrol.bat reset-gsdlhome" on Windows.
  • If you forgot the admin password (as is required to access the Administration Pages and to use Remote GLI), this can be reset by running "./gsicontrol.sh configure-admin" on Linux and "gsicontrol.bat configure-admin" on Windows.

Notes on using GLI

In GLI's File > File Associations, you can set which applications are to be called to open files with different file extensions.

  • On Mac, you can type open %1 for all of these, which then lets the Mac open each file with the default application for its file extension. You may need to find the right command for your version of *Nix.
  • To do the same on Linux, type xdg-open %1 (or if you are specifically on a gnome system, then use gnome-open %1, while on a kde system you'd use kde-open %1).
  • To do the same on Windows, type cmd.exe /c start "" "%1".

Working with Remote Greenstone and the GLI-Client

Instructions

These instructions are more Greenstone 2.86-specific than the general instructions for setting up Greenstone 2 as a remote server.

The following are steps to follow if you're on Windows. On Linux, you can skip steps 1 and 2, otherwise things are similar. For instance, you'll want to launch *.bash or *.sh script equivalents to the batch files listed. Also, you'll want to use forward slashes (/) instead of the Windows' backward slash (\) when specifying file paths.

1. If the path to your Greenstone installation contains any spaces (i.e. if any of the containing folders wherein your Greenstone is ultimately located contain spaces in their names), open cgi-bin/gsdlsite.cfg in a plain text editor and make sure that the value for the GSDLHOME line contains quotes around it. E.g.

gsdlhome "C:\Program Files\Greenstone2"

Save any changes.

2. Rename server.exe in your Greenstone installation folder to something else, say "_server.exe".

This is because you will need to use the included Apache web server for the remote Greenstone. By renaming the default library server in Greenstone 2, Greenstone will next look for the apache web server.

3. Now run the Apache web server included with your Greenstone from the Windows Start Menu, or by opening a DOS prompt, changing to your Greenstone 2 installation folder and then running the gs2-server script. E.g.

cd C:\Program Files\Greenstone2
gs2-server.bat

Alternatively, you could use Windows Explorer to locate the gs2-server.bat file in your Greenstone2 installation folder and double click that file.

4. A dialog (the Greenstone Server Interface) will display. If you want remote clients accessing your Remote Greenstone Server, go to File > Settings of this dialog and tick "Allow External Connections" and choose either "Get local IP and resolve to a name" or "Get local IP". Finally, press the dialog's central Enter Library button.

It will open a browser and take you to a page like: http://localhost/greenstone/cgi-bin/library.cgi

OR: http://<YOUR-MACHINE-NAME:YOURPORT>/greenstone/cgi-bin/library.cgi (where if port were the default 80 it won't be displayed, e.g. http://<YOUR-MACHINE-NAME>/greenstone/cgi-bin/library.cgi)

5. Replace the "library.cgi" part of the URL in the browser with:

gliserver.pl?cmd=check-installation

E.g. http://localhost/greenstone/cgi-bin/gliserver.pl?cmd=check-installation (OR: http://<YOUR-MACHINE-NAME:YOURPORT>/greenstone/cgi-bin/gliserver.pl?cmd=check-installation)

At the end of the browser page, it is imperative that it says something like:

"Installation OK!"

(If not, check the error messages.)

6. You may need to give read and write access to the collect folder. Once again, open a DOS prompt. Type the following, or the equivalent of the following for your computer's locale, but make sure to type the path to *your* Greenstone2 installation (the example below uses C:\Program Files\Greenstone2\collect):

cacls "C:\Program Files\Greenstone2\collect" /P Everyone:F

On Linux you would do:

chmod -R a+rw /my/path/to/my/Greenstone2.86/collect

(If, on Vista or Windows 7, you installed Greenstone in an Admin area, such as in Program Files, then you would need to change the security settings of the collect directory: Right-click > Properties, then set the folder to "Everyone".)

7. Use the browser to go to your Greenstone home web page again.

  • Now click on the Administration Page link and add a new user:
  • Click the Add a New User link to the left
  • You'll be requested for the admin username (type "admin") and password, which will be what you chose upon installing Greenstone.

8. Enter the username and password for the new user.

  • In the Groups field, type "personal-collections-editor".
  • Press the Submit button.

9. You can connect to this server from the Client-GLI application included with any Greenstone installation. Either on the current machine or another machine (assuming you want the Greenstone server on one machine and the client on another), use the "Remote Librarian Interface (Client-GLI)" shortcut to launch Client-GLI. Alternatively, you can launch it from the command line, such as by opening a new DOS prompt, going to the gli folder of your Greenstone 2 installation, and running client-gli.bat. E.g.

cd C:\Program Files\Greenstone2\gli
client-gli.bat

10. A dialog will eventually appear asking you for the URL of the Remote Greenstone server's gliserver.pl file.

  • If your client-gli is running from a different machine to where your Greenstone server is running, you need to specify the name of that remote machine hosting the Greenstone server: http://<YOUR-MACHINE-NAME:YOURPORT>/greenstone/cgi-bin/gliserver.pl
  • If the client-gli is running on the same machine, you can generally type "localhost": http://localhost/greenstone/cgi-bin/gliserver.pl

11. It will next ask you for a username and password. Type the values you entered for the new user you created in step 8.

12. The client-GLI dialog should finally open, and it will look and behave mostly the same as the usual (local) GLI.
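The URLs used in steps 5 and 10 differ from the library URL of step 4 only in the script name and query string. A small Python sketch of that rewriting follows; the function name is hypothetical, and it assumes gliserver.pl sits in the same cgi-bin folder as library.cgi, as in the example URLs above.

```python
from urllib.parse import urlsplit, urlunsplit


def gliserver_url(library_url, check_installation=False):
    """Derive the gliserver.pl URL from a Greenstone library.cgi URL."""
    parts = urlsplit(library_url)
    # Swap the final path component (library.cgi) for gliserver.pl.
    path = parts.path.rsplit("/", 1)[0] + "/gliserver.pl"
    query = "cmd=check-installation" if check_installation else ""
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))
```

Called with check_installation=True it gives the address to open in the browser for step 5; without it, it gives the address to enter in the Client-GLI dialog of step 10.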

Setting up your Greenstone OAI Server and using GLI to download over OAI from a Greenstone server

Useful resources

To get Greenstone 2's OAI Server to work, edit the file etc/oai.cfg in your GS2 installation directory and provide values for the properties repositoryName and repositoryId. E.g.

repositoryName "Greenstone" 
repositoryId "greenstone"

In addition, you will need to edit the same etc/oai.cfg file to also list any collections you want served over OAI. Add such collections by name to the oaicollection property. For example, if you have a Greenstone collection called "oaipdf" and want it served over OAI, then you would append its name to the property oaicollection as follows:

oaicollection demo documented-examples/oai-e oaipdf

If you're validating your OAI server and wish to test the resumptionToken, also set the resumeafter property to be a much lower value. E.g.

resumeafter 5
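If you maintain several servers, the edit to the oaicollection line could be scripted. The Python sketch below works on the text of oai.cfg and assumes, based only on the examples above, that each property is a single line of whitespace-separated values; the function name is hypothetical.

```python
def add_oai_collection(cfg_text, collection):
    """Return cfg_text with collection appended to the oaicollection line.

    If the collection is already listed the text is returned unchanged
    (apart from a normalised trailing newline). If no oaicollection
    line exists, one is added at the end.
    """
    lines = cfg_text.splitlines()
    for index, line in enumerate(lines):
        fields = line.split()
        if fields and fields[0] == "oaicollection":
            if collection not in fields[1:]:
                lines[index] = line.rstrip() + " " + collection
            break
    else:
        # No oaicollection property found: add one.
        lines.append("oaicollection " + collection)
    return "\n".join(lines) + "\n"
```

Applied to a file containing the line "oaicollection demo documented-examples/oai-e" with the collection name "oaipdf", it produces the oaicollection line shown above.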

For each collection meant to be served over OAI, edit the collection's etc/collect.cfg file by filling in necessary data. At a minimum, the email values for the creator and maintainer fields need to be set to something sensible if you want to validate your OAI server against an online validator:

creator <type your email here> 
maintainer <type your email here>

Moreover, if you wish to validate your OAI server against the OpenArchives validator, you will need to remove the default admin email (greenstone@…) in the creator and maintainer email fields from the collect/demo/etc/collect.cfg file. Check that the "greenstone@…" dummy email is not present in the collect.cfg file of any other demonstration collection that's listed in greenstone's oai.cfg file.

If you wish to validate your OAI server, visit http://www.openarchives.org/Register/ValidateSite.

However, before you can validate your OAI server and before you can try testing if GLI can download over OAI from it, there are a few things you need to do to make your Greenstone OAI server accessible to the outside world:

  • If you're on Windows, you will likely need to use the included Apache web server instead of the Local Library Server. To change over to using the Apache web server, change the file extension of server.exe, which is found in the top level of your Greenstone installation (you can rename it to server.not for instance).
  • Your Greenstone server machine's firewall and virtual server (port-forwarding) settings may need to be set up such that the Greenstone server can be made accessible to the outside world. This is not something Greenstone can do; you will need to do it yourself.
  • Once that's ready, you also need to tell the Greenstone server that it should make itself accessible to the outside world by turning on the "Allow External Connections" option in the File > Settings dialog of the Greenstone Server Interface application. In the same File > Settings dialog, choose the setting that allows the server to use the machine's hostname or host IP instead of "localhost".
  • Press the Enter library button in the Greenstone Server Interface, and it should open the browser on the home page of your digital library. You will see a URL like:
http://<host>:<port>/greenstone/cgi-bin/library.cgi

Change it to

http://<fully-qualified-host>:<port>/greenstone/cgi-bin/oaiserver.cgi

Note that you want the full hostname including domain. This is the URL you will want to feed into the OAI Validator at http://www.openarchives.org/Register/ValidateSite. If the URL is not accepted for some reason, try pasting it in a new browser window and appending "?verb=Identify" to it, to see what the OAI validator gets to see:

http://<fully-qualified-host>:<port>/greenstone/cgi-bin/oaiserver.cgi?verb=Identify
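The same check can be made from a script instead of a browser. The Python sketch below builds the Identify request and pulls a few fields out of the response. It relies only on the standard OAI-PMH 2.0 response layout (the namespace and the repositoryName, baseURL and adminEmail elements); the function names are hypothetical, and how Greenstone fills these fields from its configuration files is not something this sketch assumes.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen

# Namespace of OAI-PMH 2.0 responses, in ElementTree's {uri}tag form.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def identify_url(base_url):
    """Append the Identify verb to the oaiserver.cgi base URL."""
    return base_url + "?verb=Identify"


def parse_identify(xml_data):
    """Extract selected fields from an Identify response (str or bytes).

    Returns an empty dict if the response has no Identify element,
    for example when the server answered with an OAI error.
    """
    root = ET.fromstring(xml_data)
    identify = root.find(OAI_NS + "Identify")
    if identify is None:
        return {}
    wanted = ("repositoryName", "baseURL", "adminEmail")
    return {name: identify.findtext(OAI_NS + name) for name in wanted}


def fetch_identify(base_url, timeout=10):
    """Request Identify from a live server and parse the answer."""
    with urlopen(identify_url(base_url), timeout=timeout) as response:
        return parse_identify(response.read())
```

An empty result, or a baseURL that still says localhost, is a hint that the server is not yet presenting itself under its fully qualified hostname.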

For further information on your Greenstone OAI Server, read through OAI.

Notes on setting up your Linux system to work with filename encodings alien to your filesystem settings

UTF-8 is a common encoding used in filesystems and for data content.

If you are working on a UTF-8 system, then Java (and therefore GLI) will not give you access to files that do not have UTF-8 filenames. This means that in GLI, attempts to drag and drop files with names that are not UTF-8 will fail on such systems.

GLI will allow one to drag and drop files if the filesystem encoding was set to something that preserved the byte values of filenames (instead of destructively replacing characters that are not valid for the filesystem encoding with an "invalid" character, as happens with UTF-8 systems). In practice, this means that a filesystem encoding such as "Native Latin 1" (also called ISO-8859-1), which is a subset of Unicode, will preserve the underlying byte values in filenames, allowing you to drag and drop all sorts of filenames in GLI.
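The difference between a byte-preserving encoding and UTF-8 can be seen in a few lines of Python. Python merely stands in for Java here, and the filename bytes are made up for the example: a name containing the single byte 0xE9 (é in ISO-8859-1), which is not valid UTF-8 on its own.

```python
def roundtrip(raw_name, encoding, errors="strict"):
    """Decode filename bytes and encode them again with one encoding."""
    return raw_name.decode(encoding, errors=errors).encode(encoding)


# Example filename bytes: "caf" + 0xE9 + ".pdf" (illustrative only).
raw = b"caf\xe9.pdf"

# ISO-8859-1 maps every byte to a character, so nothing is lost.
via_latin1 = roundtrip(raw, "iso-8859-1")

# UTF-8 cannot decode the lone 0xE9; with errors="replace" it becomes
# U+FFFD, and the original byte value can no longer be recovered.
via_utf8 = roundtrip(raw, "utf-8", errors="replace")

print(via_latin1 == raw)  # True
print(via_utf8 == raw)    # False
```

Once the original bytes are lost, the name no longer matches the file on disk, which is the kind of failure described above for drag and drop on UTF-8 systems.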

Drag And Drop in GLI will work by default on Windows since it is not a UTF-8 filesystem, but rather one that has a large overlap with Native Latin 1.

However, some Linux systems are set to UTF-8 by default, while others do not even have other encodings installed so you can't switch over.

The solution to making GLI work with "alien filename encodings" on such Linux systems is to switch the encoding to Native Latin 1 (this is regardless of what encoding your filenames are in). Where this is not installed, you would require Admin rights to install Native Latin 1 before switching to it. The following contains instructions on doing both. Note that switching between installed encodings does not require Admin rights.

INSTALLING AND APPLYING A NEW FILESYSTEM ENCODING ON A LINUX MACHINE:

The instructions are derived from the thread of questions and answers openjdk and this page at Ubuntu Forums.

First find out whether you are already working with a Linux system set to Native Latin 1 (ISO-8859-1). Check by typing the following in an x-term:

locale -k LC_CTYPE | grep charmap

If the settings are indeed set to Native Latin 1, it should tell you that (en_US.)ISO-8859-1 is active.

A) INSTALLATION OF A NEW FILESYSTEM ENCODING (Native Latin 1/ISO-8859-1):

Installation of Native Latin 1 (ISO-8859-1), which requires Admin rights, may not be required: check if this encoding is already installed on the machine first. You can check by running the following two commands in an x-term:

export LC_ALL=en_US.ISO8859-1 
export LANG=en_US.ISO8859-1

If it doesn't come back with any messages that look like failure (such as the encoding not being found), then it is installed and should now be active. Otherwise you need Admin permissions to install Native Latin 1 (ISO-8859-1) on your Linux system, as follows:

1. Open /var/lib/locales/supported.d/local as Root user and, at the bottom of the file, add the line:

en_US.ISO-8859-1 ISO-8859-1

2. Repeat the above step with the file /var/lib/locales/supported.d/en

3. Optional: Only if you wish to make the Native Latin 1 encoding the system default would you need to open /etc/default/locale as Root and change LANG="en_US.UTF-8" to LANG="en_US". (Or possibly LANG="en_US.ISO-8859-1".)

4. Then in an x-term, run the following to install the new encoding:

sudo locale-gen --purge

5. Restart the machine.

The above 5 steps need to be carried out once for en_US.ISO-8859-1 (Native Latin 1) to be supported by the machine. You would still need to apply the new encoding.

B) APPLYING THE NEWLY INSTALLED ENCODING AS THE FILESYSTEM (AND DISPLAY) ENCODINGS:

6. Having restarted the machine, to make the newly-installed encoding the active one, run the following commands in an x-term again. You do not need Admin rights for issuing the following two commands:

export LC_ALL=en_US.ISO8859-1 

export LANG=en_US.ISO8859-1

7. You can check if it all worked by running:

locale -k LC_CTYPE | grep charmap

Or by running:

locale

It should tell you that (en_US.)ISO-8859-1 is active.

8. Now run GLI from the same x-term to allow it to work with the Native Latin 1 filesystem encoding settings.

Using Greenstone Plugin Extensions to process docx files and recent versions of PDF

Two extensions are available for download: Open Office and PDF Box, to process more recent versions of MS Office documents and PDF documents respectively.

OpenOffice

  • The Open Office extension provides a document conversion facility if Open Office or LibreOffice is already installed on the system. In order to use the Open Office extension,
  • You will need Open Office installed. You may need to create an environment variable called SOFFICE_HOME and set this to the full path of your OpenOffice or LibreOffice installation folder, if:
    • you're on Windows and have OpenOffice/LibreOffice installed in a location other than "C:\Program Files\OpenOffice.org 3". In that case, also ensure that your PATH environment variable contains the path to the "program" folder located in your SOFFICE_HOME path (the OpenOffice installation folder).
    • you're on Linux and have OpenOffice or LibreOffice installed in a location different from "/opt/openoffice.org3" or "/usr/lib/openoffice" (or "/usr/lib/libreoffice").
  • Once you have Open Office set up, download the Greenstone extension for it from here, which is available in tar.gz and zip formats, and unzip into Greenstone's ext folder. (If you have any issues try the latest version located here. Note that if you get the latest version of the open office extension, you cannot already have an instance of OpenOffice running when using GLI, you will need to terminate any previously running instance. It is also unlikely that you can get a separate instance of OpenOffice running after quitting GLI. If you wish to do so, you will need to use Task Manager to terminate the open office process launched by the extension upon running GLI.)
  • Before you can use this (or any other Greenstone extension), you will need to quit GLI and GS2-server if either are open and then you will need to relaunch GLI (or run Greenstone scripts) from a fresh command terminal, in order for the extension to become available in the Greenstone environment.
  • With OpenOffice and the extension installed and the Greenstone environment set up for this, Greenstone's Word, PowerPoint and Excel Plugins will have a new option, "-openoffice_conversion", allowing conversion with Open Office instead of the existing converter. Switching on this new option means that more recent Office formats like docx can be included in Greenstone collections and processed by Greenstone.

PDFBox

  • The PDF Box extension provides support for conversion of PDF documents to text. It supports the latest PDF versions (unlike Greenstone's standard pdftohtml program), so is useful for collections with new PDF documents.
  • Download the extension from here, which is available in tar.gz and zip formats, and unzip into Greenstone's ext folder. The PDF Box extension does not require additional software to be installed.
  • Before you can use the extension, you will need to quit GLI and GS2-server if either are open and then you will need to relaunch GLI (or run Greenstone scripts) from a fresh command terminal, in order for the extension to become available in the Greenstone environment.
  • PDFBox generates HTML documents from the PDF that may contain more whitespace between lines and paragraphs than you'd wish. In such a case, you can fix this on a per-collection basis using GLI. Open your collection in GLI, go to the Format panel, select Format Features to the left and DocumentText to the right. In the text-area for HTML Format String below, create an HTML style element to set the top and bottom margins on a paragraph element to 0. You need to escape curly braces with a backslash. In the end your format statement for DocumentText will look like: <br/>
<style>
  p \{
    margin-top:0;
    margin-bottom:0;
  \}
</style>
[Text]
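If you want to paste a larger stylesheet into a format statement, escaping the braces by hand gets tedious. A minimal Python sketch of the escaping step follows; the function name is hypothetical, and it only implements the rule stated above (a backslash before each curly brace), skipping braces that are already escaped.

```python
import re


def escape_format_braces(text):
    """Put a backslash before every curly brace not already escaped."""
    return re.sub(r"(?<!\\)([{}])", r"\\\1", text)
```

For example, the input "p { margin-top:0; }" becomes "p \{ margin-top:0; \}", and running the function a second time leaves that result unchanged.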

Usage history

If you wish to get some usage output into a file:

1. Open up etc/main.cfg and edit the line that says

logcgiargs      false

to say:

logcgiargs      true

Save the file.

2. You will need to manually create a new text file called "usage.txt" in Greenstone's etc folder. This step will not be necessary in future versions of Greenstone.

3. Run the web server and usage.txt should become populated with information.
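Steps 1 and 2 can also be scripted. The Python sketch below is one way to do it under stated assumptions: the function names are hypothetical, and main.cfg is assumed to contain the logcgiargs line in the "name, whitespace, value" form shown above. The file is read and written as Latin-1 only because that round-trips every byte, so the rest of the file is left untouched whatever its real encoding.

```python
import re
from pathlib import Path


def set_logcgiargs_true(cfg_text):
    """Return cfg_text with 'logcgiargs false' switched to true."""
    return re.sub(r"(?m)^(\s*logcgiargs\s+)false\b", r"\g<1>true", cfg_text)


def enable_usage_log(gsdl_home):
    """Edit etc/main.cfg and create an empty etc/usage.txt if missing."""
    etc = Path(gsdl_home) / "etc"
    cfg = etc / "main.cfg"
    original = cfg.read_text(encoding="latin-1")
    cfg.write_text(set_logcgiargs_true(original), encoding="latin-1")
    (etc / "usage.txt").touch(exist_ok=True)
```

After running enable_usage_log on your installation folder, restart the web server as in step 3.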

IMPORTANT information

Security

If you are using your own Apache web server, the users database file is publicly accessible, allowing access to its passwords. To prevent access to the users database and other database files in your Greenstone 2 installation, in a text editor, edit your Greenstone folder's

  • apache-httpd/linux/conf/httpd.conf.in file if you're on Linux
  • apache-httpd/windows/conf/httpd.conf.in file if you're on Windows

If you want to use a different Apache web server in place of the one included with Greenstone, you'll want to set the ScriptAlias and Alias for your Greenstone folders in your apache httpd.conf file as below. The aim is to deny access to all parts of Greenstone except the folders web, collect, cgi-bin, tmp and bin/java (for the GLIApplet):

  ScriptAlias /greenstone/cgi-bin "**GSDLHOME**/cgi-bin/**GSDL_OS_ARCH**"
  <Directory "**GSDLHOME**/cgi-bin/**GSDL_OS_ARCH**">
     Options None
     AllowOverride None
     Order deny,allow
     **CONNECTPERMISSION** from all
     Allow from 127.0.0.1 **HOST_IP** **HOSTS** localhost
  </Directory>

  Alias /greenstone/collect "**COLLECTHOME**"
  <Directory "**COLLECTHOME**">
     Options Indexes MultiViews FollowSymLinks
     AllowOverride None
     Order deny,allow
     **CONNECTPERMISSION** from all
     Allow from 127.0.0.1 **HOST_IP** **HOSTS** localhost 

     RewriteEngine On
     RewriteRule ^([A-Za-z0-9-]+)/about/?$         /greenstone/cgi-bin/library.cgi?c=$1&a=p&p=about [L]
     RewriteRule ^([A-Za-z0-9-]+)/query/?$         /greenstone/cgi-bin/library.cgi?c=$1&a=q [L]
     RewriteRule ^([A-Za-z0-9-]+)/document/([^/]+)$  /greenstone/cgi-bin/library.cgi?c=$1&a=d&d=$2 [L]
     RewriteRule ^([A-Za-z0-9-]+)/document/(.*?)/(.*)$  /greenstone/cgi-bin/library.cgi?c=$1&a=d&d=$2&$3 [L]
     RewriteRule ^([A-Za-z0-9-]+)/$                /greenstone/cgi-bin/library.cgi?c=$1&a=p&p=about [L]
     # RewriteRule ^([A-Za-z0-9-]+)/(.*?)$         /greenstone/cgi-bin/library.cgi?c=$1&$2 [L]
  </Directory>  
  
  # Deny access to all except collect, web, tmp and bin/java (for GLI-applet) folder
  # http://httpd.apache.org/docs/2.2/mod/core.html#directory
  <Directory />
    Order Deny,Allow
    Deny from all
  </Directory>

  Alias /greenstone/web "**GSDLHOME**/web"
  <Directory "**GSDLHOME**/web">
    Order Deny,Allow
    **CONNECTPERMISSION** from all
    Allow from 127.0.0.1 **HOST_IP** **HOSTS** localhost
  </Directory>

  Alias /greenstone/tmp "**GSDLHOME**/tmp"
  <Directory "**GSDLHOME**/tmp">
    Order Deny,Allow
    **CONNECTPERMISSION** from all
    Allow from 127.0.0.1 **HOST_IP** **HOSTS** localhost
  </Directory>

  Alias /greenstone/bin/java "**GSDLHOME**/bin/java"
  <Directory "**GSDLHOME**/bin/java">
    Order Deny,Allow
    **CONNECTPERMISSION** from all
    Allow from 127.0.0.1 **HOST_IP** **HOSTS** localhost
  </Directory>

In the above, replace the placeholders accordingly:

  • GSDLHOME: the full path to your Greenstone installation folder
  • GSDL_OS_ARCH: the OS of your server machine ("windows", "linux", or for macs "darwin").
  • CONNECTPERMISSION: this can be either "Deny" or "Allow". If you want public access to your web, collect and cgi-bin folders, set it to Allow.
  • HOST_IP: space-separated list of any particular IP addresses that you want to make the Greenstone server accessible from. At a minimum, you'll want to set it to the IP of your server machine itself.
  • HOSTS: space-separated list of hostnames of the machines that you want to allow access to the Greenstone server. At a minimum, set it to the hostname of your server machine itself.
  • COLLECTHOME: this is the full path to where your collect folder is. By default, it is GSDLHOME/collect, unless you've installed it elsewhere.
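
As a rough illustration of the substitution described above, the following Python sketch fills in the `**NAME**` placeholders in a fragment of the configuration. The function name `fill_placeholders` and all the example values (the installation path, IP address and hostname) are made up for this example and must be replaced with your own settings.

```python
import re

def fill_placeholders(template: str, values: dict) -> str:
    """Replace each **NAME** marker with values[NAME]; fail on unknown names."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name}")
        return values[name]
    return re.sub(r"\*\*([A-Z_]+)\*\*", substitute, template)

# Example values only (assumptions): adjust every one for your own server.
example_values = {
    "GSDLHOME": "/usr/local/Greenstone",
    "CONNECTPERMISSION": "Allow",
    "HOST_IP": "192.0.2.10",
    "HOSTS": "gs.example.org",
}

# A fragment of the configuration shown above, with placeholders intact.
template = '''Alias /greenstone/web "**GSDLHOME**/web"
<Directory "**GSDLHOME**/web">
  Order Deny,Allow
  **CONNECTPERMISSION** from all
  Allow from 127.0.0.1 **HOST_IP** **HOSTS** localhost
</Directory>'''

filled = fill_placeholders(template, example_values)
print(filled)
```

Raising an error for an unknown placeholder, rather than silently leaving it in place, makes it harder to end up with a half-filled configuration file.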

Useful information

When you've built a collection of documents, you may discover that there appears to be a copy of all these documents in the collection's import, archives and index subfolders, and wonder whether Greenstone could really be so inefficient with space as to keep 3 copies of everything. As it happens, Greenstone uses hard links on both Linux and Windows so that only one set of your documents is kept on disk. The other folders simply contain hard links to these files instead of copies. By default, Windows doesn't show you when files on your filesystem are hard linked. If you choose to install the Windows extension program Link Shell Extension (LSE), it will put red arrows on files that are hard linked.
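
If you want to see what a hard link looks like in practice, the following small Python sketch may help. It is not Greenstone code: the file names are invented, and it only demonstrates the general operating system behaviour, using a temporary folder on a filesystem that supports hard links.

```python
import os
import tempfile

def are_hard_linked(path_a: str, path_b: str) -> bool:
    """True if both paths refer to the same file on disk."""
    return os.path.samefile(path_a, path_b)

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical file names standing in for a document in two folders.
    original = os.path.join(workdir, "import_doc.txt")
    linked = os.path.join(workdir, "archives_doc.txt")
    with open(original, "w") as handle:
        handle.write("one copy of the data\n")
    os.link(original, linked)  # second name for the same data, not a copy
    link_count = os.stat(original).st_nlink
    same_file = are_hard_linked(original, linked)
    print(link_count, same_file)
```

A link count greater than 1 is the sign that a file has more than one name pointing at the same data.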

Known Issues and Patches

Problem: authenticated collections require constant authentication

Thanks to Diego for suggesting this workaround: The GS 2.86 installer does not include a key.gdb file, so if you have a collection with authentication you must enter your username and password for every page you visit. However, if you copy in any key.gdb file, it works fine.

Grab this empty key.gdb file and place it in your Greenstone 2 installation's etc folder. Then re-run the Greenstone 2 server, and you will only need to authenticate once.

Malformed UTF-8 character (Fatal error)

If building fails with the log message:

import.pl> Malformed UTF-8 character (UTF-16 surrogate).

Diego Spano explains the problem and work around:

When GS converts a text file like a PDF, it takes the text and converts it to html or text (depending on whether you are on Windows or Linux). That html is then converted to the Greenstone archive format, a kind of xml file. To convert the file to xml, GS has to replace symbols like "<" and many others that are restricted characters in xml because they are part of the xml syntax. For example, the "<" symbol is converted to "&lt;"

It is possible that a PDF document contains something that corrupts this process, that is, some characters inside the PDF cannot be converted.

The workaround for this is to modify your Greenstone installation's perllib/unicode.pm file (line 146) from:

} else {
    # error, don't encode anything
    die;
}

to

} else {
    # error, don't encode anything
    $out .= " ";
}

The intention is to replace each "unknown" char with a simple space. This way, the raw text suffers no modifications and can be indexed by GS.

You can also download it from here:

http://trac.greenstone.org/browser/main/trunk/greenstone2/perllib/unicode.pm
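
To illustrate the idea behind the workaround (this is a Python sketch, not the Perl code in unicode.pm, and the function name `encode_code_points` is invented for the example): each code point is encoded as UTF-8, and any value that cannot be encoded, such as a lone UTF-16 surrogate, is replaced with a space instead of aborting. The last step shows the kind of xml escaping described earlier.

```python
from xml.sax.saxutils import escape

def encode_code_points(code_points):
    """Encode integer code points as UTF-8 bytes, substituting a space
    for any value that cannot be encoded (e.g. a UTF-16 surrogate)."""
    out = bytearray()
    for cp in code_points:
        try:
            out += chr(cp).encode("utf-8")
        except (ValueError, UnicodeEncodeError):
            out += b" "  # same spirit as the patched else-branch
    return bytes(out)

# "A", a lone surrogate (not encodable), "B", and "<"
sample = [0x41, 0xD800, 0x42, 0x3C]
encoded = encode_code_points(sample)
escaped = escape(encoded.decode("utf-8"))
print(encoded, escaped)
```

The trade-off is the same as in the patch: a problem character is silently lost, but the rest of the text survives and can still be indexed.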

Mac OS

Gord Nickerson conducted the following tests on various Mac OS and sent in the results and suggested solutions:

Greenstone-2.86-MacOS-intel-Lions.dmg was downloaded from greenstone.org 
and tested on a 6 core Mac Pro with 24 GB of RAM running OS X 10.7.5 (Lion) 
and OS X 10.9 (Mavericks).

The installer runs and installs successfully, but GLI.app starts and then terminates. 

Greenstone-2.86-MacOS-intel-Leopards.dmg was tested on the same platform; it 
installs and GLI.app works correctly.

Version 2.85 was also tested and works on 10.7.5 (Lion) but was not tested on 
10.9 (Mavericks).

Workaround: on a Mac Lion, use the Leopard version of the 2.86 release rather than 
the Mountain Lion version of the 2.86 release.

Gord also tried out the Leopard binary of 2.86 on a Mac Leopard 10.5.8 and found that the version of Java supported by that Mac did not work with the version of Java required by the Greenstone binary. The Leopard machine here that we used to generate and test the Leopard binary specifically had Java 1.6 installed as the default Java in order to get Greenstone to work.

- If you have Java 1.6 installed on your Leopard, make it the default Java, for which you need to have admin privileges on the machine. Go to the desktop and in the toplevel menu, navigate to Go > Utilities > Java Preferences. There set the default Java version to Java 6.
- Then try to run the Greenstone 2.86 for Leopards on the 10.5.8 machine.

Macs older than OS X 10.5.8 (Leopard)

When it gets to the final installation screen the following error message is returned:

Install failed. Error running the install, java.lang.UnsupportedClassVersionError: 
Bad version number in .class file. 

The user went to Java's site and ran the Java verification, which reported that Apple 
supplies its own version of Java.

The user then went to Apple Support and downloaded Java for Mac OS X 10.5 Update 10. 

The user tried installing Greenstone again and got the same error message.

Workaround: the user installed the 2.85 version of Greenstone.

No action required, information only
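
For background, an UnsupportedClassVersionError like the one above generally means that a Java class was compiled for a newer Java version than the one trying to run it. Every .class file starts with the magic number 0xCAFEBABE followed by a minor and a major version number, where major version 50 corresponds to Java 6. The Python sketch below reads that header. The sample bytes are constructed for the example, and the assumption that the Greenstone installer classes target Java 6 is only inferred from the note above that Java 1.6 was needed.

```python
import struct

# Major class-file versions for some older Java releases.
JAVA_RELEASE_BY_MAJOR = {48: "1.4", 49: "5", 50: "6", 51: "7"}

def class_file_version(header: bytes):
    """Return (major, minor) from the first 8 bytes of a .class file."""
    if len(header) < 8:
        raise ValueError("need at least 8 bytes of class file header")
    magic, minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor

# Constructed example header: magic, minor 0, major 0x32 (= 50).
sample_header = bytes.fromhex("cafebabe00000032")
major, minor = class_file_version(sample_header)
release = JAVA_RELEASE_BY_MAJOR.get(major, "unknown")
print(major, minor, release)
```

To inspect a real class file, you would pass the first 8 bytes read from it in binary mode to `class_file_version`.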

Mavericks

David Forero on the mailing list had a Mac running Mavericks (OS X 10.9.1) with Java 1.7.0_51, and noticed that the GS2.86 release for Mountain Lions did not work on it. We found that the following steps get the 2.86 Mac Mountain Lion release to work on the newer Mac OS, Mavericks.

Before doing anything, you will need to set up the Security settings on your Mac to allow you to run .dmg files downloaded from the internet. Otherwise the Greenstone Mac binary will not run. Then:

1. Go to your Greenstone 2.86 installation folder's "lib/darwin" subfolder and move the file "libsqlite3.dylib" up one level (into the "lib" folder).

2. In a fresh Terminal, go into your GS2.86/gli folder. Then try to run
./gli.sh

3. Open up gli/findjava.sh in a text editor

4. Find the line that says
HINT=/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home

5. Replace the text after the = sign with the full path to JAVA_HOME. Try:
HINT=/Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/

6. Save the file

7. Run GLI from your Greenstone's gli subfolder with:
./gli.sh
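
If you would rather script the edit in steps 3 to 6 than do it by hand, a sketch along the following lines could work. It is only an illustration: the function name `set_java_hint` is invented, and the sample text is a stand-in for the real findjava.sh, of which only the HINT line is known here.

```python
import re

def set_java_hint(script_text: str, java_home: str) -> str:
    """Return script_text with its first HINT= line pointing at java_home."""
    new_text, count = re.subn(
        r"(?m)^([ \t]*HINT=).*$",
        lambda match: match.group(1) + java_home,
        script_text,
        count=1,
    )
    if count == 0:
        raise ValueError("no HINT= line found")
    return new_text

# Stand-in for findjava.sh: only the HINT line comes from the real script.
sample = (
    "#!/bin/sh\n"
    "HINT=/System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home\n"
    "echo done\n"
)
updated = set_java_hint(
    sample, "/Library/Java/JavaVirtualMachines/jdk1.7.0_51.jdk/Contents/Home/"
)
print(updated)
```

To apply it to the real file, read gli/findjava.sh into a string, pass it through `set_java_hint` with your own JAVA_HOME path, and write the result back, keeping a backup copy of the original first.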

Greenstone applets (Phind, Collage) crash Firefox

See bugzilla report.

If attempting to view a Java applet (like the Collage or Phind phrase classifiers) crashes Firefox, then make sure you have the Java Applet plugin installed. If it is installed and Firefox is still crashing, then open Firefox and visit the page

about:config

Scroll down to the property:

dom.ipc.plugins.java.enabled

Set it to true (right-click and choose Toggle).

Problem opening collection in GLI in Ubuntu

Problem encountered on: Ubuntu 12.04, perhaps it applies to other versions.

The error message that's displayed looks like:

Collection at .../collection_name.col cannot be opened

Solution:

  • Delete "gsdl/perllib/cpan/perl-5.8/XML" and "gsdl/perllib/cpan/perl-5.8/auto" folders
  • In a terminal window, enter the following (as root, or prefixed with sudo):
apt-get install libxml-parser-perl

Thanks to Amos Kujenga and Kurt Mattsson on the mailing list for discovering and documenting both the problem and its solution.

Unable to create new users on Ubuntu

Problem encountered on: Ubuntu 12.04 LTS and 12.05 LTS. May be applicable to other versions too.

Detecting the problem:

  • After installing and setting up Greenstone 2.85 on Ubuntu 12.05 LTS, you cannot add new users through the Admin pages.
  • Look in the apache-httpd/linux/logs/error_log file. Check whether there are error messages about the Greenstone installation missing the etc/key.gdb file:
[Fri Jul 13 17:50:35 2012] [error] [client 127.0.0.1] database open failed on: /usr/local/Greenstone/etc/key.gdb, referer: http://localhost/dl/library.cgi
[Fri Jul 13 17:51:09 2012] [error] [client 127.0.0.1] File does not exist: /var/www/favicon.ico
  • Check your Greenstone etc folder to see if the file key.gdb is indeed missing.
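
The log check above can also be done with a few lines of Python. This is only a sketch: the function name `find_missing_key_db` is invented, and the pattern is based solely on the sample error message shown above, so other Greenstone or Apache versions may word the message differently.

```python
import re

# Matches the "database open failed on: <path>/key.gdb" message, even if
# the log line has been wrapped between "open" and "failed".
MISSING_DB = re.compile(r"database open\s+failed on:\s*(\S+key\.gdb)")

def find_missing_key_db(log_text: str):
    """Return the distinct key.gdb paths the log reports as unopenable."""
    return sorted(set(MISSING_DB.findall(log_text)))

sample_log = (
    "[Fri Jul 13 17:50:35 2012] [error] [client 127.0.0.1] database open\n"
    "failed on: /usr/local/Greenstone/etc/key.gdb, referer: "
    "http://localhost/dl/library.cgi\n"
    "[Fri Jul 13 17:51:09 2012] [error] [client 127.0.0.1] File does not\n"
    "exist: /var/www/favicon.ico\n"
)
missing = find_missing_key_db(sample_log)
print(missing)
```

For a real check, read the contents of apache-httpd/linux/logs/error_log into a string and pass that to `find_missing_key_db` instead of the sample text.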

Solution: If you have a working installation of Greenstone elsewhere (on another machine), then copy its etc/key.gdb over.

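Alternatively, place an empty key.gdb file in your Greenstone installation's etc folder, as described under "Problem: authenticated collections require constant authentication" above.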

Thanks to Africa Bwamkuu and Amos Kujenga on the mailing list for reporting and resolving the problem. A more permanent fix seems elusive, since we have not been able to reproduce it on our Ubuntu 12.04 LTS here. The differences in behaviour may be due to differences in environment variables.

Getting applets to work

In later versions of Java, unsigned or self-signed Java applet jar files are not displayed in Windows browsers. To allow your browser to display them, on Windows, press Start and go to Control Panel > Programs > Java. In the Security tab, press the Edit Site List button and press Add to add the base URL of the Greenstone library server. For example, for the nzdl.org site, you would add http://www.nzdl.org.

If accessing the GLI applet, you would follow the same procedure but add the URL of the remote Greenstone server.

CDS/ISIS plugin failing to work on Mac (Mountain) Lions / 64 bit Macs

The CDS/ISIS plugin now works on 64 bit Linux too. However, at the time of the release it did not work on 64 bit Macs including Mac (Mountain) Lions. There is a patch for this now.

If you're using a Mac (Mountain) Lion GS2.86 binary or if you're unable to explode your ISIS file, or tried to process your ISIS files with the ISISPlugin and your files didn't get processed properly, then you may be dealing with a 64 bit Mac too. To check if your Mac is indeed 64 bit, open an x-term and type uname -m and hit enter. If the response is x86_64, then your machine's 64 bit.

If your Mac is a (Mountain) Lion or 64 bit machine:

  1. Put the patched binary in your Greenstone's bin/darwin folder.
  2. Rename the file to IsisGdl, replacing the previous binary that already exists there with this name.
  3. Use an x-term to give this file executable permissions: chmod a+x /path/to/your/GS286/bin/darwin/IsisGdl
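
For those who prefer a script, the 64 bit check and the permissions step can be sketched in Python as follows. The function names are invented for the example, and the demonstration runs on a temporary file rather than on the real IsisGdl binary.

```python
import os
import platform
import stat
import tempfile

def is_64_bit_machine() -> bool:
    """Equivalent of checking whether `uname -m` prints x86_64."""
    return platform.machine() == "x86_64"

def make_executable(path: str) -> None:
    """Equivalent of `chmod a+x path`: add execute permission for everyone."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)

# Demonstrate on a throwaway file instead of the real binary.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    demo_path = handle.name
make_executable(demo_path)
demo_mode = os.stat(demo_path).st_mode
os.remove(demo_path)

print("64 bit:", is_64_bit_machine())
print("executable by owner:", bool(demo_mode & stat.S_IXUSR))
```

On your own machine you would call `make_executable` with the full path to bin/darwin/IsisGdl inside your Greenstone installation.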

GLI Imagemagick issues on Mac OS X Yosemite (10.10.1)

Mac users running v 10.10.1 (OS X Yosemite) may encounter GLI reporting the error message:

Warning. The librarian interface failed to locate an appropriate version of ImageMagick. Images can be included in collections but no image processing functionality will be available, such as converting images to a different type, or generating thumbnails.

Leo Riesthuis on the mailing list found that the ImageMagick Greenstone installs by default does not work on Mac Yosemite. He came up with the following solution to get GLI working for him, which required installing a separate version of ImageMagick:

  1. Install ImageMagick from http://cactuslab.com/imagemagick
  2. Set the environment variable for ImageMagick in /Greenstone/gli/gli.sh by adding the lines:
     ## Set env variable for ImageMagick
     export MAGICK_HOME="/opt/ImageMagick"
  3. Rename or remove the old ImageMagick version at <your-Greenstone>/bin/darwin/imagemagick, or move it out of the way.

Learning to use Greenstone

If your Greenstone is up and running and you're ready to start learning about how to use Greenstone, refer to the Greenstone 2 Tutorial Exercises.

Troubleshooting and other Questions

Please consult the Greenstone FAQ at http://wiki.greenstone.org/wiki/index.php/Greenstone_FAQ to see if any of your questions are answered and for workarounds of known issues. If any issues persist, write to the Greenstone Mailing List.

Updated Translations

Thanks to the following people for new and updated translations since 2.85:

  • Tomáš Fiala for Slovakian translations
  • Kamal Salih Mustafa Khalafala of the University of Khartoum Institute of Environmental Studies Library for Arabic translations, both the work done in 2012 as well as in 2013.
  • Diego Spano for Spanish translations
  • John Rose, Julie Verleyen, Antonin Benoit Diouf, Sandraghassen S. Pillai and Yvan Arnaud for French translations
  • Lavji Zala for Gujarati translations
  • Marcin Karkosz for Polish translations
  • Ata ur Rehman and LiSolutions for Urdu translations
  • Muhammad Shafiq Rana for Urdu translations
  • Dwight Martin for Laotian corrections

And thanks to

  • Sergey Karpov, Zhanat Kulenov and Kalima Tuenbaeva for Kazakh translations contributed in Apr 2009

Important: If you are one of our valued translators and have contributed again since the 2.85 release, and we missed adding your name here due to unfortunate oversight, then your name belongs in the above list. Therefore please contact us and we'll be thrilled to add you in here.

en/release/2.86_release_notes.1441944471.txt.gz · Last modified: 2016/05/17 04:33 (external edit)