This version (2014/11/03 12:44) is a draft.

Fedora Generic Search Installation

Fedora Generic Search (Fedora GSearch) provides full-text indexing and searching for Fedora. This functionality is not offered by Fedora out-of-the-box, which is why it's a good idea to install Fedora GSearch alongside it.

In this document,

  • $FEDORA_HOME refers to the full path to your Fedora installation directory.
  • You need to have Fedora installed in order to install Fedora Generic Search.

For a walkthrough on installing Fedora (and for links to the official Fedora installation instructions), see the Fedora page.

This document describes two ways in which you can install Fedora Generic Search:

  • By downloading Muradora's version of the fedoragsearch.war file and then using the GSearchInstaller application, or
  • by doing the same steps manually.

See also the final section containing Important Notes.

Using the GSearchInstaller

You will need Java 1.5 or higher, which would also have been required to install Fedora itself.

  • Download the GSearchInstaller.jar from the same place where you got the GS3 Web Services Demo-Client (GSearchInstaller.jar is part of the GS3 Web Services Demo-Client distribution file).
  • In a Linux x-term or a Windows DOS prompt, go to the location where your GSearchInstaller.jar is located, and type
java -jar GSearchInstaller.jar

This will launch the installer's opening dialog.

(It is recommended that Windows users don't just double-click on the jar file as the application's output will indicate whether and where anything failed during installation. This output is sent to the DOS console which won't be there if you run the jar-file by double-clicking.)

  • Type in your Fedora server host, port, username and password. Then choose a name for the Fedora GSearch index and repository. (By default, the GS3 Web Services Demo-Client application expects the index name to be FedoraIndex. If you choose something other than FedoraIndex as the index name, follow the steps in the second of the Important Notes at the end of this document.) Finally, browse to the location of your Muradora fedoragsearch.war file.

Note: GSearchInstaller can also be run from the command-line, by passing command-line arguments to GSearchInstaller.jar. To find out what the accepted arguments are, run it with the -help flag:

java -jar GSearchInstaller.jar -help
  • The installer will take a little while to run. If nothing went wrong, it will have installed Fedora Generic Search and indexed the Greenstone documents that were ingested into Fedora using the Fedora Librarian Interface (FLI).

If you're running on Windows, GSearchInstaller will sadly open quite a number of DOS consoles and leave them open. You'll have to manually close each of them (for instance, by typing exit in them).

Manual Installation

  • The detailed installation instructions which I followed were written by the Muradora team.
  • In addition to the steps in those instructions, you will need to make some further changes in order to enable full-text indexing and searching of the Greenstone documents ingested into the Fedora repository.
  • Open the file:
$FEDORA_HOME/tomcat/webapps/fedoragsearch/WEB-INF/classes/config/index/demoFoxmlToLucene.xslt

At present, Muradora's file allows all digital objects in your Fedora repository to be indexed and searched. If you are happy to index just the Greenstone documents stored in Fedora's repository, wrap the existing apply-templates call in the following test:

	<xsl:if test="starts-with($PID,'greenstone')">
	...
	</xsl:if>

So that now the file contains:

<xsl:if test="foxml:digitalObject/foxml:objectProperties/foxml:property[@NAME='http://www.w3.org/1999/02/22-rdf-syntax-ns#type' and @VALUE='FedoraObject']">
  <xsl:if test="starts-with($PID,'greenstone')"> <!-- ADDED THIS -->
     <!--    filter out annotations: -->
     <!--
        <xsl:choose>
        <xsl:when test="foxml:digitalObject/foxml:datastream/foxml:datastreamVersion[last()]/foxml:xmlContent/rdf:RDF/rdf:Description/rel:isMemberOf[@rdf:resource='info:fedora/mura:annotations']">
        </xsl:when> // CORRECTED typo in this comment
        <xsl:otherwise>
     -->
    <xsl:apply-templates mode="activeDemoFedoraObject"/>
     <!--
        </xsl:otherwise>
        </xsl:choose>
     -->
  </xsl:if> <!-- ADDED THIS -->
</xsl:if>
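
Because this file is edited by hand, it can be worth checking that it is still well-formed XML before restarting Tomcat. The following is a minimal sketch in Python, not part of GSearch or Muradora: the function name pid_filter_tests and the cut-down SAMPLE stylesheet are made up for illustration, and the way $PID is defined in SAMPLE is an assumption. The function parses the stylesheet text (any stray or unbalanced tag raises a ParseError) and lists the xsl:if tests that mention $PID.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def pid_filter_tests(xslt_text):
    """Return the test expression of every xsl:if that mentions $PID.

    ET.fromstring raises xml.etree.ElementTree.ParseError if the text
    is not well-formed XML, e.g. after a tag was left unbalanced.
    """
    root = ET.fromstring(xslt_text)
    tests = []
    for node in root.iter("{%s}if" % XSL_NS):
        test = node.get("test", "")
        if "$PID" in test:
            tests.append(test)
    return tests


# Cut-down stand-in for demoFoxmlToLucene.xslt (hypothetical; the real
# file is much longer, and its definition of $PID may differ).
SAMPLE = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:foxml="info:fedora/fedora-system:def/foxml#">
  <xsl:variable name="PID" select="/foxml:digitalObject/@PID"/>
  <xsl:template match="/">
    <xsl:if test="starts-with($PID,'greenstone')">
      <!-- comments are skipped by the parser -->
      <xsl:apply-templates mode="activeDemoFedoraObject"/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>"""

if __name__ == "__main__":
    print(pid_filter_tests(SAMPLE))
```

To check your real file, read its contents and pass the text to pid_filter_tests in place of SAMPLE.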

  • In the file
$FEDORA_HOME/tomcat/webapps/fedoragsearch/WEB-INF/classes/config/index/index.properties

add the full text (and the label) to the list of fields that are searched by default. Do this by appending the following to the value of the fgsindex.defaultQueryFields property:

ds.fulltext ds.label

so that the property becomes:

fgsindex.defaultQueryFields = dc.description dc.title dc.creator dc.identifier ds.fulltext ds.label

The field ds.label and a document's full-text can now be searched and will appear in the search results. (Only a customisable snippet-size of a document's full-text will be present in the results of a search.)
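
If you would rather script this change than edit index.properties by hand, the following Python sketch shows one way to do it. It is an illustration only: the function name add_query_fields is hypothetical, and it assumes the property sits on a single line in the form "name = value", as in the example above. It works on the text of the file and only adds fields that are not already listed, so running it twice does no harm.

```python
def add_query_fields(properties_text, new_fields,
                     key="fgsindex.defaultQueryFields"):
    """Append missing field names to the space-separated value of key.

    Assumes the property is on one line ("key = a b c"). All other
    lines, including comments, are passed through unchanged.
    """
    out = []
    for line in properties_text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            fields = value.split()
            for field in new_fields:
                if field not in fields:
                    fields.append(field)
            line = "%s = %s" % (key, " ".join(fields))
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    before = ("fgsindex.defaultQueryFields = "
              "dc.description dc.title dc.creator dc.identifier\n")
    print(add_query_fields(before, ["ds.fulltext", "ds.label"]), end="")
```

Reading the file from $FEDORA_HOME and writing the result back is left out on purpose; keep a backup copy of index.properties before overwriting it.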

Important notes

  • Once installed, the FedoraGSearch rest url will by default be at:
http://localhost:8080/fedoragsearch/rest

This page provides links for running searches and for browsing the full-text contents of the Fedora repository.
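
Searches can also be requested directly by adding query parameters to the REST URL above. The Python sketch below only builds such a URL; it does not contact the server. The parameter names operation, query and indexName and the operation value gfindObjects are assumptions about the GSearch REST interface, and the function name gsearch_query_url is made up for illustration, so compare the result with the links on your own REST page before relying on it.

```python
from urllib.parse import urlencode


def gsearch_query_url(query, index_name="FedoraIndex",
                      base="http://localhost:8080/fedoragsearch/rest"):
    """Build a search URL for the GSearch REST page.

    The parameter names used here are assumptions; check them against
    the links shown on the REST page of your installation.
    """
    params = [
        ("operation", "gfindObjects"),
        ("query", query),
        ("indexName", index_name),
    ]
    # urlencode percent-escapes characters such as ':' and spaces.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(gsearch_query_url("ds.label:demo"))
```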

  • Make sure that upon installing Fedora Generic Search (whether using GSearchInstaller or installing manually), you either:
    • choose to call your index FedoraIndex, since this is what the demo-client expects by default, or
    • create a file called gs3democlient.properties in the same folder as the demo-client executable, unless such a file already exists. Then, in that properties file, set the property gsearch.indexName to the name of your index as follows:
gsearch.indexName=your-fedora-index-name

If you had to create the properties file, you will have to add this property to it as well. If the file already existed, the property gsearch.indexName will already be in it, and you need only change its value to the name of your own index.
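
Both cases (file missing or file already present) can be handled in one step. The sketch below is an illustration in Python under the assumption that gs3democlient.properties uses plain "name=value" lines; the function name set_index_name is hypothetical. It takes the current text of the file (an empty string if the file does not exist yet) and returns the text with gsearch.indexName set.

```python
def set_index_name(properties_text, index_name, key="gsearch.indexName"):
    """Return the properties text with key set to index_name.

    If the key is already present its value is replaced; otherwise a
    new "key=value" line is appended. Other lines are left as they are.
    """
    lines = properties_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        name, sep, _ = line.partition("=")
        if sep and name.strip() == key:
            lines[i] = "%s=%s" % (key, index_name)
            replaced = True
    if not replaced:
        lines.append("%s=%s" % (key, index_name))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Empty input stands for a properties file that does not exist yet.
    print(set_index_name("", "your-fedora-index-name"), end="")
```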

  • IMPORTANT: Fedora Generic Search only supports indexing and searching Fedora Digital Objects (Fedora repository contents) whose MIME-type is one of
    • text/html
    • text/plain
    • application/pdf

The MIME-type text/xml is not accepted. (Greenstone documents exported to FedoraMETS and ingested into Fedora must therefore have one of the above MIME-types, or they will not be indexed by Fedora Generic Search.)

At present, FLI sets the content-type of the Fedora METS documents it exports to text/xml, so these documents do not yet get indexed by Fedora Generic Search. To have them indexed, use Fedora's Admin-Client application to change the content-type of the documents FLI puts into Fedora from text/xml to text/html.
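
To see in advance which documents would be skipped, the MIME-type rule above can be written down as a small check. This Python sketch is an illustration only: the names is_indexable and not_indexed and the example datastream IDs are hypothetical, and ignoring case and any parameter such as "; charset=UTF-8" is an assumption, not documented GSearch behaviour.

```python
# The three MIME-types listed above as supported.
INDEXABLE_MIME_TYPES = {"text/html", "text/plain", "application/pdf"}


def is_indexable(mime_type):
    """True if mime_type is one of the supported types listed above."""
    # Assumption: compare case-insensitively and drop any parameters.
    base = mime_type.split(";", 1)[0].strip().lower()
    return base in INDEXABLE_MIME_TYPES


def not_indexed(datastreams):
    """Return the IDs from (id, mime_type) pairs that would be skipped."""
    return [ds_id for ds_id, mime in datastreams if not is_indexable(mime)]


if __name__ == "__main__":
    # Hypothetical datastream IDs, for illustration only.
    sample = [("DOC1", "text/xml"), ("DOC2", "text/html"),
              ("DOC3", "application/pdf")]
    print(not_indexed(sample))
```

Any ID returned by not_indexed is a candidate for having its content-type changed in the Admin-Client as described above.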