Compiling Greenstone Advanced

This page contains detailed notes and platform-specific instructions for compiling Greenstone. Simple compilation instructions may be found here.

Platforms
We have tested Greenstone on the following platforms:
 * GNU/Linux:
 * Debian 3.0 (woody) (i386), gcc 2.95.4
 * Debian sid/unstable (i386), gcc 3.2
 * Debian 2.2 (potato) (i386 and ppc rs/6000) (gsdl 2.38)
 * Red Hat 7.3 (i386), gcc 2.96
 * Slackware 8.0.0 (i386), gcc 2.95.3
 * Slackware 7.1.0 (i386) (gsdl 2.38)
 * Solaris 2.8 (sparc), gcc 2.95.2 and gmake.
 * Solaris 2.6 (sparc) using gcc and gmake. (gsdl 2.38)
 * FreeBSD 4.2 (i386) (gsdl 2.38)
 * Cygwin (minor fiddling needed)
 * Darwin / Mac OS X (G4/ppc7400)
 * Microsoft Visual C++ 4.2 and 6.0

If you would like to add other platforms to this list, or inform us of any portability changes required, send mail to the Greenstone mailing list.

Unix Compilation notes
(We include Cygwin here.) The "standard" commands

 $ ./configure
 $ make all
 $ make install

should (hopefully) be all that is required. Currently, Greenstone does not honour the `--prefix' flag, but the directories are self-contained, so it should be possible to move them after installation.

You will probably need to use GNU make.
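As a rough sketch of how to check for GNU make before building, the helper below prints the first of gmake or make that identifies itself as GNU Make. The function name pick_make is made up for this page and is not part of Greenstone.

```shell
# pick_make: print the name of a GNU make binary, preferring gmake.
# (Hypothetical helper, not shipped with Greenstone.)
pick_make() {
    for candidate in gmake make; do
        if command -v "$candidate" >/dev/null 2>&1 &&
           "$candidate" --version 2>/dev/null | grep -q 'GNU Make'; then
            echo "$candidate"
            return 0
        fi
    done
    echo "no GNU make found" >&2
    return 1
}

# Intended use, from the top-level Greenstone source directory:
#   MAKE=$(pick_make) && ./configure && "$MAKE" all && "$MAKE" install
```

On most GNU/Linux systems this prints make; on Solaris or the BSDs it will usually print gmake if GNU make has been installed under that name.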

Version 2.38 of Greenstone compiles with gcc version 3.0 (as well as earlier versions of gcc). Earlier versions of Greenstone will require earlier versions of the compiler -- we have successfully used versions egcs-2.91.66 and gcc-2.95.

The Greenstone Librarian Interface (GLI) code is written in Java, and compiling it requires a suitable version of the Java Software Development Kit (version 1.4.0 or newer). To compile this source code, run makegli.sh from the gsdl/gli directory.


 * Note: version 2.41 of Greenstone (released in December 2003) requires slight modification to compile cleanly with gcc version 3.x. The "Isis" package in Greenstone's "packages" directory requires 3 files to be changed. You can download the 3 files from http://www.greenstone.org/tmp/isis-gdl-fixes.zip, which contains:

   isis-gdl/CRC32.cpp
   isis-gdl/IsisTypes.h
   isis-gdl/Master.cpp

 * If you copy these over the 3 files with the same names in gsdl/packages/isis-gdl, then it should all compile OK with gcc 3.
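The copy step for the Isis fixes could be scripted along these lines. This is only a sketch: apply_isis_fixes is a made-up helper name, and it assumes the zip file has already been unpacked into a directory that contains the isis-gdl folder.

```shell
# apply_isis_fixes FIXES_DIR GSDL_DIR
# FIXES_DIR: where isis-gdl-fixes.zip was unpacked (contains isis-gdl/).
# GSDL_DIR:  top of the Greenstone source tree (contains packages/).
# (Hypothetical helper, not shipped with Greenstone.)
apply_isis_fixes() {
    fixes_dir=$1
    gsdl_dir=$2
    for f in CRC32.cpp IsisTypes.h Master.cpp; do
        cp "$fixes_dir/isis-gdl/$f" "$gsdl_dir/packages/isis-gdl/$f" || return 1
    done
}

# Example: apply_isis_fixes "$HOME/isis-gdl-fixes" "$HOME/gsdl"
```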

The GDBM library and headers are needed - Linux distributions typically come with the library, but may or may not come standard with the header file. Darwin does not come with either. If it is not installed in /usr or /usr/local then you will have to specify where it is by using the configure flag --with-gdbm=/path/to/gdbm/dir. The GDBM library can be downloaded from ftp://ftp.gnu.org/pub/gnu/gdbm/gdbm-1.8.3.tar.gz or a closer mirror.
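To find out whether the gdbm header is present before running configure, something like the following can help. It is a sketch only: find_gdbm_header is a made-up name, and checking both DIR/include/gdbm.h and DIR/gdbm.h is an assumption that covers an installed prefix as well as an unpacked source directory.

```shell
# find_gdbm_header DIR...
# Print the first DIR that holds gdbm.h, either in DIR/include (an
# installed prefix) or directly in DIR (an unpacked source tree).
# Returns non-zero if none of the directories has the header.
# (Hypothetical helper, not shipped with Greenstone.)
find_gdbm_header() {
    for dir in "$@"; do
        if [ -f "$dir/include/gdbm.h" ] || [ -f "$dir/gdbm.h" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Example:
#   dir=$(find_gdbm_header /usr /usr/local "$HOME/gdbm-1.8.3") || echo "gdbm.h not found"
#   case "$dir" in
#       /usr|/usr/local) ./configure ;;
#       ?*)              ./configure --with-gdbm="$dir" ;;
#   esac
```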

The following configure flags add extra functionality to Greenstone:
 * --enable-corba: Creates a CORBA server as well as the .cgi server. This is currently still developmental, and compilation hasn't been tested on many platforms. A Java CORBA client is available from our Subversion repository: svn co http://svn.greenstone.org/other-projects/trunk/java-client
 * --with-micodir: Use an existing MICO compiler for the CORBA server instead of compiling our included version.
 * --enable-z3950: Enable rudimentary Z39.50 client support in the .cgi server.

Once you have the correct compiler, there's a step by step walkthrough on Installing Greenstone 2 from source for Beginners.

Cygwin
Greenstone does not currently compile under Cygwin "out-of-the-box". We had to manually edit some of the Makefiles. More specifically, the packages/mg subtree needed "-ansi" in the CFLAGS, while some parts of the src/mgpp subtree fail with "-ansi": when -ansi is supplied, Cygwin's version of the standard header files does not declare any functions that aren't ANSI (e.g. POSIX or XPG/OPEN functions), even if flags like "_XOPEN_SOURCE" are defined.

Also some of the third-party packages required some manual attention.

Make sure you have the gdbm package installed.

Darwin/Mac OS X
All compilation is currently done through a terminal.

Note that the default filesystem is case-insensitive.

Mac OS X uses a compiler based on gcc version 3, so read the Unix Compilation notes section above for the changes required to the source code for Greenstone version 2.41.

Darwin doesn't come with gdbm, so:
 * 1) Download the gdbm source code from ftp.gnu.org into your home folder, as mentioned previously.
 * 2) Unpack it using the command "tar -zxf gdbm-1.8.3.tar.gz" (without quotes)
 * 3) If you used the older gdbm-1.8.0 (instead of 1.8.3), you will need to update some files that can't figure out what the system type is, e.g.: "cp /usr/libexec/config.sub /usr/libexec/config.guess gdbm-1.8.0"
 * 4) Make (and install) the library: cd gdbm-1.8.3 && make all install (When we did this, we did not have permission to install in a system directory)

If the "make install" command fails to install libgdbm into the default "/usr/local" folder due to permissions:
 * 1) Remove the dynamic libraries, so that the Greenstone files are only linked using the "static" gdbm libraries using the following command (without the quotes): "rm ~/gdbm-1.8.3/.libs/*.dylib"
 * 2) When you configure Greenstone, you should add --with-gdbm=/Users/ /gdbm-1.8.3 (or wherever you installed it to) to the command. Note that this means that the compile option in the CD-ROM distribution Install script will fail, since it only looks in the default locations (/lib, /usr/lib and /usr/local/lib).
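The two steps above for a gdbm that could not be installed system-wide might be combined as in this sketch. prepare_static_gdbm is a made-up helper name; it only deletes the *.dylib files under .libs and prints the matching configure flag.

```shell
# prepare_static_gdbm GDBM_DIR
# Remove the dynamic libraries from GDBM_DIR/.libs so that only the
# static library is left to link against, then print the configure
# flag that points Greenstone at GDBM_DIR.
# (Hypothetical helper, not shipped with Greenstone.)
prepare_static_gdbm() {
    gdbm_dir=$1
    rm -f "$gdbm_dir"/.libs/*.dylib
    echo "--with-gdbm=$gdbm_dir"
}

# Example, from the Greenstone source directory:
#   ./configure "$(prepare_static_gdbm "$HOME/gdbm-1.8.3")"
```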

Also see the OSX Install Notes page.

Issues

 * Due to "upgrades" of the config.guess and config.sub files, version 2.37 of Greenstone might not recognise Mac OS X during the configure. You can either overwrite these files (as above) from /usr/libexec, or try to fake it by adding --host=powerpc-apple-machten (or similar) to the configure line. This has been resolved in gsdl-2.38.
 * If the make fails when compiling a file called "display.cpp", you need to work around a compiler bug. This only seems to occur when "-O2" is part of the compiler flags, so changing this to "-O1" or removing it from the flags will work. This seems to have been fixed with version 10.1 and later of the developer tools.
 * "There is a known bug in the version of gcc shipped with MacOS X 10.2." Version 2.38 of Greenstone fails to build the pdftohtml converter, with the error message:

   FontFile.h:27: storage size of `_ZTI8FontFile' isn't known
   FontFile.h:46: storage size of `_ZTI13Type1FontFile' isn't known
   FontFile.h:67: storage size of `_ZTI14Type1CFontFile' isn't known
   FontFile.h:144: storage size of `_ZTI16TrueTypeFontFile' isn't known

   The work-around is to remove all lines with "#pragma" from the source code of the pdftohtml package. See the bug report on pdftohtml's site. This will be fixed in Greenstone version 2.39 and later.
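The two source-level work-arounds in this list (lowering "-O2" to "-O1", and removing "#pragma" lines) can be applied with sed. The helpers below are a sketch with made-up names; which Makefile and which pdftohtml source files to pass in depends on your tree and is left to the reader.

```shell
# lower_optimisation MAKEFILE
# Replace every -O2 in MAKEFILE with -O1.
# (Hypothetical helper, not shipped with Greenstone.)
lower_optimisation() {
    sed 's/-O2/-O1/g' "$1" > "$1.tmp.$$" && mv "$1.tmp.$$" "$1"
}

# strip_pragmas FILE...
# Delete every line containing "#pragma" from each FILE.
# (Hypothetical helper, not shipped with Greenstone.)
strip_pragmas() {
    for src in "$@"; do
        sed '/#pragma/d' "$src" > "$src.tmp.$$" && mv "$src.tmp.$$" "$src" || return 1
    done
}

# Example (paths are placeholders for files in your own tree):
#   lower_optimisation path/to/Makefile
#   strip_pragmas path/to/pdftohtml/*.h path/to/pdftohtml/*.cc
```

Both helpers rewrite files in place, so it is worth keeping a copy of the originals.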

GNU/Linux
As mentioned above, version 2.38 of Greenstone compiles with gcc3. If you are using an earlier version of Greenstone, you will need to make sure you are using an older version of gcc, and not gcc2.96 or gcc3. Red Hat >= 7.0 comes with the newer versions of gcc by default.

You will need the gdbm.h header file. If you don't already have it, it is in the libgdbmg1-dev (Debian) or gdbm-devel-1.8.0 (RPM) package.

We have had a couple of reports from SuSE users that one of our third-party packages (wget) fails to configure as it uses GNU msgfmt for translation catalogues - this is resolved by installing the gettext package. This seems to be already installed on other distributions.

Alpha architectures: Well, the good news is that it compiles OK (for versions of gsdl >= 2.37 - earlier versions need updated config.sub files), the bad news is that it fails to build collections. One possibility is that mg (the backend code) doesn't like the 64-bit ints.

FreeBSD
Everything should go smoothly, although you might need to install libgdbm if it is not already installed.

Solaris
We no longer have Solaris machines running in our department. However, Greenstone built and ran the last time I could log on (about version 2.33?), and we have reports of people getting current versions working, with minor changes.

Greenstone includes a Perl module from CPAN, XML::Parser (+Expat), which has caused problems on some machines during the make. This is because Perl uses its own config file to get compiler settings, etc. We think we have worked around this. If you get compilation problems, you can do one of the following:
 * 1) After doing the top-level ./configure, edit gsdl/packages/cpan/XML-Parser-2.27/Makefile and XML-Parser-2.27/Expat and change CC = cc to CC = gcc (assuming you are using gcc). Or:
 * 2) You could install the perl XML::Parser module manually, and comment out or remove any mention of the cpan/XML-Parser-2.27 directory from gsdl/packages/Makefile(.in).
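Option 1 above can be done with sed instead of a text editor. This is a sketch: use_gcc is a made-up helper name, and it assumes the generated Makefile contains a line of exactly the form "CC = cc" (possibly with different spacing).

```shell
# use_gcc MAKEFILE...
# In each MAKEFILE, change a line reading "CC = cc" to "CC = gcc".
# Lines such as "CC = gcc" or "CCFLAGS = ..." are left alone.
# (Hypothetical helper, not shipped with Greenstone.)
use_gcc() {
    for mf in "$@"; do
        sed 's/^CC[[:space:]]*=[[:space:]]*cc[[:space:]]*$/CC = gcc/' "$mf" > "$mf.tmp.$$" &&
            mv "$mf.tmp.$$" "$mf" || return 1
    done
}

# Example, run after the top-level ./configure:
#   use_gcc gsdl/packages/cpan/XML-Parser-2.27/Makefile
```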

You might need to manually install the gdbm library, in which case add the --with-gdbm=<gdbm-dir> option to the configure command.

You probably need to use gcc:

 $ setenv CC gcc
 $ setenv LD gcc

You probably need to use GNU's make. Try setting the MAKE variable when you run the configure script, such as:

 $ MAKE=gmake ./configure [options]

or

 $ ./configure [options]
 ...
 $ gmake all
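Note that setenv is csh/tcsh syntax, whereas the MAKE=gmake ./configure form is Bourne-shell syntax. If your login shell is sh, ksh or bash, the equivalent of the two setenv commands would be something like:

```shell
# Bourne-shell (sh, ksh, bash) equivalent of the csh "setenv" commands.
CC=gcc
LD=gcc
export CC LD
echo "CC=$CC LD=$LD"
```

The exported variables are then visible to ./configure and gmake when they are run from the same shell.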

Here is a summary of installing Greenstone 2.70 on Solaris, kindly provided by Courtney Grimland.

Windows (Visual C++)
The third-party packages (pdftohtml, wvWare, rtftohtml, xlhtml, etc) were compiled using cygwin.

64 bit processors
Some notes by David Bainbridge:

Two of the three indexers Greenstone can use (MG and MGPP) are known to only work on a 32-bit architecture. To get Greenstone 3 to run on a 64-bit machine my approach was to use the "-m32" flag (which gets gcc/g++ to generate code that can run on a 64-bit machine, but uses the data type sizes of a 32-bit architecture). Anything that links to the MG/MGPP libraries needs to have the -m32 flag on, which includes GDBM. The necessary mods should already be in the code, so one thing to check is that your Greenstone is compiling programs like mg/mgpp and gdbm with this flag set (it might be that the test I added to ./configure wasn't general enough). Something else to look at is the version of the compiler you're using. I did this work around a year ago with a gcc 3.x series install. It's possible you have a 4.x compiler.
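One way to make that check is to save the build output and look for compiler lines that lack the flag. The helper below is a sketch with a made-up name; it assumes the log contains the compiler command lines as echoed by make, each beginning with gcc or g++.

```shell
# missing_m32 LOGFILE
# Print the gcc/g++ command lines in LOGFILE that do not contain -m32.
# Prints nothing if every compiler invocation has the flag.
# (Hypothetical helper, not shipped with Greenstone.)
missing_m32() {
    grep -E '^(gcc|g\+\+) ' "$1" | grep -v -e '-m32' || true
}

# Example:
#   make all > build.log 2>&1
#   missing_m32 build.log
```

Any mg, mgpp or gdbm compile line that shows up in the output is a candidate for the configure test not having taken effect.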

One final thing to mention is Java. Having messed around with -m32 flags, I'm pretty sure the final thing I had to do to get things off the ground was to use a 32-bit version of Java. I know, sounds pretty crazy, but Greenstone 3 uses JNI to interface to the compiled libraries for mg and mgpp, and the installed Java wasn't happy about this at all (some fairly obscure error messages, which I finally tracked down to this). Even Java's "32-bit mode" flag wasn't enough. Recompiling all the code again with a 32-bit Java was my way around this.

I'm not particularly satisfied with the approach we've come up with, which is why we haven't advertised it. We'd very much like to support a 100% Java only runtime for Greenstone 3, which means it could be fully 64-bit compliant. This would mean only using Lucene as the indexer and something like Java DBM (rather than GDBM) for the database.

Notes about trying to compile as 64 bits
John has tried once to compile Greenstone on a 64-bit processor (Opteron, AMD64) running 64-bit Ubuntu. He reported several problems, which are listed below. These have yet to be resolved.

1. When trying to build anything using mg (i.e. mg_passes) I'm suffering a seg fault somewhere deep in the malloc code (mallopt). From what I can tell the mg stuff uses some library wrapper (gsdl/packages/mg/lib/memlib) which has some prototypes which override the stdlib.h ones. If I let them do this, I believe malloc might be using 32-bit ints to hold pointers (which are 64-bit under the X86_64 architecture). If I prevent memlib overriding by setting the STD_MEM precompiler flag I get a bazillion warnings - and it still seg faults in the same place (again implying a pointer != int problem).

Below is a dump of the error messages I encountered when trying to debug the malloc problem.

 (gdb) file /var/www/projects/john/gsdl/bin/linux/mg_passes
 (gdb) run -D -f /var/www/projects/john/gsdl/collect/demo/building/text/demo -b 12000 -T1 -M 4 -d / < /tmp/output.txt
 (gdb) backtrace
 #0 0x0000002a95c7bab6 in mallopt () from /lib/libc.so.6
 #1 0x0000002a95c7aa63 in malloc () from /lib/libc.so.6
 #2 0x0000000000402bf5 in process_text_1 ()
 #3 0x0000000000401fb1 in driver (in_fd=0, Trace=0x0,
    file_name=0x7fbffffb5d "/var/www/projects/john/gsdl/collect/demo/building/text/demo") at mg_passes.c:336
 #4 0x0000000000402617 in main (argc=11, argv=0x7fbffff918) at mg_passes.c:626

2. When trying to search a collection which was imported/built using mgpp I get this cool error message: Couldn't load index information for "idx" and no results. Strangely enough browsing works just fine. I tracked the problem back to the loading of the index files in (gsdl/src/mgpp/text/IndexData.c) and in particular these lines

 // blocked dictionary ...
 fseek (dictFile, bdh.wblk_idx_start, SEEK_SET);
 if (!ReadBlockIdx (dictFile, biWords)) {
   UnloadData ();
   return false;
 }
 ...

This always returns false. So the fseek to the start of the dictionary is failing. Weird eh? Any ideas? Currently I suspect a bogus pointer to int cast somewhere in the mgpp code during import/building is resulting in an equally bogus index file.

3. So... believing I didn't have quite enough problems to sort out I attempted to fix the second problem and thus created another wonderful problem. Whenever I browse to my custom collection I get the "Opps! An error occurred..." page - but with no error message. However if I run the binary from the cgi-bin it works just peachy. It sounds like a permissions problem - but it isn't. I've triple checked all the permissions. Running the plain-jane greenstone library, then going to the custom collection works just fine (bar the searching problem outlined above).