Overview of Greenstone

Part of the Greenstone Beginner's Guide

Greenstone is a suite of software for building and distributing digital library collections. It is not a digital library but a tool for building digital libraries. It provides a new way of organizing information and publishing it on the Internet in the form of a fully-searchable, metadata-driven digital library. It is a comprehensive system for constructing and presenting collections of thousands or millions of documents, including text, images, audio and video. It has been developed and distributed in cooperation with UNESCO and the Human Info NGO in Belgium. It is open-source, multilingual software, issued under the terms of the GNU General Public License.

Being open source, Greenstone is readily extensible, and benefits from the inclusion of GNU-licensed modules for full-text retrieval, database management, and text extraction from proprietary document formats. Only through international cooperative efforts will digital library software become sufficiently comprehensive to meet the world's needs with the richness and flexibility that users deserve.

Available Platforms

Greenstone runs on all versions of Windows, Unix/Linux, and Mac OS X. The main distribution of Greenstone is easy to install and requires no configuration.

Major Versions

There are two major versions of Greenstone: Greenstone2 and Greenstone3. Greenstone3 is a complete redesign and reimplementation of the original Greenstone digital library software (Greenstone2). It is the recommended version, as it will eventually become the production version of the digital library software the group supports. Learn more about the differences between the two versions here.

Much of the documentation is similar or identical for the two major versions. Where the versions diverge, the documentation presents a tabbed section; simply select the tab corresponding to the version you installed.

Documents

The plugins distributed with Greenstone can process a wide variety of document types, including:

  • plain text, Word, and PDF documents
  • HTML pages
  • PowerPoint presentations
  • Excel spreadsheets
  • images
  • MARC records
  • and more!

In addition, new plugins can be written for different document types. Non-textual material can either be linked to textual documents or accompanied by textual descriptions (such as figure captions) to allow full-text searching and browsing.

Unicode, which is a standard scheme for representing the character sets used in the world's languages, is used throughout Greenstone. This allows documents in any language to be processed and displayed in a consistent manner. Collections have been built containing Arabic, Chinese, English, French, Māori and Spanish. In addition, the Greenstone interface is available in all the above languages (and more).
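As a small illustration of why a single Unicode representation helps, the following Python sketch round-trips text in several of the languages mentioned above through UTF-8 and normalizes two equivalent spellings of "Māori" to the same form. It is not Greenstone code: the sample strings and the function names (round_trip, utf8_sizes, normalize) are assumptions made for this example, and nothing here is meant to describe how Greenstone handles text internally.

```python
import unicodedata

# Illustrative sample strings: each language's own name (not taken from Greenstone).
SAMPLES = {
    "Arabic": "العربية",
    "Chinese": "中文",
    "English": "English",
    "French": "Français",
    "Māori": "Māori",
    "Spanish": "Español",
}


def round_trip(text: str) -> str:
    """Encode text as UTF-8 bytes and decode it back to a string."""
    return text.encode("utf-8").decode("utf-8")


def utf8_sizes(strings: dict) -> dict:
    """Return the number of UTF-8 bytes needed for each string."""
    return {name: len(value.encode("utf-8")) for name, value in strings.items()}


def normalize(text: str) -> str:
    """Convert text to Unicode Normalization Form C (composed characters)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    for name, value in SAMPLES.items():
        # Every script survives the same encode/decode path unchanged.
        assert round_trip(value) == value
        print(name, value, utf8_sizes(SAMPLES)[name], "bytes")

    # "a" followed by a combining macron normalizes to the single character "ā".
    print(normalize("Ma\u0304ori") == "M\u0101ori")
```

Because every script goes through the same encode and decode path, the code needs no per-language special cases, which is the practical benefit the paragraph above describes.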

Collections

Collections can be accessed through both searching and browsing. When you create Greenstone collections, you can tailor the searching and browsing facilities.
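The difference between the two access methods can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Greenstone's implementation: the sample documents, the field names, and the functions build_index, search, and browse_by are all hypothetical. Searching is modelled as a lookup in a full-text index, and browsing as a list of documents grouped by a metadata field.

```python
from collections import defaultdict

# Hypothetical sample collection: each document has metadata (title) and text.
DOCUMENTS = [
    {"id": "d1", "title": "Apple growing", "text": "apple trees need water"},
    {"id": "d2", "title": "Bee keeping", "text": "bees need flowers and water"},
    {"id": "d3", "title": "Apricot drying", "text": "dry apricot halves in the sun"},
]


def build_index(docs):
    """Build a full-text index mapping each word to the ids of documents containing it."""
    index = defaultdict(set)
    for doc in docs:
        for word in doc["text"].lower().split():
            index[word].add(doc["id"])
    return index


def search(index, query):
    """Return the ids of documents that contain every word in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def browse_by(docs, field):
    """Group document ids by the first letter of a metadata field, in sorted order."""
    groups = defaultdict(list)
    for doc in sorted(docs, key=lambda d: d[field]):
        groups[doc[field][0].upper()].append(doc["id"])
    return dict(groups)


if __name__ == "__main__":
    index = build_index(DOCUMENTS)
    print(sorted(search(index, "need water")))
    print(browse_by(DOCUMENTS, "title"))
```

Tailoring a collection, in this simplified picture, amounts to choosing which text is indexed for searching and which metadata fields are offered for browsing.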

Distribution

Collections are accessed over the Internet or published, in precisely the same form, on a self-installing Windows CD-ROM. Compression is used to compact the text and indexes.

Additional Resources