====== Plugins ======
When building collections, Greenstone processes each different format of source
document by seeking a “plugin” that can deal with that particular format.
Plugins are specified in the collection configuration file. Greenstone
generally uses the filename to determine document formats: for example,
''foo.txt'' is processed as a text file, ''foo.html'' as HTML, and ''foo.doc'' as a Word file.
Plugins parse the imported documents and extract metadata from them.
For example, the HTMLPlugin converts HTML pages to the Greenstone Archive Format
and extracts metadata that is explicit in the document format, such as titles
enclosed by //<title>// tags.
While all plugins process files,
* some group several files into one document,
* some split one file into several documents (also called [[en:filetype:metadata_database_files#Exploding Metadata Files|'exploding']]), and
* some have a one-to-one mapping.
Greenstone includes a wide array of plugins. However, if you need to process document
formats not handled by existing plugins, format documents in some special way,
or extract a new kind of metadata, you can develop new plugins.
===== Managing Plugins in the GLI =====
Plugins can be managed from the **Document Plugins** section of the **Design panel**. When
you create a collection based on "New Collection", the //Assigned Plugins// list includes the
commonly used plugins by default (e.g. HTMLPlugin, WordPlugin, PDFPlugin). If your collection
does not include any document types processed by a given plugin, you can remove it by selecting
the plugin and clicking the **Remove Plugin** button. For instance, if there are no PDFs in your
collection, you can remove the PDFPlugin. However, **GreenstoneXMLPlugin** is a special plugin that
should not be removed unless you are changing the archive format.
If you are in Expert mode, you will also see three plugins at the bottom of the list:
* **MetadataXMLPlugin**
* **ArchivesInfPlugin**
* **DirectoryPlugin**
which can be configured, but cannot be removed.
Plugins are processed in the order they appear in the list. So, if a document can be
processed by more than one plugin in the Assigned Plugins list, it will be processed by the
first one.
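The first-match rule above can be illustrated with a minimal Python sketch. This is not Greenstone's own code: the ''choose_plugin'' function and the filename patterns are hypothetical simplifications chosen for illustration, and the real plugins decide which files they accept in their own way.

```python
import re

# Hypothetical, simplified stand-in for an Assigned Plugins list:
# (plugin name, filename pattern) pairs, in list order.
# These patterns are illustrative assumptions, not Greenstone's defaults.
ASSIGNED_PLUGINS = [
    ("HTMLPlugin", re.compile(r"\.html?$", re.IGNORECASE)),
    ("WordPlugin", re.compile(r"\.doc$", re.IGNORECASE)),
    ("PDFPlugin", re.compile(r"\.pdf$", re.IGNORECASE)),
    # Deliberately overlaps with HTMLPlugin to show the ordering effect.
    ("TextPlugin", re.compile(r"\.(txt|html?)$", re.IGNORECASE)),
]


def choose_plugin(filename, plugins=ASSIGNED_PLUGINS):
    """Return the name of the first plugin whose pattern matches, or None."""
    for name, pattern in plugins:
        if pattern.search(filename):
            return name
    return None
```

With this list, ''choose_plugin("foo.html")'' selects HTMLPlugin even though the TextPlugin pattern would also match, because HTMLPlugin appears earlier. Reversing the list would hand the same file to TextPlugin instead.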
===== Plugins on the command line =====
To find out more about any plugin,
just type //pluginfo.pl plugin-name// at the command prompt.
(You need to invoke the appropriate //setup// script first,
if you haven't already, and on Windows you need to type
//perl -S pluginfo.pl plugin-name// if your environment
is not set up to associate files ending in //.pl// with Perl.)
This displays information about the plugin on the screen: the plugin-specific
options it takes and the general options that are allowed.
For example, after setting up your environment for Greenstone, run
//pluginfo.pl// on the PDFPlugin:
  perl -S pluginfo.pl PDFPlugin
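If you want to look up several plugins from a script, the same command can be wrapped in a small Python helper. This is only a sketch under stated assumptions: the function names ''pluginfo_command'' and ''show_plugin_info'' are made up for this example, and it assumes the Greenstone //setup// script has already been run in the shell that launches Python, so that ''perl -S'' can find //pluginfo.pl//.

```python
import subprocess


def pluginfo_command(plugin_name, via_perl=True):
    """Build the argument list for pluginfo.pl (hypothetical helper).

    via_perl=True yields `perl -S pluginfo.pl NAME`, the form needed when
    .pl files are not associated with Perl; otherwise pluginfo.pl is
    called directly.
    """
    if via_perl:
        return ["perl", "-S", "pluginfo.pl", plugin_name]
    return ["pluginfo.pl", plugin_name]


def show_plugin_info(plugin_name, via_perl=True):
    """Run pluginfo.pl and return whatever it printed as text.

    Assumes the Greenstone environment is already set up. Standard output
    and standard error are merged so the text is captured either way.
    """
    result = subprocess.run(
        pluginfo_command(plugin_name, via_perl),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )
    return result.stdout
```

For instance, ''print(show_plugin_info("PDFPlugin"))'' runs the same command as shown above and prints its output.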
===== Additional Resources =====
* Greenstone includes a wide array of plugins, which are [[en:plugin:index|listed in full here]].
* Visit the [[en:user:document_types|document types page]] to determine which plugin(s) are used for the types of documents in your collection.
* There is a [[en:developer:plugins|developer's plugin]] page, which provides technical information about plugins and how they work.