====== Plugins ======

When building collections, Greenstone processes each format of source document by looking for a "plugin" that can handle that format. Plugins are specified in the collection configuration file. Greenstone generally uses the filename to determine the document format: for example, ''foo.txt'' is processed as a text file, ''foo.html'' as HTML, and ''foo.doc'' as a Word file.

Plugins parse the imported documents and extract metadata from them. For example, the HTMLPlugin converts HTML pages to the Greenstone Archive Format and extracts metadata that is explicit in the document format, such as titles, which are enclosed by //<title>// tags.

While all plugins process files,
  * some group several files into one document,
  * some split one file into several documents (also called [[en:filetype:metadata_database_files#Exploding Metadata Files|'exploding']]), and
  * some have a one-to-one mapping.

Greenstone includes a wide array of plugins. However, if you need to process document formats not handled by existing plugins, format documents in some special way, or extract a new kind of metadata, you can develop new plugins.

===== Managing Plugins in the GLI =====

Plugins can be managed from the **Document Plugins** section of the **Design panel**. When you create a collection based on "New Collection", the //Assigned Plugins// list will by default include the commonly used plugins (e.g. HTMLPlugin, WordPlugin, PDFPlugin). If your collection will not include any document types processed by these plugins, you can remove them by selecting the plugin and clicking the **Remove Plugin** button. For instance, if there are no PDFs in your collection, you can remove the PDFPlugin. However, **GreenstoneXMLPlugin** is a special plugin that should not be removed unless you are changing the archive format.
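As a rough mental model of the filename-based matching described at the top of this page, the following Python sketch walks an ordered plugin list and lets the first matching entry claim a file. It is an illustration only, not Greenstone's implementation: the extension patterns, the ''choose_plugin'' function, and the contents of the list are assumptions made for this example.

```python
import re

# Hypothetical, simplified patterns for illustration only. Greenstone's real
# plugins are not written in Python and apply their own matching rules.
ASSIGNED_PLUGINS = [
    ("HTMLPlugin", re.compile(r"\.html?$", re.IGNORECASE)),
    ("WordPlugin", re.compile(r"\.doc$", re.IGNORECASE)),
    ("PDFPlugin", re.compile(r"\.pdf$", re.IGNORECASE)),
    ("TextPlugin", re.compile(r"\.txt$", re.IGNORECASE)),
]


def choose_plugin(filename, plugins=ASSIGNED_PLUGINS):
    """Return the name of the first plugin whose pattern matches, else None."""
    for name, pattern in plugins:
        if pattern.search(filename):
            return name  # the first match wins; later entries are not tried
    return None


if __name__ == "__main__":
    for f in ["foo.txt", "foo.html", "foo.doc", "foo.xyz"]:
        print(f, "->", choose_plugin(f))
```

In this sketch, removing an entry from the list means files of that type are no longer claimed by any plugin, and reordering entries changes which plugin handles a file that more than one entry could match.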
If you are in Expert mode, you will also see three plugins at the bottom of the list:
  * **MetadataXMLPlugin**
  * **ArchivesInfPlugin**
  * **DirectoryPlugin**
These can be configured, but cannot be removed.

Plugins are processed in the order they appear in the list. So, if a document can be processed by more than one plugin in the //Assigned Plugins// list, it will be processed by the first one.

===== Plugins on the commandline =====

To find out more about any plugin, type //pluginfo.pl plugin-name// at the command prompt. You need to invoke the appropriate //setup// script first, if you haven't already. On Windows, type //perl -S pluginfo.pl plugin-name// if your environment is not set up to associate files ending in //.pl// with Perl. This displays information about the plugin on the screen: the plugin-specific options it takes and the general options that are allowed.

For example, to display the options for the PDF plugin: ''perl -S pluginfo.pl PDFPlugin''

===== Additional Resources =====

  * Greenstone includes a wide array of plugins, which are [[en:plugin:index|listed in full here]].
  * Visit the [[en:user:document_types|document types page]] to determine which plugin(s) are used for the types of documents in your collection.
  * There is a [[en:developer:plugins|developer's plugin]] page, which provides technical information about plugins and how they work.