Explaining Plugins

[This was written for 2.80 and earlier. It's a little out of date for 2.81.]

An outline of program flow when using import.pl for developers writing their own plugins:

import.pl calls the methods begin, read and then end. Reading starts at the import directory.

RecPlug handles directories: it looks through a directory to see what files are there. These files are passed to the plugin pipeline, first through metadata_read, then through read.


 * The metadata_read method is only called from RecPlug (and MetadataCSVPlug).

All plugins inherit from BasPlug.
 * BasPlug implements the metadata_read and read methods.
 * BasPlug read calls the process method.

Most plugins call the BasPlug read method, then do the format-specific work in their own process method.
 * Some plugins override read.

Plugins can implement either read or process (or both). (Note to self - give examples)
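The flow described above can be sketched in a few lines. This is a language-neutral illustration in Python, not Greenstone's Perl code: the class names mirror BasPlug and RecPlug, but the method signatures, the can_handle hook, the TextPlug example and the run_import helper are all assumptions made for this sketch.

```python
import os
import tempfile


class BasPlug:
    """Toy stand-in for the base plugin: read() delegates to process()."""

    def __init__(self):
        self.log = []

    def begin(self):
        self.log.append("begin")

    def end(self):
        self.log.append("end")

    def can_handle(self, path):
        # Hypothetical hook; real plugins decide this in their own way.
        return False

    def metadata_read(self, path):
        # Default: this plugin extracts no metadata in the first pass.
        return None

    def read(self, path):
        # Default read: if the file is ours, hand it to process().
        if not self.can_handle(path):
            return False
        self.log.append("read " + os.path.basename(path))
        self.process(path)
        return True

    def process(self, path):
        raise NotImplementedError


class TextPlug(BasPlug):
    """Hypothetical document plugin that only implements process()."""

    def can_handle(self, path):
        return path.endswith(".txt")

    def process(self, path):
        self.log.append("process " + os.path.basename(path))


def rec_plug_read(directory, pipeline):
    """Toy RecPlug: a metadata_read pass first, then a read pass."""
    files = sorted(os.path.join(directory, n) for n in os.listdir(directory))
    for path in files:
        for plugin in pipeline:
            plugin.metadata_read(path)
    for path in files:
        # Assumption: the first plugin that accepts a file ends the search.
        for plugin in pipeline:
            if plugin.read(path):
                break


def run_import(import_dir, pipeline):
    """Toy import.pl: begin, read, end."""
    for plugin in pipeline:
        plugin.begin()
    rec_plug_read(import_dir, pipeline)
    for plugin in pipeline:
        plugin.end()


def demo():
    with tempfile.TemporaryDirectory() as import_dir:
        for name in ("a.txt", "b.dat"):
            with open(os.path.join(import_dir, name), "w") as handle:
                handle.write("example\n")
        plugin = TextPlug()
        run_import(import_dir, [plugin])
        return plugin.log
```

Calling demo() builds a temporary import directory with one file the toy plugin accepts and one it ignores, then returns the plugin's call log, which makes the begin, read, process, end ordering visible.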

Types of Plugin
There are two types, metadata plugins and document plugins.
 * The system currently makes no distinction between them.
 * All plugins process files:
 * some group several files into one document,
 * some split one file into several documents, and
 * some have a one-to-one mapping.


 * Plugins that split files up generally inherit from SplitPlug.
 * Plugins that process XML files generally inherit from XMLPlug. (They don't need to, it just avoids rewriting all the same code.)

Operational summaries of example plugins
For basic information about plugins see Plugins.

Another good section is http://greenstone.sourceforge.net/wiki/index.php/More_about_plugins#How_do_I_get_my_XML_files_into_Greenstone.3F

MetadataCSVPlugin
 * takes comma-separated (.csv) files and extracts metadata from them (using the metadata_read method)
 * assigns that metadata to the documents, which are then processed by their normal plugin.
 * The first line is a list of metadata names
 * subsequent lines, one per record, contain the values.
 * Requires a filename field which contains the file name of the document to which the record metadata will be assigned.

So the contents of a csv file containing 2 records would look as follows (the first line contains the field names):

    Filename,ex.dc.title,ex.dc.subject
    file1.jpg,example title,example subject
    file2.jpg,example title2,example subject2

For example:

    Filename,ex.dc.Description,ex.dc.Creator,ex.dc.Subject
    ieee.png,copyright policy of IEEE,Jane Smith,copyright
    university-coauthorship.png, locations of international co-authors, University of Waikato, authorship

 * Use the ex.* prefix before the namespaced metadata names, since Greenstone treats them as metadata it has extracted. This metadata will appear in GLI's Enrich pane (with the ex.* prefix), but will not be editable there. If you wish to edit it, you need to edit the source CSV file itself and reimport the file.
 * If you experience any difficulties when building, try moving the MetadataCSVPlugin above the ExcelPlugin in the Document Plugins list.
 * Further information on the MetadataCSVPlugin can be found on the http://wiki.greenstone.org/wiki/index.php/MetadataCSVPlug_notes page.
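As a rough illustration of what the metadata_read pass has to do with such a file, the sketch below reads the first example CSV above and builds a mapping from each Filename value to that record's metadata. It is written in Python with the standard csv module and is not the plugin's actual Perl code; the function name read_csv_metadata and the returned dictionary layout are assumptions made for this example.

```python
import csv
import io

# The first example CSV from the text above.
SAMPLE = (
    "Filename,ex.dc.title,ex.dc.subject\n"
    "file1.jpg,example title,example subject\n"
    "file2.jpg,example title2,example subject2\n"
)


def read_csv_metadata(text):
    """Map each Filename value to the metadata given in its record."""
    metadata_by_file = {}
    # DictReader takes the first line as the list of metadata names.
    reader = csv.DictReader(io.StringIO(text))
    for record in reader:
        filename = record.pop("Filename").strip()
        fields = {}
        for name, value in record.items():
            # Short rows give None values; blank cells assign nothing.
            if name and value and value.strip():
                fields[name] = value.strip()
        metadata_by_file[filename] = fields
    return metadata_by_file
```

Values are stripped of surrounding spaces here, which is a choice made for the sketch (the second example above has spaces after its commas); whether the real plugin does the same is not stated in this page.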

ImagePlugin
 * uses the ImageMagick utilities to
 * create derivatives (thumbnail images), and
 * extract image metadata (width, height, format)
 * ImagePlug can easily be extended to extract more extensive image metadata if required

ReferPlugin
 * takes Refer format bibliographies and reads them in (using the process method)
 * assigns metadata with the add_utf8_metadata or add_metadata methods
 * assigns text with the add_text method
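To make the "one file, several documents" idea concrete, here is a minimal Python sketch of reading a Refer bibliography: records are separated by blank lines, and each field line starts with % followed by a one-letter key (for instance %A for an author, %T for a title). This is not ReferPlugin's code. The sample records are made up for illustration, parse_refer is a hypothetical name, and continuation lines are ignored to keep the sketch short.

```python
# Made-up sample: two records separated by a blank line.
SAMPLE = """%A Jane Smith
%A John Doe
%T An example title
%D 2001

%A Ann Author
%T Another title
"""


def parse_refer(text):
    """Split Refer text into records; each record maps key -> list of values."""
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current record, if there is one.
            if current:
                records.append(current)
                current = {}
            continue
        if line.startswith("%") and len(line) >= 2:
            key, value = line[1], line[2:].strip()
            # Keys such as %A may repeat, so collect values in a list.
            current.setdefault(key, []).append(value)
    if current:
        records.append(current)
    return records
```

In a plugin, each returned record would become one document, with every key and value pair passed to one of the metadata methods named above.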

NOTE on methods - order called

 * metadata_read: the first to be called, usually by RecPlug but also by MetadataCSVPlug
 * in RecPlug, Greenstone metadata.xml files are read by the metadata_read method
 * in MetadataCSVPlug, a .csv text file whose first line contains the field names is read by metadata_read
 * read: called after metadata_read
 * process: called last, from within read (BasPlug read calls process; a plugin that overrides read may behave differently)

Adding metadata
 * add_utf8_metadata adds metadata that is already in utf8
 * add_metadata converts to utf8 before adding metadata that is not already in utf8
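The difference between the two methods can be illustrated with a small Python model. It is only a sketch of the idea, not Greenstone's implementation: the Doc class, the byte-string storage and the encoding parameter (defaulting to Latin-1) are assumptions made for this example.

```python
class Doc:
    """Toy document object that stores metadata as UTF-8 bytes."""

    def __init__(self):
        self.metadata = []

    def add_utf8_metadata(self, name, value):
        # The caller promises value is already UTF-8; store it unchanged.
        self.metadata.append((name, value))

    def add_metadata(self, name, value, encoding="latin-1"):
        # Hypothetical encoding parameter: convert to UTF-8, then store.
        utf8_value = value.decode(encoding).encode("utf-8")
        self.add_utf8_metadata(name, utf8_value)


def demo():
    doc = Doc()
    # Latin-1 bytes for "café" need converting first.
    doc.add_metadata("Title", b"caf\xe9")
    # UTF-8 bytes for "café" can be added directly.
    doc.add_utf8_metadata("Title", "caf\u00e9".encode("utf-8"))
    # Passing UTF-8 bytes through add_metadata converts them twice.
    doc.add_metadata("Title", "caf\u00e9".encode("utf-8"))
    return doc.metadata
```

In this model the first two calls store identical bytes, while the third stores a double-encoded value, which is why text that is already UTF-8 should go through add_utf8_metadata.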

- Thanks to Katherine Don for this text, which I have only edited slightly.