Browsing Classifiers

The information used to support browsing is stored in the collection information database, and is placed there by classifiers that are called during the final phase of buildcol.pl.

Classifiers, like plugins, are specified in a collection's configuration file. Each one has a line that starts with the keyword classify, followed by the name of the classifier and any options it takes. The basic collection configuration file discussed in Section configuration_file includes the line classify AZList -metadata Title, which makes an alphabetic list of titles by taking all documents that have a Title metadata field, sorting them and splitting them into alphabetic ranges. An example is shown in Figure <imgref figure_azlist_classifier>.
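The sort-and-split behaviour described for AZList can be illustrated with a short sketch. This is written in Python purely for illustration and is not Greenstone's Perl implementation; the function names az_ranges and range_label and the min_size threshold are assumptions introduced here, and the real classifier may choose its ranges differently.

```python
def range_label(letters):
    """Label a range by its first and last letter, e.g. 'A' or 'B-C'."""
    return letters[0] if len(letters) == 1 else f"{letters[0]}-{letters[-1]}"


def az_ranges(titles, min_size=2):
    """Sort titles and split them into alphabetic ranges.

    Adjacent letters are merged until a range holds at least `min_size`
    titles. `min_size` is a hypothetical threshold chosen for this sketch.
    Titles are assumed to be non-empty strings.
    """
    ordered = sorted(titles, key=str.casefold)  # case-insensitive sort
    ranges = []                # list of (label, [titles in that range])
    letters, bucket = [], []   # letters and titles of the range being built
    for title in ordered:
        letter = title[0].upper()
        # Close the current range when a new letter starts and the
        # range is already large enough.
        if bucket and letter not in letters and len(bucket) >= min_size:
            ranges.append((range_label(letters), bucket))
            letters, bucket = [], []
        if letter not in letters:
            letters.append(letter)
        bucket.append(title)
    if bucket:
        ranges.append((range_label(letters), bucket))
    return ranges
```

For example, az_ranges(["Zebra care", "apple growing", "Bees", "Ants", "Crops"]) yields three ranges labelled A, B-C and Z.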

The lines used to specify classifiers in collection configuration files contain a metadata argument that identifies the metadata by which the documents are classified and sorted. Any document in the collection that does not have this metadata defined will be omitted from the classifier (but it is still indexed, and consequently searchable). If no metadata argument is specified, all documents are included in the classifier, in the order in which they are encountered during the building process. This is useful if you want a list of all documents in your collection.
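The effect of the metadata argument on which documents appear can be sketched as follows. This is an illustrative Python sketch, not Greenstone code; the function name select_documents and the representation of a document as an (OID, metadata dictionary) pair are assumptions made for the example.

```python
def select_documents(docs, metadata=None):
    """Return the OIDs that a classifier would list.

    docs     -- list of (oid, metadata_dict) pairs, in the order in which
                the documents were encountered during building
    metadata -- name of the metadata field to classify and sort by,
                or None if no metadata argument was given
    """
    if metadata is None:
        # No metadata argument: every document, in build order.
        return [oid for oid, _ in docs]
    # Documents lacking the field are omitted from the classifier
    # (they would still be indexed and searchable).
    present = [(meta[metadata], oid) for oid, meta in docs if metadata in meta]
    return [oid for _, oid in sorted(present)]
```

With three documents of which only the first and third have a Title, select_documents(docs, "Title") returns just those two, sorted by title, while select_documents(docs) returns all three in build order.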

The classinfo.pl program gives information about any classifier and the options it provides. These options can also be viewed on the classifiers' individual pages, which can be reached from the list of classifiers.

Collection-specific classifiers can be written, and are stored in the collection's perllib/classify directory. The Development Library has a collection-specific classifier called HDLList, which is a minor variant of AZList.

How classifiers work

Classifiers are Perl objects, derived from BasClas.pm, and are stored in the perllib/classify directory. They are used when the collection is built. When they are executed, the following four steps occur.

  1. The new method creates the classifier object.
  2. The init method initialises the object with parameters such as metadata type, button name and sort criterion.
  3. The classify method is invoked once for each document, and stores information about the classification made within the classifier object.
  4. The get_classify_info method returns the locally stored classification information to the build process, which then writes it to the collection information database for use when the collection is displayed at runtime.

The classify method retrieves each document's OID, the metadata value on which the document is to be classified, and, where necessary, the metadata value on which the documents are to be sorted. The get_classify_info method performs all sorting and classifier-specific processing. For example, in the case of the AZList classifier, it splits the list into ranges.
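The four steps can be pictured with a minimal sketch. It is written in Python for illustration only; real classifiers are Perl objects derived from BasClas.pm. The class name SimpleListClassifier, the build helper, and the shape of the returned dictionary are assumptions made for this example and do not reflect the actual data structure passed to the build process.

```python
class SimpleListClassifier:
    """Illustrative stand-in for a classifier object."""

    def __init__(self):
        # Step 1 ("new"): create the classifier object.
        self.metadata = None
        self.buttonname = None
        self.sort = None
        self.entries = []   # (sort_key, oid) pairs collected by classify()

    def init(self, metadata, buttonname=None, sort=None):
        # Step 2: set parameters such as metadata type, button name
        # and sort criterion.
        self.metadata = metadata
        self.buttonname = buttonname or metadata
        self.sort = sort
        self.entries = []

    def classify(self, oid, doc_metadata):
        # Step 3: called once per document; only records information.
        if self.metadata not in doc_metadata:
            return  # document lacks the metadata, so it is left out
        value = doc_metadata[self.metadata]
        sort_key = doc_metadata.get(self.sort, value) if self.sort else value
        self.entries.append((sort_key, oid))

    def get_classify_info(self):
        # Step 4: all sorting happens here, then the result is returned.
        ordered = sorted(self.entries)
        return {"title": self.buttonname,
                "contains": [{"OID": oid} for _, oid in ordered]}


def build(classifier, docs):
    """Hypothetical driver: feed every document, then collect the result."""
    for oid, doc_metadata in docs:
        classifier.classify(oid, doc_metadata)
    return classifier.get_classify_info()
```

Keeping classify cheap and deferring sorting to get_classify_info mirrors the division of labour described above: per-document work only gathers values, and range splitting or ordering is done once, after all documents have been seen.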

The build process initialises the classifiers as soon as the builder object is created. Classifications are created during the build phase, when the information database is created, by classify.pm, which resides in Greenstone's perllib directory.

Dynamic classifiers

Dynamic classifiers are currently only available for Greenstone 2. If a collection uses sqlite or MSSQL as the collection database, it can provide dynamic classifiers. These are generated at runtime, so they do not need to be rebuilt every time documents are added. kjdon-TODO: finish this section.
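The general idea of generating a browsing list at runtime from a database can be sketched as follows. This is only an illustration using Python's standard sqlite3 module; the table name document_metadata, its columns, and the function names are hypothetical and are not taken from Greenstone's actual database schema or code.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical metadata table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE document_metadata (docoid TEXT, element TEXT, value TEXT)"
    )
    return conn


def add_document(conn, oid, metadata):
    """Store one row per metadata field of a newly added document."""
    conn.executemany(
        "INSERT INTO document_metadata (docoid, element, value) VALUES (?, ?, ?)",
        [(oid, element, value) for element, value in metadata.items()],
    )


def browse(conn, element):
    """Build the browsing list at request time with a sorted query."""
    rows = conn.execute(
        "SELECT value, docoid FROM document_metadata "
        "WHERE element = ? ORDER BY value COLLATE NOCASE, docoid",
        (element,),
    )
    return [(value, oid) for value, oid in rows]
```

Because browse queries the database each time it is called, a document added with add_document appears in the next listing without any separate classifier rebuild step.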

en/user_advanced/classifiers.txt · Last modified: 2023/03/13 01:46 by 127.0.0.1