
en:user_advanced:classifiers [2014/11/03 10:36] (current)
  
 +====== Browsing Classifiers ======
 +
 +The information used to support browsing is stored in the collection information 
 +database, and is placed there by classifiers that are called during the final phase of 
 +''buildcol.pl''.
 +
 +<!-- id:567 -->Classifiers, like plugins, are specified in a collection's configuration file. 
 +For each one there is a line starting with the keyword //classify//, followed by the name 
 +of the classifier and any options it takes. The basic collection configuration file discussed 
 +in Section [[#configuration_file|configuration_file]] includes the line 
 +//classify AZList -metadata Title//, which makes an alphabetic list of titles by taking all 
 +documents with a //Title// metadata field, sorting them and splitting them into alphabetic ranges. 
 +An example is shown in Figure <imgref figure_azlist_classifier>.
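
The behaviour described for //AZList// can be sketched as follows. This is a minimal illustration in Python, not Greenstone's Perl implementation: the function name ''az_ranges'', the ''max_per_range'' parameter, and the dictionary layout of each document are all assumptions made for the example, and the real classifier may choose its ranges differently.

```python
def az_ranges(docs, metadata="Title", max_per_range=3):
    """Sort documents by a metadata value and split them into alphabetic ranges.

    Hypothetical sketch: each document is a dict with an "oid" key and
    optional metadata keys. Not the actual AZList algorithm.
    """
    # Keep only documents that define the metadata field.
    entries = [(d[metadata], d["oid"]) for d in docs if d.get(metadata)]
    # Sort case-insensitively by the metadata value.
    entries.sort(key=lambda e: e[0].lower())

    ranges = []  # each item: [first_letter, last_letter, [oids]]
    for value, oid in entries:
        letter = value[0].upper()
        last = ranges[-1] if ranges else None
        # Extend the current range if it ends on the same letter
        # or still has room; otherwise start a new range.
        if last and (last[1] == letter or len(last[2]) < max_per_range):
            last[1] = letter
            last[2].append(oid)
        else:
            ranges.append([letter, letter, [oid]])

    # Label each range "A" or "A-C".
    return [(a if a == b else f"{a}-{b}", oids) for a, b, oids in ranges]


docs = [
    {"oid": "D1", "Title": "Bees"},
    {"oid": "D2", "Title": "apples"},
    {"oid": "D3"},                      # no Title: left out of the list
    {"oid": "D4", "Title": "Cattle"},
    {"oid": "D5", "Title": "Zebu"},
    {"oid": "D6", "Title": "Ants"},
]
result = az_ranges(docs, max_per_range=2)
```

With these sample documents, ''D3'' is absent from the result because it has no //Title// value, and the remaining documents are grouped under the labels ''A'', ''B-C'' and ''Z''.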
 +
 +
 +<!-- id:574 -->The lines used to specify classifiers in collection configuration 
 +files usually contain a //metadata// argument that identifies the metadata by which the 
 +documents are classified and sorted. Any document in the collection that does not 
 +have this metadata defined is omitted from the classifier 
 +(but it is still indexed, and consequently searchable). 
 +If no //metadata// argument is specified, all documents are included in the classifier, 
 +in the order in which they are encountered during the building process. This is useful 
 +if you want a list of all documents in your collection.
 +
 +The //classinfo.pl// program reports information about any classifier, 
 +including the options it provides. These options are also listed on each classifier's individual
 +page, which can be reached from the [[en:classifier:index|list of classifiers]].
 +
 +
 +<!-- id:609 -->You can also write collection-specific classifiers, 
 +which are stored in the collection's //perllib/classify// directory. 
 +For example, the Development Library has a collection-specific classifier called 
 +//HDLList//, which is a minor variant of //AZList//.
 +
 +
 +===== <!-- id:625 -->How classifiers work =====
 +
 +<!-- id:626 -->Classifiers are Perl objects, derived from //BasClas.pm//, and are stored in the //perllib/classify// directory. They are used when the collection is built. When they are executed, the following four steps occur.
 +
 +  - <!-- id:627 -->The //new// method creates the classifier object.
 +  - <!-- id:628 -->The //init// method initialises the object with parameters such as metadata type, button name and sort criterion.
 +  - <!-- id:629 -->The //classify// method is invoked once for each document, and stores information about the classification made within the classifier object.
 +  - <!-- id:630 -->The //get_classify_info// method returns the locally stored classification information to the build process, which then writes it to the collection information database for use when the collection is displayed at runtime.
 +
 +<!-- id:631 -->The //classify// method retrieves each document's OID, the metadata value on which the document is to be classified, and, where necessary, the metadata value on which the documents are to be sorted. The //get_classify_info// method performs all sorting and classifier-specific processing. For example, in the case of the //AZList// classifier, it splits the list into ranges.
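
The four steps above can be sketched as a small class. This is an illustrative Python stand-in, not the Perl API of //BasClas.pm//: the class name ''SimpleListClassifier'', the ''buttonname'' parameter, the shape of the returned dictionary, and the use of a dict per document are assumptions made for the example. Only the method names //init//, //classify// and //get_classify_info// come from the description above; Python's constructor plays the role of //new//.

```python
class SimpleListClassifier:
    """Hypothetical stand-in for a classifier object (not Greenstone's real API)."""

    def __init__(self):
        # Step 1 (the "new" method): create the object with empty state.
        self.metadata = None
        self.buttonname = None
        self.entries = []

    def init(self, metadata=None, buttonname="List"):
        # Step 2: record parameters such as metadata type and button name.
        self.metadata = metadata
        self.buttonname = buttonname
        self.entries = []

    def classify(self, doc):
        # Step 3: called once per document; store its OID and metadata value.
        if self.metadata is None:
            # No metadata argument: keep every document in encounter order.
            self.entries.append((None, doc["oid"]))
        elif doc.get(self.metadata):
            self.entries.append((doc[self.metadata], doc["oid"]))
        # Documents lacking the metadata are left out of the classifier.

    def get_classify_info(self):
        # Step 4: do all sorting here, then hand the result to the build process.
        if self.metadata is not None:
            ordered = sorted(self.entries, key=lambda e: e[0].lower())
        else:
            ordered = self.entries
        return {"title": self.buttonname, "contains": [oid for _, oid in ordered]}


docs = [
    {"oid": "D1", "Title": "Bees"},
    {"oid": "D2"},
    {"oid": "D3", "Title": "ants"},
]

by_title = SimpleListClassifier()
by_title.init(metadata="Title", buttonname="Titles")
for doc in docs:
    by_title.classify(doc)
title_info = by_title.get_classify_info()

everything = SimpleListClassifier()
everything.init()
for doc in docs:
    everything.classify(doc)
all_info = everything.get_classify_info()
```

In this sketch //classify// only collects values, and sorting is deferred to //get_classify_info//, which mirrors the division of work described above. In Greenstone the returned information is written to the collection information database by the build process.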
 +
 +<!-- id:632 -->The build process initialises the classifiers as soon as the //builder// object is created. Classifications are created during the build phase, when the information database is created, by //classify.pm//, which resides in Greenstone's //perllib// directory.
 +
 +===== Dynamic classifiers =====
 +
 +Dynamic classifiers are currently only available in Greenstone 2. If a collection uses SQLite or MSSQL as the collection database, then it can provide dynamic classifiers. These are generated at runtime, so they do not need to be rebuilt every time documents are added. **kjdon-TODO** finish this section.