Greenstone tutorial exercise

Prerequisite: A large collection of HTML files—Tudor
Devised for Greenstone version: 2.60|3.06
Modified for Greenstone version: 2.87|3.08

Formatting the HTML collection—Tudor

  1. Open your tudor collection, go to the Format panel (by clicking on its tab), and select Format Features from the left-hand list. Leave the editing controls at their default values, so that Choose Feature displays All Features and VList is selected as the Affected Component. The text in the HTML Format String box reads as follows:

    <td valign=top>[link][icon][/link]</td>
    <td valign=top>[ex.srclink]{Or}{[ex.thumbicon],[ex.srcicon]} [ex./srclink]</td>
    <td valign=top>[highlight]
    {Or}{[dc.Title],[exp.Title],[ex.Title],Untitled}
    [/highlight]{If}{[ex.Source],<br><i>([ex.Source])</i>}</td>

    This displays something that looks like this:

    A discussion of question five from Tudor Quiz: Henry VIII
    (quizstuff.html)

    for a particular document whose Title metadata is A discussion of question five from Tudor Quiz: Henry VIII and whose Source metadata is quizstuff.html.

    This format appears in the search results list, in the Titles list, and also when you get down to individual documents in the Subjects hierarchy. This is Greenstone's default format statement.

Greenstone's default format statement is complex because it is designed to produce something reasonable under almost any conditions, and also because for practical reasons it needs to be backwards compatible with legacy collections.
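
To make the fallback behaviour of the default statement concrete, here is a minimal Python sketch of how its {Or} and {If} constructs behave. It is a toy model under stated assumptions, not Greenstone's implementation: the function names gs_or, gs_if and default_title_cell are hypothetical, the [highlight] tags are ignored, and the bookshelf entry "Tudor period" is an illustrative value.

```python
# Toy model of two constructs in the default format statement.
# NOT Greenstone's code; all function names here are hypothetical.

def gs_or(metadata, names, literal):
    """Model {Or}{[a],[b],...,literal}: the first non-empty value wins."""
    for name in names:
        value = metadata.get(name, "")
        if value:
            return value
    return literal


def gs_if(metadata, name, then_text, else_text=""):
    """Model {If}{[name],then,else}: test whether the metadata is set."""
    return then_text if metadata.get(name, "") else else_text


def default_title_cell(metadata):
    """Mimic the third <td> of the default statement (without [highlight])."""
    title = gs_or(metadata, ["dc.Title", "exp.Title", "ex.Title"], "Untitled")
    source = metadata.get("ex.Source", "")
    suffix = gs_if(metadata, "ex.Source", "<br><i>(" + source + ")</i>")
    return "<td valign=top>" + title + suffix + "</td>"


# Metadata taken from the example document in this exercise.
document = {
    "ex.Title": "A discussion of question five from Tudor Quiz: Henry VIII",
    "ex.Source": "quizstuff.html",
}
# Assumed example of an entry that has a title but no ex.Source.
bookshelf = {"ex.Title": "Tudor period"}

print(default_title_cell(document))
print(default_title_cell(bookshelf))
print(default_title_cell({}))  # no title metadata at all
```

In this model an entry without ex.Source simply omits the parenthesised filename, and an entry with no title metadata at all falls through to the literal Untitled. Those are the "almost any conditions" that the default statement is written to cope with.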

  1. Delete the contents of the HTML Format String box and replace it with this simpler version:

    <td>[link][icon][/link]</td>
    <td>[ex.Title]<br>
        <i>([ex.Source])</i>
    </td>

    Preview the result (you don't need to build the collection, because changes to format statements take effect immediately). Look at some search results and at the Titles list. They are just the same as before! Under most circumstances this far simpler format statement is entirely equivalent to Greenstone's more complex default.

    But there's a problem. In the Subjects browser, a mysterious "()" appears beneath the name of each bookshelf. What is printed for these bookshelves is governed by the same format statement. Bookshelf nodes of the hierarchy do have Title metadata (their title is the metadata value associated with that bookshelf), but they do not have ex.Source metadata, so [ex.Source] comes out blank and only the surrounding parentheses are printed.

  1. In the Format Features section of the Format panel, the Choose Feature menu (just above the Affected Component menu) displays All Features. That implies that the same format is used for the search results, titles, and all nodes in the subject hierarchy, including internal nodes (that is, bookshelves). The Choose Feature menu can be used to restrict a format statement to just one of these lists. We will override this format statement for the hierarchical subject classifier. In the Choose Feature menu, scroll down to the item that says

    CL2: Hierarchy -metadata dc.Subject and Keywords

    and select it. This is the format statement that affects the second classifier (i.e., "CL2"), which is a Hierarchy classifier based on dc.Subject and Keywords metadata.

    Click <Add Format> to add this format statement to the collection.

    Edit the HTML Format String box below to read

    <td>[link][icon][/link]</td>
    <td>[ex.Title]</td>

  1. Preview the Subjects list in the collection. First, the offending "()" has disappeared from the bookshelves. Second, when you get down to a list of documents in the subject hierarchy, the filename does not appear beside the title, because ex.Source is not specified in the format statement and this format statement applies to all nodes in the subject classifier. Note that the search results and titles lists have not changed: they still display the filename underneath the title.

  1. Let's change the search results format so that dc.Subject and Keywords metadata is displayed there instead of the filename. In the Choose Feature menu (under Format Features on the Format panel), scroll down to the item Search and select it. Click <Add Format> to add this format statement to the collection. Edit the HTML Format String box below to read

    <td>[link][icon][/link]</td>
    <td>[ex.Title]<br>
        [dc.Subject]
    </td>

  1. To insert [dc.Subject], position the cursor at the appropriate point and either type it in or select it from the Insert Variable... drop-down menu. This menu shows many of the things that you can put in square brackets in a format statement.

  1. Preview the collection. Documents in the search results list will be displayed like this:

    A discussion of question five from Tudor Quiz: Henry VIII
    Tudor period|Others
    (The vertical bar appears because this dc.Subject and Keywords metadata is hierarchical metadata. Unfortunately there is no easy way to get at individual components of the hierarchy. For most metadata, such as title and author, this isn't a problem.)

  1. Finally, let's return to the Subjects hierarchy and learn how to modify the bookshelves. In the Choose Feature menu, re-select the item

    CL2: Hierarchy -metadata dc.Subject and Keywords

    Edit the HTML Format String box below to read

    <td>[link][icon][/link]</td>
    <td>{If}{[numleafdocs],<b>Bookshelf title:</b> [ex.Title],
             <b>Title:</b> [ex.Title]}
    </td>

    Again, you can insert the items in square brackets by selecting them from the Insert Variable... drop-down menu.

    The If statement tests the value of the variable numleafdocs. This variable is only set for internal nodes of the hierarchy, i.e. bookshelves, and gives the number of documents below that node. If it is set we take the first branch, otherwise we take the second. Commas are used to separate the branches. The curly brackets serve to indicate that the If is special—otherwise the word "If" itself would be output.

  1. Preview the collection and examine the subject hierarchy again to see the effect of your changes. Bookshelves should say Bookshelf title: and then the title, while documents will display Title: and the title. Note that the number of documents in the bookshelf is not displayed: we are using [numleafdocs] to test what kind of item in the list we are at, but we are not displaying it.
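
The steps above can be summarised in a small Python sketch of the three format statements the collection ends up with. It is a toy model under stated assumptions, not Greenstone's implementation: FORMATS, format_for and render are hypothetical names, only the second table cell of each statement is modelled, the {If} pattern handles only branches without commas or closing braces, and the numleafdocs value 12 is invented for illustration.

```python
import re

# Toy model of the format statements defined in this exercise.
# NOT Greenstone's code; every name below is hypothetical.
FORMATS = {
    "All Features": "[ex.Title]<br><i>([ex.Source])</i>",
    "Search": "[ex.Title]<br>[dc.Subject]",
    "CL2": "{If}{[numleafdocs],<b>Bookshelf title:</b> [ex.Title],"
           "<b>Title:</b> [ex.Title]}",
}

# Matches {If}{[name],then,else}. Simplification: the branches may not
# contain commas or closing braces.
IF_PATTERN = re.compile(r"\{If\}\{\[([^\]]+)\],([^,}]*)(?:,([^}]*))?\}")
# Matches a metadata reference such as [ex.Title] or [numleafdocs].
VAR_PATTERN = re.compile(r"\[([A-Za-z.]+)\]")


def format_for(feature):
    """A feature-specific statement overrides the All Features one."""
    return FORMATS.get(feature, FORMATS["All Features"])


def render(feature, metadata):
    """Resolve {If} first, then substitute metadata (blank if unset)."""
    def pick_branch(match):
        name = match.group(1)
        then_text = match.group(2)
        else_text = match.group(3) or ""
        return then_text if metadata.get(name, "") else else_text

    chosen = IF_PATTERN.sub(pick_branch, format_for(feature))
    return VAR_PATTERN.sub(lambda m: metadata.get(m.group(1), ""), chosen)


document = {
    "ex.Title": "A discussion of question five from Tudor Quiz: Henry VIII",
    "ex.Source": "quizstuff.html",
    "dc.Subject": "Tudor period|Others",
}
# numleafdocs is only set on bookshelves; the value 12 is made up.
bookshelf = {"ex.Title": "Tudor period", "numleafdocs": "12"}

# "Titles" has no statement of its own, so it uses All Features.
for feature, node in [("Titles", document), ("Search", document),
                      ("CL2", document), ("CL2", bookshelf)]:
    print(feature, "->", render(feature, node))
```

Calling render("All Features", bookshelf) in this model reproduces the stray "()" seen earlier, because [ex.Source] is replaced by an empty string. The CL2 entry avoids it by not mentioning ex.Source at all, and it uses [numleafdocs] only as a test, so the count itself is never printed.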


Copyright © 2005-2016 by the New Zealand Digital Library Project at the University of Waikato, New Zealand
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License.”