This version (2014/04/14 11:52) is a draft.

Enriching Your Collection with Metadata

Having gathered several files into the collection, you can now enrich them with additional information called "metadata".

The Enrich Panel

Use the Enrich panel to assign metadata to the documents in the collection. Metadata is data about data – typically title, author, creation date, and so on. Each metadata item has two parts: Element tells what kind of item it is (such as author), and Value gives the value of that metadata element (such as the author's name).

On the left of the Enrich view is the Collection Tree. All the right-click functionality that was available for the Collection Tree in the Gather panel is available here too. To the right is the Metadata Table, which shows metadata for any selected files or folders in the Collection Tree. Columns are named in black at the top, and can be resized by dragging the separating line. If several files or folders are selected, black text indicates that the value is common to all of the selected items, while grey text indicates that it is not. Editing grey values will only affect those documents with that metadata. Any new metadata values entered will be added to all selected items.

A folder icon may appear beside some metadata entries. This indicates that the values are inherited from a parent (or ancestor) folder. Inherited metadata cannot be edited or removed, only appended to. Click on the folder icon to go immediately to the folder where the metadata is assigned.

Clicking on a metadata element in the table will display the existing values for that element in the Existing values area below the table. This "Value Tree" expands and collapses. Usually it is a list that shows all values entered previously for the selected element. Clicking an entry automatically places it into the value field. Conversely, typing in the value field selects the Value Tree entry that starts with the characters you have typed. Pressing [Tab] auto-completes the typing with the selected value.
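The auto-select behaviour described above can be pictured as a prefix search over the existing values. The following is only an illustrative sketch, not the Librarian Interface's own code: the function name select_by_prefix is hypothetical, and it assumes the values are held in a sorted list and compared case-sensitively.

```python
import bisect

def select_by_prefix(sorted_values, typed):
    """Return the first existing value that starts with `typed`, or None."""
    # bisect_left finds the position where `typed` would be inserted,
    # which is also where the first value with that prefix would sit.
    index = bisect.bisect_left(sorted_values, typed)
    if index < len(sorted_values) and sorted_values[index].startswith(typed):
        return sorted_values[index]
    return None

existing = sorted(["Dickens", "Dickinson", "Doyle"])
print(select_by_prefix(existing, "Dic"))  # Dickens
print(select_by_prefix(existing, "Do"))   # Doyle
print(select_by_prefix(existing, "X"))    # None
```

In this picture, pressing [Tab] corresponds to replacing the typed text with the returned value.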

Metadata values can be organized into a hierarchy. This is shown in the Value Tree using folders for internal levels. Hierarchical values can be entered using the character "|" to separate the levels. For example, "Cards|Red|Diamonds|Seven" might be used in a hierarchy that represents a pack of playing cards. This enables values to be grouped together. Groups can also be assigned as metadata to files.
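To make the "|" notation concrete, here is a minimal sketch of how such values could be nested into a tree. The function build_value_tree and the extra card values are hypothetical and used only for illustration; this is not how the Librarian Interface stores its Value Tree.

```python
def build_value_tree(values):
    """Nest '|'-separated metadata values into a dict-of-dicts tree."""
    tree = {}
    for value in values:
        node = tree
        # Each level becomes a key; deeper levels nest inside it.
        for level in value.split("|"):
            node = node.setdefault(level, {})
    return tree

tree = build_value_tree([
    "Cards|Red|Diamonds|Seven",
    "Cards|Red|Hearts|Queen",
    "Cards|Black|Spades|Ace",
])
print(sorted(tree["Cards"]))         # ['Black', 'Red']
print(sorted(tree["Cards"]["Red"]))  # ['Diamonds', 'Hearts']
```

An internal level such as "Cards|Red" is a group in this sense, and it is itself a valid value to assign to a file.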

Metadata Sets

All metadata fields in Greenstone belong to a metadata set, which is simply a pre-defined collection of metadata fields. Because sets will often have metadata fields with the same name (for instance, most sets will have a 'Title' field), namespaces are used to distinguish between metadata from different sets. For instance, all metadata fields in Dublin Core are preceded by dc. (dc.Title, dc.Creator, etc.). Metadata sets are stored in the Librarian Interface's metadata folder and have the suffix ".mds".

The default metadata sets for new collections are Dublin Core (dc), the Greenstone Metadata Set (gs), and the Extracted Greenstone Metadata Set (ex). The Extracted set is unique because its metadata is generated automatically during the collection building process. Because these values are extracted from the documents themselves, they cannot be edited. Metadata fields in the Extracted set can also be referred to without a namespace (so referencing Title is the same as referencing ex.Title).
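The namespace rule can be sketched as a small lookup. This is an illustration under stated assumptions rather than Greenstone's implementation: the function qualify is hypothetical, and the set of known namespaces is limited to the three default sets named above.

```python
# Namespaces of the three default metadata sets (assumed list for this sketch).
KNOWN_NAMESPACES = {"dc", "gs", "ex"}

def qualify(name):
    """Return the namespaced form of a metadata field name."""
    prefix, separator, _ = name.partition(".")
    if separator and prefix in KNOWN_NAMESPACES:
        return name          # already namespaced, e.g. dc.Title
    return "ex." + name      # bare names refer to the Extracted set

print(qualify("Title"))     # ex.Title
print(qualify("ex.Title"))  # ex.Title
print(qualify("dc.Title"))  # dc.Title
```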

You can change which metadata sets are used in a collection by clicking the Manage Metadata Sets button underneath the Collection Tree in the Enrich view. This brings up a new window for managing the collection's metadata sets.

The Current Sets list shows you what sets are currently used by the collection.

To use another metadata set with the loaded collection, click "Add…". A popup window shows you the default metadata sets that GLI knows about. To add one of these, select it from the list and click "Add". If you have defined your own metadata set, you can use the "Browse" button to locate the file on your file system.

To create a new metadata set, click "New…". This will launch the Greenstone Editor for Metadata Sets (GEMS). You can also edit existing metadata sets in GEMS. Clicking the "Edit" button launches GEMS with the specified metadata set open. Once you have finished editing the set (as described above), save it (File → Save) and close GEMS.

Sometimes two metadata sets have the same namespace; for example, Dublin Core and Qualified Dublin Core both use the namespace "dc". Such sets cannot be used in the collection at the same time. If you try to add a set with a namespace already used by the collection, a warning will be shown. If you go ahead, the existing set will be removed and the new one added. Any assigned metadata values will be transferred to the new set, provided that those elements still exist.

If a collection no longer needs a metadata set, select it and press "Remove". If you have assigned any metadata to its elements you will be asked how to deal with this metadata when you next open the collection.

For a list of the metadata sets that come with Greenstone, as well as information on how to create/add new metadata sets, visit the metadata sets page.

Adding metadata

To add metadata to a document, first select the file from the Collection file tree on the left. The action causes any metadata previously assigned to this file to appear in the table at the right.

Next select the metadata element you want to add by clicking its row in the table.

Type the value into the value field. Use the "|" character to add structure, as described in The Enrich Panel. Pressing the [Up] or [Down] arrow keys will save the metadata value and move the selection appropriately. Pressing [Enter] will save the metadata value and create a new empty entry for the metadata element, allowing you to assign multiple values to a metadata element.

You can also add metadata to a folder, or to several selected files at once. It is added to all files within the folder or selection, and to child folders. Keep in mind that if you assign metadata to a folder, any new files in it automatically inherit the folder's values.
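Folder inheritance can be pictured as collecting values along the path from the top-level folder down to the file. The sketch below is a simplified model, not the Librarian Interface's code: the function effective_metadata, the paths, and the values are all hypothetical. It assumes that a file's own values are appended to inherited ones, in line with the rule that inherited metadata can only be appended to.

```python
from pathlib import PurePosixPath

def effective_metadata(path, assigned):
    """Collect values assigned to an item and to each of its ancestor folders."""
    item = PurePosixPath(path)
    # Order the chain from the outermost folder down to the item itself.
    chain = list(item.parents)[::-1] + [item]
    result = {}
    for node in chain:
        for element, values in assigned.get(str(node), {}).items():
            result.setdefault(element, []).extend(values)
    return result

assigned = {
    "reports": {"dc.Publisher": ["Example Press"]},
    "reports/2013/a.pdf": {"dc.Title": ["Annual Report"]},
}
print(effective_metadata("reports/2013/a.pdf", assigned))
# A file added to the folder later picks up the folder's values as well:
print(effective_metadata("reports/2013/new.pdf", assigned))
```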

Adding Previously Defined Metadata

To add metadata that has an existing value, first select the file, then select the metadata element that you are assigning to, then select the required value from the value tree, expanding hierarchy folders as necessary. The value of the selected entry automatically appears in the metadata field (alternatively, use the value tree's auto-select and auto-complete features).

The process of adding metadata with already-existing values to folders or multiple files is just the same.

Editing or Removing Metadata

To edit or remove a piece of metadata, first select the appropriate file, and then the metadata value from the table. Edit the value field, deleting all text if you wish to remove the metadata.

The process is the same when updating a folder with child folders or multiple files, but you can only update metadata that is common to all files/folders selected.

The value tree shows all currently assigned values as well as previous values for the current session, so changed or deleted values will remain in the tree. Closing the collection and then re-opening it will remove all values that are no longer assigned.

Reviewing Assigned Metadata

Sometimes you need to see the metadata assigned to many files at once – for instance, to determine how many files are left to work on, or to get some idea of the spread of dates.

Select the files in the Collection Tree you wish to examine, then right-click and choose "Assigned Metadata…". A window called "All Metadata", dominated by a large table with many columns, appears. The first column shows file names; the rows show all metadata values assigned to those files.

Drawing the table can take some time if many files are selected. You can continue to use the Librarian Interface while the "All Metadata" window is open.

When it gets too large, you can filter the "All Metadata" table by applying filters to the columns. As new filters are added, only those rows that match them remain visible. To set, modify or clear a filter, click on the "funnel" icon at the top of a column. You are prompted for information about the filter. Once a filter is set, the column header changes colour.

The filter prompt has a "Simple" and an "Advanced" tab. The Simple version filters columns so that they only show rows that contain a certain metadata value ("*" matches all values). You can select metadata values from the pull-down list. The Advanced version allows different matching operations: must start with, does not contain, alphabetically less than and is equal to. The value to be matched can be edited to be any string (including "*"), and you can choose whether the matching should be case insensitive. Finally, you can specify a second matching condition that you can use to specify a range of values (by selecting AND) or alternative values (by selecting OR). Below this area is a box that allows you to change the sort order (ascending or descending). Once you have finished, click "Set Filter" to apply the new filter to the column. Click "Clear Filter" to remove a current filter. Note that the filter details are retained even when the filter is cleared.

For example, to sort the "All Metadata" table, choose a column, select the default filter setting (a Simple filter on "*"), and choose ascending or descending ordering.
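The Advanced filter's matching operations and the AND/OR combination can be sketched as follows. This is an illustration only: the function names are hypothetical, "*" is assumed to match every value regardless of the operation, and case-insensitive matching is assumed to be a simple lower-casing of both strings.

```python
OPERATIONS = {
    "must start with": lambda value, target: value.startswith(target),
    "does not contain": lambda value, target: target not in value,
    "alphabetically less than": lambda value, target: value < target,
    "is equal to": lambda value, target: value == target,
}

def matches(value, operation, target, ignore_case=True):
    """Test one value against one matching condition."""
    if target == "*":  # assumption: "*" matches all values
        return True
    if ignore_case:
        value, target = value.lower(), target.lower()
    return OPERATIONS[operation](value, target)

def row_passes(value, first, second=None, combine="AND"):
    """Apply one condition, or two conditions joined by AND or OR."""
    result = matches(value, *first)
    if second is None:
        return result
    other = matches(value, *second)
    return (result and other) if combine == "AND" else (result or other)

dates = ["1998", "2003", "2011"]
narrowed = [d for d in dates
            if row_passes(d, ("alphabetically less than", "2010"),
                          ("must start with", "20"), "AND")]
either = [d for d in dates
          if row_passes(d, ("must start with", "19"),
                        ("is equal to", "2011"), "OR")]
print(narrowed)                 # ['2003']
print(either)                   # ['1998', '2011']
print(sorted(dates, reverse=True))  # descending sort order
```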

Importing Previously Assigned Metadata

This section describes how to import previously assigned metadata: metadata assigned to documents before they were added to the collection.

If a document already has metadata in a form recognized by the Librarian Interface (that is, in a metadata.xml file in the same folder as the document), as is the case when you choose documents from an existing Greenstone collection, the metadata is imported automatically when you add the document to a (new) collection. The metadata.xml file itself does not have to be added to the collection: Greenstone recognizes it and attempts to import the metadata.
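As a rough illustration of what such a file holds, the sketch below reads a small metadata.xml sample with Python's standard library. The sample assumes the DirectoryMetadata / FileSet / FileName / Description / Metadata layout; check a metadata.xml from an existing collection for the exact form. The file name, element names, and values are invented for the example, and read_metadata_xml is a hypothetical helper, not part of Greenstone.

```python
import xml.etree.ElementTree as ET

# Assumed layout of a metadata.xml file; contents are invented for illustration.
SAMPLE = """
<DirectoryMetadata>
  <FileSet>
    <FileName>report.pdf</FileName>
    <Description>
      <Metadata name="dc.Title">Annual Report</Metadata>
      <Metadata name="dc.Creator">A. Author</Metadata>
    </Description>
  </FileSet>
</DirectoryMetadata>
"""

def read_metadata_xml(text):
    """Map each file name to its list of (element, value) pairs."""
    root = ET.fromstring(text.strip())
    result = {}
    for file_set in root.iter("FileSet"):
        pairs = [(meta.get("name"), (meta.text or "").strip())
                 for meta in file_set.iter("Metadata")]
        for file_name in file_set.iter("FileName"):
            result.setdefault(file_name.text, []).extend(pairs)
    return result

metadata = read_metadata_xml(SAMPLE)
print(metadata["report.pdf"])
```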

To import the metadata, the metadata must be mapped to the metadata sets available in the collection. The Librarian Interface prompts for the necessary information. The prompt gives brief instructions and then shows the name of the metadata element that is being imported, just as it appears in the source file. This field cannot be edited or changed. Next you choose what metadata set the new element should map to, and then the appropriate metadata element in that set. The system automatically selects the closest match, in terms of set and element, for the new metadata.

Having checked the mapping, you can choose "Add" to add the new metadata element to the chosen metadata set. (This is only enabled if there is no element of the same name within the chosen set.) "Merge" maps the new element to the one chosen by the user. Finally, "Ignore" does not import any metadata with this element name. Once you have specified how to import a certain piece of metadata, the mapping information is retained for the collection's lifetime.
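The three choices can be summarized in a short sketch. It models a metadata set as a plain dictionary from element name to values, which is a simplification and not the Librarian Interface's data model. The function import_element and the sample element names are hypothetical.

```python
def import_element(name, values, action, metadata_set, target=None):
    """Apply an Add, Merge or Ignore decision for one imported element.

    metadata_set maps element names to lists of values and is updated in place.
    """
    if action == "ignore":
        return  # nothing with this element name is imported
    if action == "add":
        # Add is only available when the set has no element of the same name.
        if name in metadata_set:
            raise ValueError(f"{name!r} already exists in the chosen set")
        metadata_set[name] = list(values)
    elif action == "merge":
        # Merge maps the imported element onto an existing one chosen by the user.
        metadata_set[target].extend(values)
    else:
        raise ValueError(f"unknown action: {action!r}")

chosen_set = {"Title": [], "Creator": []}
import_element("Writer", ["A. Author"], "merge", chosen_set, target="Creator")
import_element("Shelfmark", ["QA76"], "add", chosen_set)
import_element("Checksum", ["abc123"], "ignore", chosen_set)
print(chosen_set)
# {'Title': [], 'Creator': ['A. Author'], 'Shelfmark': ['QA76']}
```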