More about metadata

How to manually specify filenames in metadata.xml
If you are writing your own metadata.xml files to specify which metadata is attached to which folders and files, the <FileName> element must be a regular expression, and any filepaths in it must be in URI format (that is, with forward slashes). Because these filepaths are regular expressions, a backslash escapes a special character: for example, "\." matches a literal full stop.

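An example of a valid metadata.xml file is:

  <!DOCTYPE DirectoryMetadata SYSTEM "http://greenstone.org/dtd/DirectoryMetadata/1.0/DirectoryMetadata.dtd">
  <DirectoryMetadata>
    <FileSet>
      <FileName>pinky/golala/filename1\.txt</FileName>
      <Description>
        <Metadata name="...">Lala</Metadata>
      </Description>
    </FileSet>
    <FileSet>
      <FileName>pinky/nono/filename2\.txt</FileName>
      <Description>
        <Metadata name="...">Nono</Metadata>
      </Description>
    </FileSet>
    <FileSet>
      <FileName>pinky/toto/filename3\.txt</FileName>
      <Description>
        <Metadata name="...">Toto</Metadata>
      </Description>
    </FileSet>
  </DirectoryMetadata>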
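To illustrate why the full stop needs escaping, here is a minimal sketch in Python (3.7 or later). It uses Python's re module rather than Greenstone's own matching code, so it only demonstrates general regular-expression behaviour. The helper names to_filename_pattern and matches are hypothetical, and the use of a whole-string match is an assumption, not a statement of how Greenstone anchors its patterns.

```python
import re


def to_filename_pattern(path: str) -> str:
    """Turn a literal relative path into a forward-slash, regex-escaped pattern.

    Hypothetical helper for illustration; not part of Greenstone.
    """
    # Use forward slashes, as required for URI-style filepaths.
    uri_style = path.replace("\\", "/")
    # Escape regex metacharacters such as "." so they match literally.
    return re.escape(uri_style)


def matches(pattern: str, candidate: str) -> bool:
    """Return True if the whole candidate path matches the pattern (assumed anchoring)."""
    return re.fullmatch(pattern, candidate) is not None


escaped = to_filename_pattern(r"pinky\golala\filename1.txt")
unescaped = "pinky/golala/filename1.txt"

print(escaped)
# The escaped dot only matches a literal full stop.
print(matches(escaped, "pinky/golala/filename1.txt"))
print(matches(escaped, "pinky/golala/filename1_txt"))
# An unescaped dot matches any character, so this unintended file also matches.
print(matches(unescaped, "pinky/golala/filename1_txt"))
```

Without the backslash, a pattern intended for one file can silently attach metadata to other files whose names differ only at the position of the dot.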

Can I get any information about the metadata coverage in my collection?
Metadata coverage statistics can be gathered during collection building by adding the line store_metadata_coverage true to the collection's etc/collect.cfg file. Rebuild the collection (there is no need to re-import); the collection's GDBM database will then contain the following information in the 'collection' entry. The examples are from the demo collection.

 * Which metadata sets have been used in the collection.

dls ex

 * Which elements are present in each metadata set.

<metadatalist-ex>URL
<metadatalist-ex>Plugin
<metadatalist-ex>Encoding
<metadatalist-ex>Language
<metadatalist-ex>SourceFile
<metadatalist-ex>Source
<metadatalist-ex>FileSize
<metadatalist-ex>Title
<metadatalist-dls>Subject
<metadatalist-dls>Language
<metadatalist-dls>Keyword
<metadatalist-dls>Organization
<metadatalist-dls>Title

 * The frequency of each metadata element.

<metadatafreq-dls-Subject>17
<metadatafreq-dls-Title>11
<metadatafreq-dls-Organization>11
<metadatafreq-dls-Keyword>6
<metadatafreq-dls-Language>11
<metadatafreq-ex-SourceFile>11
<metadatafreq-ex-Plugin>11
<metadatafreq-ex-URL>11
<metadatafreq-ex-Title>11
<metadatafreq-ex-Encoding>11
<metadatafreq-ex-FileSize>11
<metadatafreq-ex-Language>11
<metadatafreq-ex-Source>11

Note: to view all the entries in the GDBM database, run:

  db2txt path-to-collection/index/text/collname.gdb > database.txt
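Once the database has been dumped to a text file, the frequency entries can be collected with a short script. The following is a rough sketch in Python, assuming the dump contains entries of the form <metadatafreq-SET-ELEMENT>COUNT as in the demo collection examples above, and that metadata set names contain no hyphen. The function name parse_metadata_freq is hypothetical, and the sample text is copied from those examples rather than read from a real dump.

```python
import re

# Matches entries such as "<metadatafreq-dls-Subject>17".
# Assumption: the metadata set name (e.g. "dls", "ex") contains no hyphen.
FREQ_ENTRY = re.compile(r"<metadatafreq-([^>\-]+)-([^>]+)>(\d+)")


def parse_metadata_freq(text: str) -> dict:
    """Collect element frequencies per metadata set from db2txt-style text."""
    freqs = {}
    for set_name, element, count in FREQ_ENTRY.findall(text):
        freqs.setdefault(set_name, {})[element] = int(count)
    return freqs


# Sample entries taken from the demo collection examples.
sample = """\
<metadatafreq-dls-Subject>17
<metadatafreq-dls-Title>11
<metadatafreq-dls-Keyword>6
<metadatafreq-ex-Title>11
<metadatafreq-ex-Source>11
"""

freqs = parse_metadata_freq(sample)
for set_name, elements in sorted(freqs.items()):
    for element, count in sorted(elements.items()):
        print(f"{set_name}.{element}: {count}")
```

To run it against a real dump, the sample string could be replaced with the contents of database.txt, for example via open("database.txt").read().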