This page is in the 'old' namespace, and was imported from our previous wiki. We recommend checking for more up-to-date information using the search box.
If you write your own metadata.xml files to specify which metadata is attached to which folders and files, the <FileName> element must contain a regular expression, and any file paths in it must be in URI format (that is, with forward slashes). Because these file paths are regular expressions, a backslash escapes a special character; for example, "\." matches a literal full stop.
An example of a valid metadata.xml file is:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE DirectoryMetadata SYSTEM "http://greenstone.org/dtd/DirectoryMetadata/1.0/DirectoryMetadata.dtd">
<DirectoryMetadata>
  <FileSet>
    <FileName>pinky/golala/filename1\.txt</FileName>
    <Description>
      <Metadata name="dc.Title">Lala</Metadata>
    </Description>
  </FileSet>
  <FileSet>
    <FileName>pinky/nono/filename2\.txt</FileName>
    <Description>
      <Metadata name="dc.Title">Nono</Metadata>
    </Description>
  </FileSet>
  <FileSet>
    <FileName>pinky/toto/filename3\.txt</FileName>
    <Description>
      <Metadata name="dc.Title">Toto</Metadata>
    </Description>
  </FileSet>
</DirectoryMetadata>
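To illustrate why the full stop is escaped in the <FileName> patterns, here is a minimal sketch that uses Python's `re` module as a stand-in for the regular-expression matching Greenstone performs. Greenstone itself is written in Perl, and the assumption that the pattern must match the whole path is ours, not something this page confirms. The helper name `matches_fileset` is hypothetical.

```python
import re


def matches_fileset(pattern: str, path: str) -> bool:
    """Return True if the URI-style path matches the FileName pattern.

    Assumption (not confirmed by the Greenstone docs quoted here):
    the pattern has to match the entire path, hence re.fullmatch.
    """
    return re.fullmatch(pattern, path) is not None


# Escaped full stop: only a literal "." is accepted before "txt".
escaped = r"pinky/golala/filename1\.txt"
# Unescaped full stop: "." stands for any single character.
unescaped = r"pinky/golala/filename1.txt"

print(matches_fileset(escaped, "pinky/golala/filename1.txt"))    # True
print(matches_fileset(unescaped, "pinky/golala/filename1_txt"))  # True
print(matches_fileset(escaped, "pinky/golala/filename1_txt"))    # False
```

The second call shows the risk of leaving the full stop unescaped: the pattern also accepts a file whose name merely has some character in that position.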
Metadata coverage statistics can be gathered during collection building by adding the line store_metadata_coverage true to the collection's etc/collect.cfg file. Then rebuild the collection (there is no need to re-import). After the rebuild, the collection's GDBM database will contain the following information in the 'collection' entry. The examples are from the demo collection.
<metadataset>dls
<metadataset>ex

<metadatalist-ex>URL
<metadatalist-ex>Plugin
<metadatalist-ex>Encoding
<metadatalist-ex>Language
<metadatalist-ex>SourceFile
<metadatalist-ex>Source
<metadatalist-ex>FileSize
<metadatalist-ex>Title
<metadatalist-dls>Subject
<metadatalist-dls>Language
<metadatalist-dls>Keyword
<metadatalist-dls>Organization
<metadatalist-dls>Title

<metadatafreq-dls-Subject>17
<metadatafreq-dls-Title>11
<metadatafreq-dls-Organization>11
<metadatafreq-dls-Keyword>6
<metadatafreq-dls-Language>11
<metadatafreq-ex-SourceFile>11
<metadatafreq-ex-Plugin>11
<metadatafreq-ex-URL>11
<metadatafreq-ex-Title>11
<metadatafreq-ex-Encoding>11
<metadatafreq-ex-FileSize>11
<metadatafreq-ex-Language>11
<metadatafreq-ex-Source>11
Note: to view all the entries in the GDBM database, run:
db2txt path-to-collection/index/text/collname.gdb > database.txt
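Once the database has been dumped to a text file, the frequency entries can be gathered into a table with a short script. The following is a minimal sketch in Python, assuming the dump contains lines in the same `<metadatafreq-SET-ELEMENT>COUNT` form as the examples above and that metadata set names contain no hyphen. The function name `parse_metadata_freq` is hypothetical and not part of Greenstone.

```python
import re
from collections import defaultdict

# Matches lines such as "<metadatafreq-dls-Subject>17".
# Assumption: the metadata set name (e.g. "dls", "ex") contains no hyphen.
FREQ_LINE = re.compile(r"^<metadatafreq-([^-]+)-(.+)>(\d+)$")


def parse_metadata_freq(lines):
    """Collect {metadata set: {element: count}} from dumped database lines."""
    freq = defaultdict(dict)
    for line in lines:
        match = FREQ_LINE.match(line.strip())
        if match:
            set_name, element, count = match.groups()
            freq[set_name][element] = int(count)
    return dict(freq)


# A few lines copied from the demo collection example; other entry
# types are ignored by the parser.
sample = [
    "<metadataset>dls",
    "<metadatalist-dls>Subject",
    "<metadatafreq-dls-Subject>17",
    "<metadatafreq-dls-Keyword>6",
    "<metadatafreq-ex-Title>11",
]

coverage = parse_metadata_freq(sample)
for set_name, elements in coverage.items():
    for element, count in elements.items():
        print(f"{set_name}.{element}: {count}")
```

To run it against a real dump, pass an open file instead of the sample list, for example `parse_metadata_freq(open("database.txt"))`.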