legacy:manuals:en:user:making_greenstone_collections

Differences

This shows you the differences between two versions of the page.

legacy:manuals:en:user:making_greenstone_collections [2018/06/11 22:19] kjdon legacy:manuals:en:user:making_greenstone_collections [2023/03/13 01:46] (current) – external edit 127.0.0.1
Line 1: Line 1:
-====== <!-- id:153 -->Making Greenstone Collections ====== 
  
-<!-- id:154 -->The simplest way to build new collections is to use Greenstone's “librarian” interface (GLI). This allows you to collect sets of documents, import or assign metadata, and build them into a Greenstone collection. It supports five basic activities, which can be interleaved but are nominally undertaken in this order: 
  
-  - <!-- id:155 -->Copy documents from the computer's file space, including existing collections, into the new collection. Any existing metadata remains “attached” to these documents. Documents may also be gathered from the web through a built-in mirroring facility. 
-  - <!-- id:156 -->Enrich the documents by adding further metadata to individual documents or groups of documents. 
-  - <!-- id:157 -->Design the collection by determining its appearance and the access facilities that it will support. 
-  - <!-- id:158 -->Build the collection using Greenstone. 
-  - <!-- id:159 -->Preview the newly created collection, which will have been installed on your Greenstone home page as one of the regular collections. 
  
-<!-- id:160 -->The librarian interface allows you to add what people call “external” metadata to documents, metadata that pertains to the document as a whole. But documents often need to be structured into sections and subsections, and “internal” metadata might be associated with each part. In Greenstone, source documents can be tagged with this information, and we explain this in Section [[#tagging_document_files|tagging_document_files]].+====== Making Greenstone Collections ======
  
-<!-- id:161 -->Finally, an alternative way of building collections is provided by the Collector, which helps you create new collections, modify or add to existing ones, or delete collections. It predates the librarian interface, and for most practical purposes the librarian interface should be used instead of the Collector. It is described in Section [[#the_collector|the_collector]].+The simplest way to build new collections is to use Greenstone's “librarian” interface (GLI). This allows you to collect sets of documents, import or assign metadata, and build them into a Greenstone collection. It supports five basic activities, which can be interleaved but are nominally undertaken in this order:
  
-<!-- id:162 -->To harness the full power of Greenstone to build advanced collections, you will also need to read Chapter [[?do=search&id=getting_the_most_out_of_your_documents @en:manuals:Develop|getting_the_most_out_of_your_documents]] of the //Developer's Guide//.+  - Copy documents from the computer's file space, including existing collections, into the new collection. Any existing metadata remains “attached” to these documents. Documents may also be gathered from the web through a built-in mirroring facility. 
 +  - Enrich the documents by adding further metadata to individual documents or groups of documents. 
 +  - Design the collection by determining its appearance and the access facilities that it will support. 
 +  - Build the collection using Greenstone. 
 +  - Preview the newly created collection, which will have been installed on your Greenstone home page as one of the regular collections.
  
-===== <!-- id:163 -->The librarian's interface =====+The librarian interface allows you to add what people call “external” metadata to documents, metadata that pertains to the document as a whole. But documents often need to be structured into sections and subsections, and “internal” metadata might be associated with each part. In Greenstone, source documents can be tagged with this information, and we explain this in Section [[#tagging_document_files|tagging_document_files]].
  
-<!-- id:164 -->To convey the operation of Greenstone's librarian interface, we work through a simple example. Figures <imgref figure_starting_a_new_collection> to <imgref figure_previewing_the_newly_built_collection> are screen snapshots at various points during the interaction. This example uses documents in the Development Library Subset (DLS) collection, which is distributed with Greenstone. For expository purposes, the walkthrough takes the form of a single pass through the steps listed above. A more realistic pattern of use, however, is for users to switch back and forth through the various stages as the task proceeds.+Finally, an alternative way of building collections is provided by the Collector, which helps you create new collections, modify or add to existing ones, or delete collections. It predates the librarian interface, and for most practical purposes the librarian interface should be used instead of the Collector. It is described in Section [[#the_collector|the_collector]].
  
-<!-- id:165 -->The librarian interface can be run in one of four modes: Librarian Assistant, Librarian, Library Systems Specialist, and Expert. Modes control the level of detail within the interface, and can be changed through 'Preferences' in the 'File' menu. The walkthrough in this section assumes that the librarian interface is operating in the default mode, Librarian.+To harness the full power of Greenstone to build advanced collections, you will also need to read Chapter [[?do=search&id=getting_the_most_out_of_your_documents @en:manuals:Develop|getting_the_most_out_of_your_documents]] of the //Developer's Guide//.
  
-==== <!-- id:166 -->Getting started ====+===== The librarian's interface =====
  
-<!-- id:167 -->Launch the librarian interface under Windows by selecting //Greenstone Digital Library// from the //Programs// section of the //Start// menu and choosing //Librarian Interface//. If you are using Unix, instead type+To convey the operation of Greenstone's librarian interface, we work through a simple example. Figures <imgref figure_starting_a_new_collection> to <imgref figure_previewing_the_newly_built_collection> are screen snapshots at various points during the interaction. This example uses documents in the Development Library Subset (DLS) collection, which is distributed with Greenstone. For expository purposes, the walkthrough takes the form of a single pass through the steps listed above. A more realistic pattern of use, however, is for users to switch back and forth through the various stages as the task proceeds. 
 + 
 +The librarian interface can be run in one of four modes: Librarian Assistant, Librarian, Library Systems Specialist, and Expert. Modes control the level of detail within the interface, and can be changed through 'Preferences' in the 'File' menu. The walkthrough in this section assumes that the librarian interface is operating in the default mode, Librarian. 
 + 
 +==== Getting started ==== 
 + 
 +Launch the librarian interface under Windows by selecting //Greenstone Digital Library// from the //Programs// section of the //Start// menu and choosing //Librarian Interface//. If you are using Unix, instead type
  
 <code> <code>
Line 31: Line 34:
 </code> </code>
  
-<!-- id:168 -->where //~/gsdl// is the directory containing your Greenstone system. To begin, you must either open an existing collection or start a new one. Figure <imgref figure_starting_a_new_collection> shows the user in the process of starting a new collection. She has selected //New// from the file menu and begun to fill out general information about the collection—its title, the E-mail address of the person responsible for it, and a brief description of the content—in the popup window. The collection title is a short phrase used throughout the digital library to identify the collection's content: existing collections have names like //Food and Nutrition Library//, //World Environmental Library//, and so on. When you type the title, the system assigns a unique mnemonic identifier, the collection “name”, for internal use (you can change it if you like). The E-mail address specifies the first point of contact for any problems encountered with the collection.+where //~/gsdl// is the directory containing your Greenstone system. To begin, you must either open an existing collection or start a new one. Figure <imgref figure_starting_a_new_collection> shows the user in the process of starting a new collection. She has selected //New// from the file menu and begun to fill out general information about the collection—its title, the E-mail address of the person responsible for it, and a brief description of the content—in the popup window. The collection title is a short phrase used throughout the digital library to identify the collection's content: existing collections have names like //Food and Nutrition Library//, //World Environmental Library//, and so on. When you type the title, the system assigns a unique mnemonic identifier, the collection “name”, for internal use (you can change it if you like). The E-mail address specifies the first point of contact for any problems encountered with the collection.
  
-<!-- id:169 -->The brief description is a statement describing the principles that govern what is included in the collection. It appears under the heading //About this collection// on the collection's initial page.+The brief description is a statement describing the principles that govern what is included in the collection. It appears under the heading //About this collection// on the collection's initial page.
  
 <imgcaption figure_starting_a_new_collection|%!-- id:170 --%Starting a new collection ></imgcaption> <imgcaption figure_starting_a_new_collection|%!-- id:170 --%Starting a new collection ></imgcaption>
Line 41: Line 44:
 {{..:images:user_fig_5.png?407x318&direct}} {{..:images:user_fig_5.png?407x318&direct}}
  
-<!-- id:172 -->At this point, the user decides whether to base the new collection on the same structure as an existing collection, or to build an entirely new kind of collection. In Figure <imgref figure_starting_a_new_collection> she has chosen to base it on the //Development Library Subset// collection. This implies that the “DLS” metadata set which is used in this collection will be used for the new collection. (In fact, this metadata set has been used to build several Greenstone collections that share a common structure and organization but with different content, including the //Development Library Subset// and //Demo// collections delivered as samples with Greenstone.)+At this point, the user decides whether to base the new collection on the same structure as an existing collection, or to build an entirely new kind of collection. In Figure <imgref figure_starting_a_new_collection> she has chosen to base it on the //Development Library Subset// collection. This implies that the “DLS” metadata set which is used in this collection will be used for the new collection. (In fact, this metadata set has been used to build several Greenstone collections that share a common structure and organization but with different content, including the //Development Library Subset// and //Demo// collections delivered as samples with Greenstone.)
  
-<!-- id:173 -->The DLS metadata set contains these items:+The DLS metadata set contains these items:
  
-  * <!-- id:174 -->Title +  * Title 
-  * <!-- id:175 -->Subject +  * Subject 
-  * <!-- id:176 -->Language +  * Language 
-  * <!-- id:177 -->Organization +  * Organization 
-  * <!-- id:178 -->Keyword (i.e. “Howto”).+  * Keyword (i.e. “Howto”).
  
-<!-- id:179 -->(There is, in addition, a metadata item called //AZList// which is used to determine which bucket of the alphabetic list contains the document's title, with values like “A-B” or “C-D-E”. This is used to give precise control over the divisions in the list. For most other collections it is absent, and Greenstone assigns the buckets itself.)+(There is, in addition, a metadata item called //AZList// which is used to determine which bucket of the alphabetic list contains the document's title, with values like “A-B” or “C-D-E”. This is used to give precise control over the divisions in the list. For most other collections it is absent, and Greenstone assigns the buckets itself.)
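
As a rough illustration of how a bucket label of the form “A-B” or “C-D-E” relates to a document title, the following Python sketch looks up the bucket whose letters include the title's first letter. It is not Greenstone's code: the function name ''az_bucket'', the bucket list, and the sample title are all hypothetical.

```python
def az_bucket(title, buckets):
    """Return the bucket label whose letters include the title's first letter.

    `buckets` is a list of labels such as "A-B" or "C-D-E" (hypothetical
    layout, modelled on the AZList values quoted in the text).
    Returns None when no bucket matches.
    """
    first = title.strip()[:1].upper()
    for label in buckets:
        # "C-D-E" -> ["C", "D", "E"]
        if first in label.split("-"):
            return label
    return None


# Hypothetical bucket divisions and title, for illustration only.
buckets = ["A-B", "C-D-E"]
print(az_bucket("Butterfly Farming", buckets))
```

Keeping the divisions in an explicit list mirrors the point made above: the collection builder, not the software, decides where one bucket ends and the next begins.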
  
-<!-- id:180 -->If, instead, the user had chosen “New Collection” at this point, she would have been asked to select what metadata sets should be used in the new collection. Three standard sets are pre-supplied: Dublin Core, the DLS metadata set mentioned above, and a set that comprises metadata elements extracted automatically by Greenstone from the documents in the collection. The user can also create new metadata sets using a popup panel activated through the “metadata” menu.+If, instead, the user had chosen “New Collection” at this point, she would have been asked to select what metadata sets should be used in the new collection. Three standard sets are pre-supplied: Dublin Core, the DLS metadata set mentioned above, and a set that comprises metadata elements extracted automatically by Greenstone from the documents in the collection. The user can also create new metadata sets using a popup panel activated through the “metadata” menu.
  
-<!-- id:181 -->Several different metadata sets can be associated with the same collection; the system keeps them distinct (so that, for example, documents can have both a Dublin Core //Title// and a DLS //Title//). The different sets are clearly distinguished in the interface. Behind the scenes, metadata sets are represented in XML.+Several different metadata sets can be associated with the same collection; the system keeps them distinct (so that, for example, documents can have both a Dublin Core //Title// and a DLS //Title//). The different sets are clearly distinguished in the interface. Behind the scenes, metadata sets are represented in XML.
  
-==== <!-- id:182 -->Assembling the source material ====+==== Assembling the source material ====
  
-<!-- id:183 -->After clicking the //OK// button on the “new collection” popup, the remaining parts of the interface, which were grayed out before, become active. The //Gather// panel, selected by the eponymous tab near the top of Figure <imgref figure_starting_a_new_collection>, is displayed initially. This allows the user to explore the local file space and existing collections, gathering up selected documents for the new collection. The panel is divided into two sections, the left for browsing existing structures and the right for the documents in the collection.+After clicking the //OK// button on the “new collection” popup, the remaining parts of the interface, which were grayed out before, become active. The //Gather// panel, selected by the eponymous tab near the top of Figure <imgref figure_starting_a_new_collection>, is displayed initially. This allows the user to explore the local file space and existing collections, gathering up selected documents for the new collection. The panel is divided into two sections, the left for browsing existing structures and the right for the documents in the collection.
  
-<!-- id:184 -->Operations available at this stage include:+Operations available at this stage include:
  
-  * <!-- id:185 -->Navigating the existing file structure hierarchy, and the one being created, in the usual way. +  * Navigating the existing file structure hierarchy, and the one being created, in the usual way. 
-  * <!-- id:186 -->Dragging and dropping files into the new collection. +  * Dragging and dropping files into the new collection. 
-  * <!-- id:187 -->Multiple selection of files. +  * Multiple selection of files. 
-  * <!-- id:188 -->Dragging and dropping entire sub-hierarchies. +  * Dragging and dropping entire sub-hierarchies. 
-  * <!-- id:189 -->Deleting documents from the nascent collection. +  * Deleting documents from the nascent collection. 
-  * <!-- id:190 -->Creating new sub-hierarchies within the collection. +  * Creating new sub-hierarchies within the collection. 
-  * <!-- id:191 -->Filtering the files that are visible, in both the local file system and the collection, based on predetermined groups or on standard file matching terms. +  * Filtering the files that are visible, in both the local file system and the collection, based on predetermined groups or on standard file matching terms. 
-  * <!-- id:192 -->Invoking the appropriate program to display the contents of a selected file, by double-clicking it.+  * Invoking the appropriate program to display the contents of a selected file, by double-clicking it.
  
-<!-- id:193 -->Care is taken to deal appropriately with name clashes when files of the same name in different parts of the computer's directory structure are copied into the same folder of the collection.+Care is taken to deal appropriately with name clashes when files of the same name in different parts of the computer's directory structure are copied into the same folder of the collection.
  
-<!-- id:194 -->In Figure <imgref figure_exploring_the_local_file_space> the user is using the interactive file tree display to explore the local file system. At this stage, the collection on the right is empty; the user populates it by dragging and dropping files of interest from the left to the right panel. Such files are “copied” rather than “moved”, so as not to disturb the original file system. The usual techniques for multiple selection, dragging and dropping, structuring the new collection by creating subdirectories (“folders”), and deleting files from it by moving them to a trashcan, are all available.+In Figure <imgref figure_exploring_the_local_file_space> the user is using the interactive file tree display to explore the local file system. At this stage, the collection on the right is empty; the user populates it by dragging and dropping files of interest from the left to the right panel. Such files are “copied” rather than “moved”, so as not to disturb the original file system. The usual techniques for multiple selection, dragging and dropping, structuring the new collection by creating subdirectories (“folders”), and deleting files from it by moving them to a trashcan, are all available.
  
-<!-- id:195 -->Existing collections are represented by a subdirectory on the left called “Greenstone Collections,” which can be opened and explored like any other directory. However, the documents therein differ from ordinary files because they already have metadata attached, and this is preserved when they are moved into the new collection. Conflicts may arise because their metadata may have been assigned using a different metadata set from the one in use for the new collection, and the user must resolve these. In Figure <imgref figure_importing_existing_metadata> the user has selected some documents from an existing collection and dragged them into the new one. The popup window explains that the metadata element //Organization// cannot be automatically imported, and asks the user to either select a metadata set and press //Add// to add the metadata element to that set((<!-- id:778 -->This option is disabled if an element of the same name already exists.)), or choose a metadata set, then an element, and press //Merge// to effectively rename the old metadata element to the new one by merging the two. Metadata in subsequent documents from the same collection will automatically be handled in the same way.+Existing collections are represented by a subdirectory on the left called “Greenstone Collections,” which can be opened and explored like any other directory. However, the documents therein differ from ordinary files because they already have metadata attached, and this is preserved when they are moved into the new collection. Conflicts may arise because their metadata may have been assigned using a different metadata set from the one in use for the new collection, and the user must resolve these. In Figure <imgref figure_importing_existing_metadata> the user has selected some documents from an existing collection and dragged them into the new one. 
The popup window explains that the metadata element //Organization// cannot be automatically imported, and asks the user to either select a metadata set and press //Add// to add the metadata element to that set((This option is disabled if an element of the same name already exists.)), or choose a metadata set, then an element, and press //Merge// to effectively rename the old metadata element to the new one by merging the two. Metadata in subsequent documents from the same collection will automatically be handled in the same way.
  
-<!-- id:196 -->When large file sets are selected, dragged, and dropped into the new collection, the copying operation may take some time—particularly if metadata conversion is involved. To indicate progress, the interface shows which file is being copied and what percentage of files has been processed.+When large file sets are selected, dragged, and dropped into the new collection, the copying operation may take some time—particularly if metadata conversion is involved. To indicate progress, the interface shows which file is being copied and what percentage of files has been processed.
  
-<!-- id:197 -->Special facilities are provided for dealing with large file sets. For example, the user can choose to filter the file tree to show only certain files, using a dropdown menu of file types displayed underneath the trees. In Figure <imgref figure_filtering_the_file_trees>, only the HTM and HTML files are being shown (and only these files will be copied by drag and drop).+Special facilities are provided for dealing with large file sets. For example, the user can choose to filter the file tree to show only certain files, using a dropdown menu of file types displayed underneath the trees. In Figure <imgref figure_filtering_the_file_trees>, only the HTM and HTML files are being shown (and only these files will be copied by drag and drop).
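
The effect of such a filter can be sketched in a few lines of Python. This is only an illustration of matching file names against “standard file matching terms”, not the librarian interface's own code; the function name ''filter_files'' and the sample file names are hypothetical.

```python
from fnmatch import fnmatch


def filter_files(names, patterns):
    """Keep only the names that match at least one wildcard pattern.

    Names are lower-cased before matching, so "page.HTM" matches "*.htm".
    Patterns are assumed to be written in lower case.
    """
    return [
        name
        for name in names
        if any(fnmatch(name.lower(), pattern) for pattern in patterns)
    ]


# Hypothetical directory listing, for illustration only.
files = ["index.html", "notes.txt", "page.HTM", "cover.png"]
print(filter_files(files, ["*.htm", "*.html"]))
```

Only the names that pass the filter would then be candidates for copying, which matches the behaviour described above for drag and drop.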
  
-==== <!-- id:198 -->Enriching the documents ====+==== Enriching the documents ====
  
-<!-- id:199 -->The next phase in collection building is to enrich the documents by adding metadata. The //Enrich// tab brings up a new panel of information (Figure <imgref figure_assigning_metadata_using_enrich_view>), which shows the document tree representing the collection on the left and on the right allows metadata to be added to individual documents, or groups of documents.+The next phase in collection building is to enrich the documents by adding metadata. The //Enrich// tab brings up a new panel of information (Figure <imgref figure_assigning_metadata_using_enrich_view>), which shows the document tree representing the collection on the left and on the right allows metadata to be added to individual documents, or groups of documents.
  
-<!-- id:200 -->Documents that are copied during the first step come with any applicable metadata attached. If a document is part of a Greenstone collection, previously defined metadata is carried over to the new collection. Of course, this new collection may have a different metadata set, or perhaps just a subset of the defined metadata, and only metadata that pertains to the new collection's set is carried over. Resolution of such conflicts may require user intervention via a supplementary dialog (Figure <imgref figure_importing_existing_metadata>). Any choices made are remembered for subsequent file copies.+Documents that are copied during the first step come with any applicable metadata attached. If a document is part of a Greenstone collection, previously defined metadata is carried over to the new collection. Of course, this new collection may have a different metadata set, or perhaps just a subset of the defined metadata, and only metadata that pertains to the new collection's set is carried over. Resolution of such conflicts may require user intervention via a supplementary dialog (Figure <imgref figure_importing_existing_metadata>). Any choices made are remembered for subsequent file copies.
  
-<!-- id:201 -->The //Enrich// panel allows metadata values to be assigned to documents in the collection. For example, new values can be added to the set of existing values for an element. If the element's values have a hierarchical structure, the hierarchy can be extended in the same way.+The //Enrich// panel allows metadata values to be assigned to documents in the collection. For example, new values can be added to the set of existing values for an element. If the element's values have a hierarchical structure, the hierarchy can be extended in the same way.
  
 <imgcaption figure_importing_existing_metadata|%!-- id:202 --%Importing existing metadata ></imgcaption> <imgcaption figure_importing_existing_metadata|%!-- id:202 --%Importing existing metadata ></imgcaption>
Line 102: Line 105:
 {{..:images:user_fig_9.png?407x317&direct}} {{..:images:user_fig_9.png?407x317&direct}}
  
-<!-- id:206 -->Metadata values can also be assigned to folders, in just the same way. Documents in these folders for which this metadata is unspecified inherit the metadata values. However, they can subsequently be overridden by supplying different ones for the document itself.+Metadata values can also be assigned to folders, in just the same way. Documents in these folders for which this metadata is unspecified inherit the metadata values. However, they can subsequently be overridden by supplying different ones for the document itself.
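
The inheritance rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Greenstone's implementation: the function ''effective_metadata'', the dictionary layout, the file names, and the value “Other Publisher” are all hypothetical; only the folder //ec121e// and the value “EC Courier” come from the walkthrough.

```python
from pathlib import PurePosixPath


def effective_metadata(path, assigned):
    """Return the metadata that applies to `path`.

    `assigned` maps a folder or document path to the metadata set
    directly on it. Values set on enclosing folders are inherited;
    values set closer to the document override them.
    """
    target = PurePosixPath(path)
    # Walk from the outermost folder down to the document itself,
    # so that more specific assignments overwrite inherited ones.
    chain = list(reversed(target.parents)) + [target]
    result = {}
    for node in chain:
        result.update(assigned.get(str(node), {}))
    return result


# Hypothetical assignments: one on a folder, one on a single document.
assigned = {
    "ec121e": {"Organization": "EC Courier"},
    "ec121e/special.htm": {"Organization": "Other Publisher"},
}

print(effective_metadata("ec121e/sub/page.htm", assigned))
print(effective_metadata("ec121e/special.htm", assigned))
```

The first lookup inherits the folder's value through a nested folder; the second shows a document-level value taking precedence over the inherited one.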
  
-<!-- id:207 -->Operations at this stage include:+Operations at this stage include:
  
-  * <!-- id:208 -->Assigning new and existing metadata values to documents. +  * Assigning new and existing metadata values to documents. 
-  * <!-- id:209 -->Assigning metadata to an individual document. +  * Assigning metadata to an individual document. 
-  * <!-- id:210 -->Assigning metadata to a folder (this is inherited by all documents in the folder, including those in nested folders). +  * Assigning metadata to a folder (this is inherited by all documents in the folder, including those in nested folders). 
-  * <!-- id:211 -->Assigning hierarchical metadata, whose structure can be dynamically updated if required. +  * Assigning hierarchical metadata, whose structure can be dynamically updated if required. 
-  * <!-- id:212 -->Editing or updating assigned metadata. +  * Editing or updating assigned metadata. 
-  * <!-- id:213 -->Reviewing the metadata assigned to a selection of files and directories.+  * Reviewing the metadata assigned to a selection of files and directories.
  
-<!-- id:214 -->For our walkthrough example, in Figure <imgref figure_assigning_metadata_using_enrich_view> the user has selected the folder //ec121e// and assigned “EC Courier” as its //Organization// metadata. The buttons for updating and removing metadata become active depending on what selections have been made.+For our walkthrough example, in Figure <imgref figure_assigning_metadata_using_enrich_view> the user has selected the folder //ec121e// and assigned “EC Courier” as its //Organization// metadata. The buttons for updating and removing metadata become active depending on what selections have been made.
  
-<!-- id:215 -->During the enrichment phase, or indeed at any other time, the user can choose to view all the metadata that has been assigned to documents in the collection. This is done by selecting a set of documents and choosing //Assigned Metadata// from the metadata sets menu, which brings up a popup window like that in Figure <imgref figure_viewing_all_metadata_for_selected_files> that shows the metadata in spreadsheet form. For large collections it is useful to be able to view the metadata associated with certain document types only, and if the user has specified a file filter as mentioned above, only the selected documents are shown in the metadata display.+During the enrichment phase, or indeed at any other time, the user can choose to view all the metadata that has been assigned to documents in the collection. This is done by selecting a set of documents and choosing //Assigned Metadata// from the metadata sets menu, which brings up a popup window like that in Figure <imgref figure_viewing_all_metadata_for_selected_files> that shows the metadata in spreadsheet form. For large collections it is useful to be able to view the metadata associated with certain document types only, and if the user has specified a file filter as mentioned above, only the selected documents are shown in the metadata display.
  
-<!-- id:216 -->The panel in Figure <imgref figure_editing_the_metadata_set> allows the user to edit metadata sets. Here, the user is looking at the //Subject// element of the DLS set. The values of this element form a hierarchy, and the user is examining, and perhaps changing, the list of values assigned to it. The same panel also allows you to change the “profile” for mapping elements of one metadata set to another. This profile is created when importing documents from collections that have pre-assigned metadata.+The panel in Figure <imgref figure_editing_the_metadata_set> allows the user to edit metadata sets. Here, the user is looking at the //Subject// element of the DLS set. The values of this element form a hierarchy, and the user is examining, and perhaps changing, the list of values assigned to it. The same panel also allows you to change the “profile” for mapping elements of one metadata set to another. This profile is created when importing documents from collections that have pre-assigned metadata.
  
 <imgcaption figure_editing_the_metadata_set|%!-- id:217 --%Editing the metadata set ></imgcaption> <imgcaption figure_editing_the_metadata_set|%!-- id:217 --%Editing the metadata set ></imgcaption>
Line 131: Line 134:
 {{..:images:user_fig_13.png?407x317&direct}} {{..:images:user_fig_13.png?407x317&direct}}
  
-==== <!-- sid:designing_the_collection_1 --><!-- id:221 -->Designing the collection ====+==== Designing the collection ====
  
-<!-- id:222 -->The //Design// panel (Figures <imgref figure_designing_the_collection>—<imgref figure_configuring_arguments_to_a_plug-in>) allows one to specify the structure, organization, and presentation of the collection being created. As noted earlier, the result of this process is recorded in a “collection configuration file,” which is Greenstone's way of expressing the facilities that a collection requires. This step involves a series of separate interaction screens, each dealing with one aspect of the collection design. In effect, it serves as a graphical equivalent to the usual process of editing the configuration file manually.+The //Design// panel (Figures <imgref figure_designing_the_collection>—<imgref figure_configuring_arguments_to_a_plug-in>) allows one to specify the structure, organization, and presentation of the collection being created. As noted earlier, the result of this process is recorded in a “collection configuration file,” which is Greenstone's way of expressing the facilities that a collection requires. This step involves a series of separate interaction screens, each dealing with one aspect of the collection design. In effect, it serves as a graphical equivalent to the usual process of editing the configuration file manually.
  
-<!-- id:223 -->Operations include:+Operations include:
  
-  * <!-- id:224 -->Reviewing and editing collection-level metadata such as title, author and public availability of the collection. +  * Reviewing and editing collection-level metadata such as title, author and public availability of the collection. 
-  * <!-- id:225 -->Defining what full-text indexes are to be built. +  * Defining what full-text indexes are to be built. 
-  * <!-- id:226 -->Creating sub-collections and having indexes built for them. +  * Creating sub-collections and having indexes built for them. 
-  * <!-- id:227 -->Adding or removing support for predefined interface languages. +  * Adding or removing support for predefined interface languages. 
-  * <!-- id:228 -->Constructing a list of plug-ins to be used, and their arguments. +  * Constructing a list of plug-ins to be used, and their arguments. 
-  * <!-- id:229 -->Presenting the list to the user for review and modification. +  * Presenting the list to the user for review and modification. 
-  * <!-- id:230 -->Configuring individual plug-ins. +  * Configuring individual plug-ins. 
-  * <!-- id:231 -->Constructing a list of “classifiers,” their arguments, assignment and configuration. +  * Constructing a list of “classifiers,” their arguments, assignment and configuration. 
-  * <!-- id:232 -->Assigning formatting strings to various controls within the collection, thus altering its appearance. +  * Assigning formatting strings to various controls within the collection, thus altering its appearance. 
-  * <!-- id:233 -->Reviewing the metadata sets, and their elements, used in the collection.+  * Reviewing the metadata sets, and their elements, used in the collection.
  
-<!-- id:234 -->In Figure <imgref figure_designing_the_collection> the user has clicked the //Design// tab and is reviewing the general information about the collection, entered when the new collection was created. On the left are listed the various facets that the user can configure: General, Document Plug-ins, Search Types, Search Indexes, Partition Indexes, Cross-Collection Search, Browsing Classifiers, Format Features, Translate Text, Metadata Sets. Appearance and functionality vary between these. For example, clicking the //Plug-in// button brings up the screen shown in Figure <imgref figure_specifying_which_plug-ins_to_use>, which allows you to add, remove or configure plug-ins, and change the order in which the plug-ins are applied to documents.+In Figure <imgref figure_designing_the_collection> the user has clicked the //Design// tab and is reviewing the general information about the collection, entered when the new collection was created. On the left are listed the various facets that the user can configure: General, Document Plug-ins, Search Types, Search Indexes, Partition Indexes, Cross-Collection Search, Browsing Classifiers, Format Features, Translate Text, Metadata Sets. Appearance and functionality vary between these. For example, clicking the //Plug-in// button brings up the screen shown in Figure <imgref figure_specifying_which_plug-ins_to_use>, which allows you to add, remove or configure plug-ins, and change the order in which the plug-ins are applied to documents.
  
-<!-- id:235 -->Plug-ins and classifiers have many different arguments or “options” that the user can supply. The dialog box in Figure <imgref figure_configuring_arguments_to_a_plug-in> shows the user specifying arguments to some of the plug-ins. The grayed-out fields become active when the user adds the option by clicking the tick-box beside it. Because Greenstone is a continually growing open-source system, the number of options tends to increase as developers add new facilities. To help cope with this, Greenstone has a “plug-in information” utility program that lists the options available for each plug-in, and the librarian interface automatically invokes this to determine what options to show. This allows the interactive user interface to automatically keep pace with developments in the software.+Plug-ins and classifiers have many different arguments or “options” that the user can supply. The dialog box in Figure <imgref figure_configuring_arguments_to_a_plug-in> shows the user specifying arguments to some of the plug-ins. The grayed-out fields become active when the user adds the option by clicking the tick-box beside it. Because Greenstone is a continually growing open-source system, the number of options tends to increase as developers add new facilities. To help cope with this, Greenstone has a “plug-in information” utility program that lists the options available for each plug-in, and the librarian interface automatically invokes this to determine what options to show. This allows the interactive user interface to automatically keep pace with developments in the software.
  
 <imgcaption figure_getting_ready_to_create_new_collection|%!-- id:236 --%Getting ready to create new collection ></imgcaption> <imgcaption figure_getting_ready_to_create_new_collection|%!-- id:236 --%Getting ready to create new collection ></imgcaption>
Line 158: Line 161:
 {{..:images:user_fig_15.png?407x291&direct}} {{..:images:user_fig_15.png?407x291&direct}}
  
-==== <!-- id:238 -->Building the collection ====+==== Building the collection ====
  
-<!-- id:239 -->The //Create// panel (Figure <imgref figure_getting_ready_to_create_new_collection>) is used to construct a collection based on the documents and assigned metadata. The brunt of this work is borne by the Greenstone code itself. The user controls this external process through a series of separate interaction screens, each dealing with the arguments provided to a certain stage of the creation process.+The //Create// panel (Figure <imgref figure_getting_ready_to_create_new_collection>) is used to construct a collection based on the documents and assigned metadata. The brunt of this work is borne by the Greenstone code itself. The user controls this external process through a series of separate interaction screens, each dealing with the arguments provided to a certain stage of the creation process.
  
-<!-- id:240 -->The user observes the building process through a window that shows not only the text output generated by Greenstone's importing and index-building scripts, but also progress bars that indicate the overall degree of completion of each script.+The user observes the building process through a window that shows not only the text output generated by Greenstone's importing and index-building scripts, but also progress bars that indicate the overall degree of completion of each script.
  
-<!-- id:241 -->Figure <imgref figure_getting_ready_to_create_new_collection> shows the //Create// view. At the top are shown some options that can be applied during the creation process. The user selects appropriate values for the options. This figure illustrates a popup “tool tip” that is available throughout the interface to explain the function of each argument.+Figure <imgref figure_getting_ready_to_create_new_collection> shows the //Create// view. At the top are shown some options that can be applied during the creation process. The user selects appropriate values for the options. This figure illustrates a popup “tool tip” that is available throughout the interface to explain the function of each argument.
  
-<!-- id:242 -->When satisfied with the arguments, the user clicks //Build Collection//. Greenstone continually prints text that indicates progress, and this is shown along with a more informative progress bar.+When satisfied with the arguments, the user clicks //Build Collection//. Greenstone continually prints text that indicates progress, and this is shown along with a more informative progress bar.
  
-==== <!-- id:243 -->Previewing ====+==== Previewing ====
  
-<!-- id:244 -->The //Preview Collection// button (Figure <imgref figure_getting_ready_to_create_new_collection>) is used to view the collection that has been built. Clicking this button launches a web browser showing the home page of the collection (Figure <imgref figure_previewing_the_newly_built_collection>). In practice, previewing often shows up deficiencies in the collection design, or in the individual metadata values, and the user frequently returns to earlier stages to correct these. This button becomes active once the collection has been created. The newly created collection will also have been installed on your Greenstone home page as one of the regular collections.+The //Preview Collection// button (Figure <imgref figure_getting_ready_to_create_new_collection>) is used to view the collection that has been built. Clicking this button launches a web browser showing the home page of the collection (Figure <imgref figure_previewing_the_newly_built_collection>). In practice, previewing often shows up deficiencies in the collection design, or in the individual metadata values, and the user frequently returns to earlier stages to correct these. This button becomes active once the collection has been created. The newly created collection will also have been installed on your Greenstone home page as one of the regular collections.
  
-==== <!-- id:245 -->Help ====+==== Help ====
  
-<!-- id:246 -->On-line help is always available, and is invoked using the //Help// item at the right of the main menu bar at the top of each of the Figures. This opens up a hierarchically structured file of help text, and account is taken of the user's current context to highlight the section that is appropriate to the present stage of the interaction. Furthermore, as noted above, whenever the mouse is held still over any interactive object a small window pops up to give a textual “tool tip,” as illustrated near the bottom of Figure <imgref figure_getting_ready_to_create_new_collection>.+On-line help is always available, and is invoked using the //Help// item at the right of the main menu bar at the top of each of the Figures. This opens up a hierarchically structured file of help text, and account is taken of the user's current context to highlight the section that is appropriate to the present stage of the interaction. Furthermore, as noted above, whenever the mouse is held still over any interactive object a small window pops up to give a textual “tool tip,” as illustrated near the bottom of Figure <imgref figure_getting_ready_to_create_new_collection>.
  
-===== <!-- id:247 -->Librarian Interface user guide =====+===== Librarian Interface user guide =====
  
 &chap_gli; &chap_gli;
-===== <!-- id:453 -->Tagging document files =====+===== Tagging document files =====
  
-<!-- id:454 -->Source documents often need to be structured into sections and subsections, and this information needs to be communicated to Greenstone so that it can preserve the hierarchical structure. Also, metadata - typically the title - might be associated with each section and subsection.+Source documents often need to be structured into sections and subsections, and this information needs to be communicated to Greenstone so that it can preserve the hierarchical structure. Also, metadata - typically the title - might be associated with each section and subsection.
  
-<!-- id:455 -->The source documents from an OCR process are typically a set of word processor files, including images. If these are represented as Microsoft Word files, they can be input into Greenstone using the Word plugin. Alternatively, they can be converted to HTML and input using the HTML plugin.+The source documents from an OCR process are typically a set of word processor files, including images. If these are represented as Microsoft Word files, they can be input into Greenstone using the Word plugin. Alternatively, they can be converted to HTML and input using the HTML plugin.
  
-<!-- id:456 -->In either case, the hierarchical structure of a document may be indicated by inserting tags in the text as follows:+In either case, the hierarchical structure of a document may be indicated by inserting tags in the text as follows:
  
 <code> <code>
Line 191: Line 194:
 <Section> <Section>
 <Description> <Description>
-<!-- id:457 --><Metadata name="Title">Realizing human rights for poor people: Strategies for achieving the international development targets</Metadata>+<Metadata name="Title">Realizing human rights for poor people: Strategies for achieving the international development targets</Metadata>
 </Description> </Description>
 --> -->
 </code> </code>
  
-<!-- id:458 -->//(text of section goes here)//+//(text of section goes here)//
  
 <code> <code>
Line 204: Line 207:
 </code> </code>
  
-<!-- id:459 -->The %!-- ... --% markers are used because they indicate comments in HTML; thus these section tags will not affect document formatting. You must include these markers around your section tags, even if the document you are working with is not HTML (e.g. if it's a Microsoft Word file).+The %!-- ... --% markers are used because they indicate comments in HTML; thus these section tags will not affect document formatting. You must include these markers around your section tags, even if the document you are working with is not HTML (e.g. if it's a Microsoft Word file).
  
-<!-- id:460 -->In the Description part (between the <Description> and </Description> tags) other kinds of metadata can be specified, but this is not done for the style of collections we are describing here.+In the Description part (between the <Description> and </Description> tags) other kinds of metadata can be specified, but this is not done for the style of collections we are describing here.
  
-<!-- id:461 -->It is important to remember that you are creating a hierarchical table of contents when you insert section tags into your document. This means that sections can be nested within other sections. In fact, all sections must be nested within a single enclosing section that encompasses the entire document.+It is important to remember that you are creating a hierarchical table of contents when you insert section tags into your document. This means that sections can be nested within other sections. In fact, all sections must be nested within a single enclosing section that encompasses the entire document.
  
-<!-- id:462 -->The following example demonstrates a document with two chapters, the second of which contains two subsections. For real examples of source documents tagged in this way, look at the source documents for the Demo or DLS collections.+The following example demonstrates a document with two chapters, the second of which contains two subsections. For real examples of source documents tagged in this way, look at the source documents for the Demo or DLS collections.
  
 <code> <code>
Line 223: Line 226:
 </Description> </Description>
 --> -->
-<!-- id:463 -->(text of chapter 1 goes here)+(text of chapter 1 goes here)
 <!-- <!--
 </Section> </Section>
Line 235: Line 238:
 </Description> </Description>
 --> -->
-<!-- id:464 -->(text of sub-section 1 goes here)+(text of sub-section 1 goes here)
 <!-- <!--
 </Section> </Section>
Line 243: Line 246:
 </Description> </Description>
 --> -->
-<!-- id:465 -->(text of sub-section 2 goes here)+(text of sub-section 2 goes here)
 <!-- <!--
 </Section> </Section>
Line 251: Line 254:
 </code> </code>
  
-<!-- id:466 -->Note that metadata assigned from within a section tag in a source document takes precedence over that assigned to the document as a whole. This means that you should not explicitly specify Title metadata for the top-level section within a source document unless you want it to override the title you gave it when specifying metadata. In the above example, unless you want to override the document's existing title you should omit the line that reads:+Note that metadata assigned from within a section tag in a source document takes precedence over that assigned to the document as a whole. This means that you should not explicitly specify Title metadata for the top-level section within a source document unless you want it to override the title you gave it when specifying metadata. In the above example, unless you want to override the document's existing title you should omit the line that reads:
  
 <code> <code>
Line 257: Line 260:
 </code> </code>
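
The tagging convention described in this section can also be produced programmatically. The following is a minimal Python sketch, not part of Greenstone: the function names ''open_section'', ''close_section'' and ''render'' and the tuple layout are hypothetical, chosen only for illustration. It wraps every section tag in HTML comment markers, nests all sections inside one enclosing section, and omits the Title line when no title is given (as recommended above for the top-level section). It does not escape special characters in titles.

```python
def open_section(title=None):
    """Return the comment-wrapped opening tags for one section.

    When title is None, no Description block is written, so any title
    assigned to the document as a whole is not overridden.
    """
    lines = ["<!--", "<Section>"]
    if title is not None:
        lines += [
            "<Description>",
            f'<Metadata name="Title">{title}</Metadata>',
            "</Description>",
        ]
    lines.append("-->")
    return "\n".join(lines)


def close_section():
    """Return the comment-wrapped closing tag for one section."""
    return "<!--\n</Section>\n-->"


def render(section):
    """Render a (title, text, subsections) tuple as tagged text."""
    title, text, children = section
    parts = [open_section(title)]
    if text:
        parts.append(text)
    parts.extend(render(child) for child in children)
    parts.append(close_section())
    return "\n".join(parts)


# A document with two chapters, the second holding two sub-sections.
document = (None, "", [
    ("Chapter 1", "(text of chapter 1 goes here)", []),
    ("Chapter 2", "", [
        ("Sub-section 1", "(text of sub-section 1 goes here)", []),
        ("Sub-section 2", "(text of sub-section 2 goes here)", []),
    ]),
])
tagged = render(document)
```

Printing ''tagged'' gives text that can be pasted into an HTML source file. For a Word document the same tags would have to be typed into the document body.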
  
-===== <!-- id:467 -->The Collector =====+===== The Collector =====
  
-<!-- id:468 -->The Collector is a facility that helps you create new collections, modify or add to existing ones, or delete collections. To do this you will be guided through a sequence of web pages which request the information that is needed. The sequence is self-explanatory: this section takes you through it. As an alternative to using the Collector, you can also build collections from the command line—the first few pages of the Developer's Guide give a detailed walk-through of how to do this. The Collector predates the librarian interface described in Section [[#the_librarian_interface|the_librarian_interface]], and for most practical purposes the librarian interface should be used instead of the Collector.+The Collector is a facility that helps you create new collections, modify or add to existing ones, or delete collections. To do this you will be guided through a sequence of web pages which request the information that is needed. The sequence is self-explanatory: this section takes you through it. As an alternative to using the Collector, you can also build collections from the command line—the first few pages of the Developer's Guide give a detailed walk-through of how to do this. The Collector predates the librarian interface described in Section [[#the_librarian_interface|the_librarian_interface]], and for most practical purposes the librarian interface should be used instead of the Collector.
  
-<!-- id:469 -->Building and distributing information collections carries responsibilities that you should reflect on before you begin. There are legal issues of copyright: being able to access documents doesn't mean you can necessarily give them to others. There are social issues: collections should respect the customs of the community out of which the documents arise. And there are ethical issues: some things simply should not be made available to others. The pen is mightier than the sword!—be sensitive to the power of information and use it wisely.+Building and distributing information collections carries responsibilities that you should reflect on before you begin. There are legal issues of copyright: being able to access documents doesn't mean you can necessarily give them to others. There are social issues: collections should respect the customs of the community out of which the documents arise. And there are ethical issues: some things simply should not be made available to others. The pen is mightier than the sword!—be sensitive to the power of information and use it wisely.
  
-<!-- id:470 -->To access the Collector, click the appropriate link on the digital library home page.+To access the Collector, click the appropriate link on the digital library home page.
  
-<!-- id:471 -->In Greenstone, the structure of a particular collection is determined when the collection is set up. This includes such things as the format of the source documents, how they should be displayed on the screen, the source of metadata, what browsing facilities should be provided, what full-text search indexes should be provided, and how the search results should be displayed. Once the collection is in place, it is easy to add new documents to it—so long as they have the same format as the existing documents, and the same type of metadata is provided, in exactly the same way.+In Greenstone, the structure of a particular collection is determined when the collection is set up. This includes such things as the format of the source documents, how they should be displayed on the screen, the source of metadata, what browsing facilities should be provided, what full-text search indexes should be provided, and how the search results should be displayed. Once the collection is in place, it is easy to add new documents to it—so long as they have the same format as the existing documents, and the same type of metadata is provided, in exactly the same way.
  
-<!-- id:472 -->The Collector has the following basic functions:+The Collector has the following basic functions:
  
-  - <!-- id:473 -->create a new collection with the same structure as an existing one; +  - create a new collection with the same structure as an existing one; 
-  - <!-- id:474 -->create a new collection with a different structure from existing ones; +  - create a new collection with a different structure from existing ones; 
-  - <!-- id:475 -->add new material to an existing collection; +  - add new material to an existing collection; 
-  - <!-- id:476 -->modify the structure of an existing collection; +  - modify the structure of an existing collection; 
-  - <!-- id:477 -->delete a collection; and +  - delete a collection; and 
-  - <!-- id:478 -->write an existing collection to a self-contained, self-installing CD-ROM.+  - write an existing collection to a self-contained, self-installing CD-ROM.
  
-<!-- id:479 -->Figure <imgref figure_using_the_collector_to_build_a_new_collection> shows the Collector being used to create a new collection, in this case from a set of HTML files stored locally. You must first decide whether to work with an existing collection or build a new one. The former case covers options 3 to 6 above; the latter covers options 1 and 2. In Figure <imgref figure_using_the_collector_to_build_a_new_collection>, the user opts to create a new collection.+Figure <imgref figure_using_the_collector_to_build_a_new_collection> shows the Collector being used to create a new collection, in this case from a set of HTML files stored locally. You must first decide whether to work with an existing collection or build a new one. The former case covers options 3 to 6 above; the latter covers options 1 and 2. In Figure <imgref figure_using_the_collector_to_build_a_new_collection>, the user opts to create a new collection.
  
 <imgcaption figure_using_the_collector_to_build_a_new_collection|%!-- id:481 --%(a) %!-- id:480 --%Using the Collector to build a new collection (continued on next pages) ></imgcaption> <imgcaption figure_using_the_collector_to_build_a_new_collection|%!-- id:481 --%(a) %!-- id:480 --%Using the Collector to build a new collection (continued on next pages) ></imgcaption>
 {{..:images:user_fig_16a.png?369x440&direct}} {{..:images:user_fig_16a.png?369x440&direct}}
  
-==== <!-- id:482 -->Logging in ====+==== Logging in ====
  
-<!-- id:483 -->Either way it is necessary to log in before proceeding. Note that in general, people use their web browser to access the collection-building facility on a remote computer, and build the collection on that server. Of course, we cannot allow arbitrary people to build collections (for reasons of propriety if nothing else), so Greenstone contains a security system which forces people who want to build collections to log in first. This allows a central system to offer a service to those wishing to build information collections and use that server to make them available to others. Alternatively, if you are running Greenstone on your own computer you can build collections locally, but it is still necessary to log in because other people who use the Greenstone system on your computer should not be allowed to build collections without prior permission.+Either way it is necessary to log in before proceeding. Note that in general, people use their web browser to access the collection-building facility on a remote computer, and build the collection on that server. Of course, we cannot allow arbitrary people to build collections (for reasons of propriety if nothing else), so Greenstone contains a security system which forces people who want to build collections to log in first. This allows a central system to offer a service to those wishing to build information collections and use that server to make them available to others. Alternatively, if you are running Greenstone on your own computer you can build collections locally, but it is still necessary to log in because other people who use the Greenstone system on your computer should not be allowed to build collections without prior permission.
  
-==== <!-- id:484 -->Dialog structure ====+==== Dialog structure ====
  
 <imgcaption figure_using_the_collector_to_build_a_new_collection_1|%!-- id:486 --%(b) %!-- id:485 --%Using the Collector to build a new collection (Continued) ></imgcaption> <imgcaption figure_using_the_collector_to_build_a_new_collection_1|%!-- id:486 --%(b) %!-- id:485 --%Using the Collector to build a new collection (Continued) ></imgcaption>
 {{..:images:user_fig_16b.png?369x435&direct}} {{..:images:user_fig_16b.png?369x435&direct}}
  
-<!-- id:487 -->Upon completion of login, the page in Figure <imgref figure_using_the_collector_to_build_a_new_collection_1> appears. This shows the sequence of steps that are involved in collection building. They are:+Upon completion of login, the page in Figure <imgref figure_using_the_collector_to_build_a_new_collection_1> appears. This shows the sequence of steps that are involved in collection building. They are:
  
-  - <!-- id:488 -->Collection information +  - Collection information 
-  - <!-- id:489 -->Source data +  - Source data 
-  - <!-- id:490 -->Configuring the collection +  - Configuring the collection 
-  - <!-- id:491 -->Building the collection +  - Building the collection 
-  - <!-- id:492 -->Viewing the collection.+  - Viewing the collection.
  
-<!-- id:493 -->The first step is to specify the collection's name and associated information. The second is to say where the source data is to come from. The third is to adjust the configuration options, a step that becomes more useful as you gain experience with Greenstone. The fourth step is where all the (computer's) work is done. During the “building” process the system makes all the indexes and gathers together any other information that is required to make the collection operate. The fifth step is to view the collection that has been created.+The first step is to specify the collection's name and associated information. The second is to say where the source data is to come from. The third is to adjust the configuration options, a step that becomes more useful as you gain experience with Greenstone. The fourth step is where all the (computer's) work is done. During the “building” process the system makes all the indexes and gathers together any other information that is required to make the collection operate. The fifth step is to view the collection that has been created.
  
-<!-- id:494 -->These five steps are displayed as a linear sequence of gray buttons at the bottom of the screen in Figure <imgref figure_using_the_collector_to_build_a_new_collection_1>, and at the bottom of all other pages generated by the Collector. This display helps users keep track of where they are in the process. The button that should be clicked to continue the sequence is shown in green (//collection information// in Figure <imgref figure_using_the_collector_to_build_a_new_collection_1>). The gray buttons (all the others, in Figure <imgref figure_using_the_collector_to_build_a_new_collection_1>) are inactive. The buttons change to yellow as you proceed through the sequence, and the user can return to an earlier step by clicking the corresponding yellow button in the diagram. This display is modeled after the “wizards” that are widely used in commercial software to guide users through the steps involved in installing new software.+These five steps are displayed as a linear sequence of gray buttons at the bottom of the screen in Figure <imgref figure_using_the_collector_to_build_a_new_collection_1>, and at the bottom of all other pages generated by the Collector. This display helps users keep track of where they are in the process. The button that should be clicked to continue the sequence is shown in green (//collection information// in Figure <imgref figure_using_the_collector_to_build_a_new_collection_1>). The gray buttons (all the others, in Figure <imgref figure_using_the_collector_to_build_a_new_collection_1>) are inactive. The buttons change to yellow as you proceed through the sequence, and the user can return to an earlier step by clicking the corresponding yellow button in the diagram. This display is modeled after the “wizards” that are widely used in commercial software to guide users through the steps involved in installing new software.
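
The button colouring described above can be summarised as a small state function. This is an illustrative Python sketch only, not the Collector's actual code: the name ''button_colors'' is hypothetical, and it assumes that every step before the one to be clicked next has already been visited and is therefore shown in yellow.

```python
STEPS = [
    "collection information",
    "source data",
    "configuring the collection",
    "building the collection",
    "viewing the collection",
]


def button_colors(next_step):
    """Map each step name to a colour.

    next_step is the index in STEPS of the button to click next.
    Earlier steps are yellow (clickable, to go back), the next step
    is green, and all later steps are gray (inactive).
    """
    colors = {}
    for index, name in enumerate(STEPS):
        if index < next_step:
            colors[name] = "yellow"
        elif index == next_step:
            colors[name] = "green"
        else:
            colors[name] = "gray"
    return colors


# State immediately after logging in: only the first button is active.
after_login = button_colors(0)
```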
  
-==== <!-- id:495 -->Collection information ====+==== Collection information ====
  
 <imgcaption figure_using_the_collector_to_build_a_new_collection_2|%!-- id:497 --%(c) %!-- id:496 --%Using the Collector to build a new collection (Continued) ></imgcaption> <imgcaption figure_using_the_collector_to_build_a_new_collection_2|%!-- id:497 --%(c) %!-- id:496 --%Using the Collector to build a new collection (Continued) ></imgcaption>
 {{..:images:user_fig_16c.png?369x504&direct}} {{..:images:user_fig_16c.png?369x504&direct}}
  
-<!-- id:498 -->The next step in the sequence, collection information, is shown in Figure <imgref figure_using_the_collector_to_build_a_new_collection_2>. When creating a new collection, it is necessary to enter some information about it:+The next step in the sequence, collection information, is shown in Figure <imgref figure_using_the_collector_to_build_a_new_collection_2>. When creating a new collection, it is necessary to enter some information about it:
  
-  * <!-- id:499 -->title, +  * title, 
-  * <!-- id:500 -->contact E-mail address, and +  * contact E-mail address, and 
-  * <!-- id:501 -->brief description.+  * brief description.
  
-<!-- id:502 -->The collection title is a short phrase used throughout the digital library to identify the content of the collection. Example titles include //Food and Nutrition Library//, //World Environmental Library//, //Development Library//, and so on. The E-mail address specifies the first point of contact for any problems encountered with the collection. If the Greenstone software detects a problem, a diagnostic report may be sent to this address. Finally, the brief description is a statement describing the principles that govern what is included in the collection. It appears under the heading //About this collection// on the first page when the collection is presented.+The collection title is a short phrase used throughout the digital library to identify the content of the collection. Example titles include //Food and Nutrition Library//, //World Environmental Library//, //Development Library//, and so on. The E-mail address specifies the first point of contact for any problems encountered with the collection. If the Greenstone software detects a problem, a diagnostic report may be sent to this address. Finally, the brief description is a statement describing the principles that govern what is included in the collection. It appears under the heading //About this collection// on the first page when the collection is presented.
  
-<!-- id:503 -->The user's current position in the collection-building sequence is indicated by an arrow that appears in the display at the bottom of each screen—in this case, as Figure <imgref figure_using_the_collector_to_build_a_new_collection_2> shows, the collection information stage. The user proceeds to Figure <imgref figure_using_the_collector_to_build_a_new_collection_3> by clicking the green source data button.+The user's current position in the collection-building sequence is indicated by an arrow that appears in the display at the bottom of each screen—in this case, as Figure <imgref figure_using_the_collector_to_build_a_new_collection_2> shows, the collection information stage. The user proceeds to Figure <imgref figure_using_the_collector_to_build_a_new_collection_3> by clicking the green source data button.
  
-==== <!-- id:504 -->Source data ====+==== Source data ====
  
 <imgcaption figure_using_the_collector_to_build_a_new_collection_3|%!-- id:506 --%(d) %!-- id:505 --%Using the Collector to build a new collection (Continued) ></imgcaption> <imgcaption figure_using_the_collector_to_build_a_new_collection_3|%!-- id:506 --%(d) %!-- id:505 --%Using the Collector to build a new collection (Continued) ></imgcaption>
 {{..:images:user_fig_16d.png?368x532&direct}} {{..:images:user_fig_16d.png?368x532&direct}}
  
Figure <imgref figure_using_the_collector_to_build_a_new_collection_3> is the point where the user specifies the source text that comprises the collection. You may either base your collection on a default structure that is provided, or on the structure of an existing collection.

If you opt for the default structure, the new collection may contain html documents (files ending in //.htm, .html//), plain text documents (files ending in //.txt, .text//), Microsoft Word documents (files ending in //.doc//), PDF documents (files ending in //.pdf//), or E-mail documents (files ending in //.email//). More information about the different document formats that can be accommodated is given in the section on “Document formats” below.
  
If you base your new collection on an existing one, the files in the new collection must be exactly the same type as those used to build the existing one. Note that some collections use non-standard input file formats, while others use metadata specified in auxiliary files. If your new input lacks this information, some browsing facilities may not work properly. For example, if you clone the Demo collection you may find that the //subjects//, //organization//, and //how to// buttons don't work.

Boxes are provided to indicate where the source documents are located: up to three separate input sources can be specified in Figure <imgref figure_using_the_collector_to_build_a_new_collection_3>. If you need more, just click the button marked “more sources.”
  
There are three kinds of specification:

  * a directory name on the Greenstone server system (beginning with “file:%%//%%”)
  * an address beginning with “http:%%//%%” for files to be downloaded from the web
  * an address beginning with “ftp:%%//%%” for files to be downloaded using anonymous FTP.
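The three kinds of specification can be told apart by their prefix. The following is a minimal Python sketch of that idea, not the Collector's own code; the function name ''classify_source'' and the descriptive labels are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlparse

# Hypothetical labels for the three kinds of source; these are not Greenstone identifiers.
SOURCE_KINDS = {
    "file": "directory or file on the Greenstone server",
    "http": "download from the web",
    "ftp": "download using anonymous FTP",
}


def classify_source(spec: str) -> str:
    """Return a description of the kind of source, or raise ValueError."""
    scheme = urlparse(spec).scheme.lower()
    if scheme not in SOURCE_KINDS:
        raise ValueError(f"unsupported source specification: {spec!r}")
    return SOURCE_KINDS[scheme]
```

A specification with any other prefix is rejected in this sketch; how the Collector itself treats such input is not described here.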
  
If you use //file:%%//%%// or //ftp:%%//%%// to specify a file, that file will be downloaded.

If you use //http:%%//%%//, the result depends on whether the URL leads to a normal web page in your browser or to a list of files. If it leads to a page, that page will be downloaded, along with all pages it links to, all pages they link to, and so on, provided they reside on the same site, below the URL.
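The condition “on the same site, below the URL” can be read as a simple test on each link. The Python sketch below shows one possible interpretation, assuming that “below” means inside the directory of the starting URL; the exact rule applied by the mirroring package is not stated here, and the function name ''in_scope'' is hypothetical.

```python
from urllib.parse import urljoin, urlparse


def in_scope(start_url: str, link: str) -> bool:
    """True if `link` is on the same site as `start_url` and below its directory."""
    start = urlparse(start_url)
    # Resolve relative links against the starting page before comparing.
    target = urlparse(urljoin(start_url, link))
    # Directory of the starting URL: everything up to and including the last slash.
    base_dir = start.path.rsplit("/", 1)[0] + "/"
    return target.netloc == start.netloc and target.path.startswith(base_dir)
```

With a starting URL of ''%%http://www.example.org/docs/index.html%%'', a link to ''guide/intro.html'' is in scope under this reading, while a link to ''/other/page.html'' or to a different host is not.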
  
If you use //file:%%//%%// or //ftp:%%//%%// to specify a folder or directory, or give a //http:%%//%%// URL that leads to a list of files, everything in the folder and all its subfolders will be included in the collection.

You can specify sources of more than one type.

In this case (Figure <imgref figure_using_the_collector_to_build_a_new_collection_3>) the new collection will contain documents taken from a local file system as well as a remote web site, which will be mirrored during the building process.

When you click the //configure collection// button to proceed to the next stage of building, the Collector checks that all the sources of input you specified can be reached. This might take a few seconds, or even a few minutes if you have specified several sources. If one or more of the input sources you specified is unavailable, you will be presented with a page like that in Figure <imgref figure_using_the_collector_to_build_a_new_collection_4>, where the unavailable sources are marked (both of them in this case).
  
<imgcaption figure_using_the_collector_to_build_a_new_collection_4|(e) Using the Collector to build a new collection (Continued) ></imgcaption>
{{..:images:user_fig_16e.png?368x531&direct}}
  
Sources might be unavailable because:

  * the file, FTP site or URL does not exist;
  * you need to dial up your ISP first;
  * you are trying to access a URL from behind a firewall.

The last case is potentially the most mysterious. It occurs if you normally have to present a username and password to access the Internet. Sometimes you can see the page in your Web browser when you enter the URL, but the Collector claims that it is unavailable. The explanation is that the page in your browser may be coming from a locally cached copy. Unfortunately, locally cached copies are invisible to the Collector. In this case we recommend that you download the pages using your browser first.
  
==== Configuring the collection ====

<imgcaption figure_using_the_collector_to_build_a_new_collection_5|(f) Using the Collector to build a new collection (Continued) ></imgcaption>
{{..:images:user_fig_16f.png?369x467&direct}}
  
Figure <imgref figure_using_the_collector_to_build_a_new_collection_5> shows the next stage. The construction and presentation of all collections is controlled by specifications in a special collection configuration file (see below). Advanced users may use this page to alter the configuration settings. Most, however, will proceed directly to the final stage. Indeed, in Figure <imgref figure_using_the_collector_to_build_a_new_collection_3> both the //configure collection// and the //build collection// buttons are displayed in green, signifying that step 3 can be bypassed completely.

In our example the user has made a small modification to the default configuration file by including the //file_is_url// flag with the html plugin. This flag causes URL metadata to be inserted in each document, based on the filename convention that is adopted by the mirroring package. This metadata is used in the collection to allow readers to refer to the original source material, rather than to a local copy.
  
==== Building the collection ====

<imgcaption figure_using_the_collector_to_build_a_new_collection_6|(g) Using the Collector to build a new collection (Continued) ></imgcaption>
{{..:images:user_fig_16g.png?369x304&direct}}
  
Figure <imgref figure_using_the_collector_to_build_a_new_collection_6> shows the “building” stage. Up until now, the responses to the dialog have merely been recorded in a temporary file. The building stage is where the action takes place.

During building, indexes for both browsing and searching are constructed according to instructions in the collection configuration file. The building process takes some time: minutes to hours, depending on the size of the collection and the speed of your computer. Some very large collections take a day or more to build.

When you reach this stage in the interaction, a status line at the bottom of the web page gives feedback on how the operation is progressing, updated every five seconds. The message visible in Figure <imgref figure_using_the_collector_to_build_a_new_collection_5> indicates that when the snapshot was taken, Title metadata was being extracted from an input file.

Warnings are written if requested input files or URLs do not exist, if they exist but no plugin can process them, or if a plugin cannot find an associated file, such as an image file embedded in an html document. The intention is that you will monitor progress by keeping this window open in your browser. If any errors cause the process to terminate, they are recorded in this status area.

You can stop the building process at any time by clicking on the //stop building// button in Figure <imgref figure_using_the_collector_to_build_a_new_collection_6>. If you leave the web page (and have not cancelled the building process with the //stop building// button), the building operation will continue, and the new collection will be installed when the operation completes.
  
==== Viewing the collection ====

When the collection is built and installed, the sequence of buttons visible at the bottom of Figures <imgref figure_using_the_collector_to_build_a_new_collection_1> to <imgref figure_using_the_collector_to_build_a_new_collection_5> appears at the bottom of Figure <imgref figure_using_the_collector_to_build_a_new_collection_6>, with the //View collection// button active. This takes the user directly to the newly built collection.

Finally, there is a facility for E-mail to be sent to the collection's contact E-mail address, and to the system's administrator, whenever a collection is created (or modified). This allows those responsible to check when changes occur, and monitor what is happening on the system. The facility is disabled by default but can be enabled by editing the //main.cfg// configuration file (see the //Greenstone Digital Library Developer's Guide//, Section [[?do=search&id=configuring_your_greenstone_site @en:manuals:Develop|configuring_your_greenstone_site]]).
  
==== Working with existing collections ====

When you enter the Collector you have to specify whether you want to create an entirely new collection or work with an existing one, adding data to it or deleting it. By creating all searching and browsing structures automatically from the documents themselves, Greenstone makes it easy to add new information to existing collections. Because no links are inserted by hand, when new documents in the same format become available they can be merged into the collection automatically.

To work with an existing collection, you first select the collection from a list that is provided. Some collections are “write protected” and cannot be altered: these don't appear in the selection list. With the selected collection, you can:

  * Add more data and rebuild the collection
  * Edit the collection configuration file
  * Delete the collection entirely
  * Export the collection to CD-ROM.
  
=== Add new data ===

The files that you specify will be added to the collection. Make sure that you do not re-specify files that are already in the collection; otherwise two copies will be included. Files are identified by their full pathname, web pages by their absolute web address. You specify directories and files just as you do when building a new collection.
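Because files are identified by their full pathname, a list of files to add can be checked against those already in the collection before it is submitted. The Python sketch below shows one way to do this outside Greenstone; it is an assumption-laden illustration rather than a Greenstone facility, the function name ''new_sources'' is hypothetical, and it covers local files only, not web addresses.

```python
import os


def new_sources(existing: set[str], candidates: list[str]) -> list[str]:
    """Return the candidates not already in the collection, in their original order."""
    # Compare by full pathname, as the text above describes.
    seen = {os.path.abspath(p) for p in existing}
    fresh = []
    for path in candidates:
        full = os.path.abspath(path)
        if full not in seen:
            seen.add(full)  # also guards against repeats within the candidate list
            fresh.append(full)
    return fresh
```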
  
If you add data to a collection and for some reason the building process fails, the old version of the collection remains unchanged.

=== Edit configuration file ===

Advanced users can edit the collection configuration file, just as they can when a new collection is built.

=== Delete the collection ===

You will be asked to confirm whether you really want to delete the collection. Once the collection is deleted, Greenstone cannot bring it back!
  
=== Export the collection ===

You can export the collection in a form that allows it to be written to a self-contained, self-installing Greenstone CD-ROM for Windows. Because commercial software that creates self-installing CD-ROMs is expensive, this facility includes a homegrown installer module.

When you export the collection, the dialogue informs you of the directory name in which the result has been placed. The entire contents of the directory should be written on to CD-ROM using a standard CD-writing utility.

The immense variety of possible Windows configurations has made it difficult for us to test and debug the Greenstone installer under all possible conditions. Although the installer produces CD-ROMs that operate on most Windows systems, it is still under development. If you experience problems and you possess a commercial installation package (e.g. InstallShield), you can use it to create CD-ROMs from the information that Greenstone provides. The above-mentioned export directory contains four files that relate to the installation process, and three subdirectories that contain the complete collection and software. Remove the four files and use InstallShield to make a CD-ROM image that installs these directories and creates a shortcut to the program //gsdl\server.exe//.
  
==== Document formats ====

When building collections, Greenstone processes each different format of source document by seeking a “plugin” that can deal with that particular format. Plugins are specified in the collection configuration file. Greenstone generally uses the filename to determine document formats: for example, //foo.txt// is processed as a text file, //foo.html// as html, and //foo.doc// as a Word file.

Here is a summary of the plugins that are available for widely-used document formats. More detail about these plugins, and additional plugins for less commonly-used formats, can be found in the //Greenstone Digital Library Developer's Guide//.
  
=== TEXTPlug (*.txt, *.text) ===

TEXTPlug interprets a plain text file as a simple document. It adds //title// metadata based on the first line of the file.
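The idea of taking the title from the first line can be illustrated with a short Python sketch. This is not TEXTPlug's code; the function name ''title_from_text'' is hypothetical, and skipping leading blank lines is an assumption made here for illustration.

```python
def title_from_text(text: str) -> str:
    """Use the first non-empty line as the title.

    Assumption: leading blank lines are skipped; the plugin's actual
    behaviour in that case is not described in this manual.
    """
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""
```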
  
=== HTMLPlug (*.htm, *.html; also .shtml, .shm, .asp, .php, .cgi) ===

HTMLPlug processes html files. It extracts //title// metadata based on the <title> tag; other metadata expressed using html's metatag syntax can be extracted too. There are many options available with this plugin, documented in the //Greenstone Digital Library Developer's Guide//.
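To make the two sources of metadata concrete, the following Python sketch collects the text of the title tag and any name/content pairs from meta tags, using the standard ''html.parser'' module. It only illustrates the general technique and is not how HTMLPlug is implemented; the names ''MetadataParser'' and ''extract_metadata'' and the ''Title'' key are hypothetical.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.metadata[attrs["name"]] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            # Title text may arrive in several chunks, so accumulate it.
            self.metadata["Title"] = self.metadata.get("Title", "") + data


def extract_metadata(html: str) -> dict:
    parser = MetadataParser()
    parser.feed(html)
    parser.close()
    if "Title" in parser.metadata:
        parser.metadata["Title"] = parser.metadata["Title"].strip()
    return parser.metadata
```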
  
=== WORDPlug (*.doc) ===

WORDPlug imports Microsoft Word documents. There are many different variants of the Word format, and even Microsoft programs frequently make conversion errors. Greenstone uses independent programs to convert Word files to html. For some older Word formats the system resorts to a simple extraction algorithm that finds all text strings in the input file.

=== PDFPlug (*.pdf) ===

PDFPlug imports documents in PDF, Adobe's Portable Document Format. Like WORDPlug, it uses an independent program, in this case //pdftohtml//, to convert PDF files to html.

As with WORDPlug, by default collections will display the html equivalent of the file when the user clicks the //document// icon; however, the format strings in the collection configuration file can be adjusted to give the user access to the original PDF file instead, and we recommend that you do this. To do so, replace the //<link> … </link>// tags with //<srclink> … </srclink>// ones.

The //pdftohtml// program fails on some PDF files. What happens is that the conversion process takes an exceptionally long time, and often an error message relating to the conversion process appears on the screen. If this occurs, the only solution that we can offer is to remove the offending document from the collection. Also, PDFPlug cannot handle encrypted PDF files.
  
=== PSPlug (*.ps) ===

PSPlug imports documents in PostScript. It works best if a standard Linux program, called //ps2ascii//, is already installed on your computer. This is available on most Linux installations, but not on Windows. If this program is not available, PSPlug resorts to a simple text extraction algorithm.

=== EMAILPlug (*.email) ===

EMAILPlug imports files containing E-mail, and deals with common E-mail formats such as are used by the Netscape, Eudora, and Unix mail readers. Each source document is examined to see if it contains an E-mail, or several E-mails joined together in one file, and if so its contents are processed. The plugin extracts //Subject//, //To//, //From//, and //Date// metadata. However, this plugin does not yet handle MIME-encoded E-mails properly: although legible, they often look rather strange.
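The four metadata fields correspond to standard message headers. As a rough illustration of what extracting them involves, the Python sketch below reads them from a single message with the standard ''email'' module. It is not EMAILPlug's implementation, it does not handle several messages joined in one file, and the function name ''email_metadata'' is hypothetical.

```python
from email import message_from_string

# The four fields named in the text above.
FIELDS = ("Subject", "To", "From", "Date")


def email_metadata(raw: str) -> dict:
    """Return the header fields that are present in a single raw message."""
    msg = message_from_string(raw)
    return {name: msg[name] for name in FIELDS if msg[name] is not None}
```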
  
=== ZIPPlug (.gz, .z, .tgz, .taz, .bz, .zip, .tar) ===

ZIPPlug handles the following compressed and/or archived input formats: gzip (//.gz//, //.z//, //.tgz//, //.taz//), bzip (//.bz//), zip (//.zip//, //.jar//), and tar (//.tar//). It relies on the programs //gunzip//, //bunzip//, //unzip//, and //tar//, which are standard Linux utilities. ZIPPlug is disabled on Windows computers.
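The association between filename extensions and plugins described in this section can be summarised as a lookup table. The Python sketch below is a simplification for illustration only: the extensions are those listed above, but the table and the function name ''plugin_for'' are hypothetical, and Greenstone's actual matching of files to plugins is governed by the collection configuration file and may differ.

```python
import os

# Extensions taken from the plugin descriptions in this section.
PLUGIN_BY_EXTENSION = {
    ".txt": "TEXTPlug", ".text": "TEXTPlug",
    ".htm": "HTMLPlug", ".html": "HTMLPlug", ".shtml": "HTMLPlug",
    ".shm": "HTMLPlug", ".asp": "HTMLPlug", ".php": "HTMLPlug", ".cgi": "HTMLPlug",
    ".doc": "WORDPlug",
    ".pdf": "PDFPlug",
    ".ps": "PSPlug",
    ".email": "EMAILPlug",
    ".gz": "ZIPPlug", ".z": "ZIPPlug", ".tgz": "ZIPPlug", ".taz": "ZIPPlug",
    ".bz": "ZIPPlug", ".zip": "ZIPPlug", ".jar": "ZIPPlug", ".tar": "ZIPPlug",
}


def plugin_for(filename: str):
    """Return the plugin name for a filename, or None if no plugin matches."""
    extension = os.path.splitext(filename)[1].lower()
    return PLUGIN_BY_EXTENSION.get(extension)
```

A file with no matching entry returns ''None'' here, which corresponds to the case where a warning is written because no plugin can process the file.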
  
-===== <!-- id:d001 -->The Depositor =====+===== The Depositor =====
  
-<!-- id:d002 -->The Depositor is another means of adding new content to a digital library. Derived from the [[#the_collector|Collector]], it is specifically aimed to mimic the structured submission workflow of a institutional repository. As such there are several requirements for using the Depositor in your collection:+The Depositor is another means of adding new content to a digital library. Derived from the [[#the_collector|Collector]], it is specifically aimed to mimic the structured submission workflow of a institutional repository. As such there are several requirements for using the Depositor in your collection:
  
  * The collection must use Lucene as the indexer
  * You must have already created the collection via the [[#the_librarian_interface|Librarian Interface]] or the command-line scripts
  * You must be serving your collection using Apache or a similar web server (Greenstone's built-in ''local library'' is **not** supported)
  
Lucene is required as the indexer because it supports incremental rebuilding, where only the newly uploaded document is added to the collection and indexed.
  
==== Enable the Depositor ====
  
To enable the Depositor tool, modify ''main.cfg'' (in the ''GSDLHOME/etc'' directory): change <code>depositor disabled</code> to <code>depositor enabled</code>
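If you prefer to script this change rather than edit the file by hand, a minimal Python sketch might look like the following. The function name ''enable_depositor'' is hypothetical, and the sketch assumes the setting sits on a line of its own, as in the snippet above.

```python
import re


def enable_depositor(cfg_text):
    """Return main.cfg text with 'depositor disabled' switched to 'depositor enabled'."""
    # Match the whole line so that unrelated settings are left untouched.
    # Assumption: the setting appears on a line of its own.
    pattern = re.compile(r"^([ \t]*depositor[ \t]+)disabled([ \t]*)$", re.MULTILINE)
    return pattern.sub(r"\1enabled\2", cfg_text)
```

To apply it, read ''GSDLHOME/etc/main.cfg'' into a string, pass it through the function, and write the result back, keeping a backup copy of the original file first. Running it on a file that is already enabled leaves the text unchanged.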
  
Notes:
  * You might need to change the file permissions on the ''GSDLHOME/tmp'', ''GSDLHOME/collect'', and ''GSDLHOME/collect/your_accessable_collection'' directories to allow the web server to write to them
  * You need to be in the ''all-collections-editor'' user group to access the Depositor (or ''colbuilder'' for Greenstone version 2.80 and earlier). Note that the ''admin'' user, created when installing Greenstone, is in this group by default
  * Remember, the Depositor only works with the web server, not the local library server
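The permissions point can be checked with a short script. The Python sketch below reports which of the three directories mentioned above are missing or not writable by the user running the script; to be meaningful it would need to be run as the web server's user account. The function name ''unwritable_dirs'' and its arguments are hypothetical.

```python
import os


def unwritable_dirs(gsdlhome, collection):
    """Return the Depositor-related directories that are missing or not writable."""
    candidates = [
        os.path.join(gsdlhome, "tmp"),
        os.path.join(gsdlhome, "collect"),
        os.path.join(gsdlhome, "collect", collection),
    ]
    # os.access tests the permissions of the user running this script,
    # which is only informative when that user is the web server's account.
    return [d for d in candidates
            if not (os.path.isdir(d) and os.access(d, os.W_OK))]
```

An empty list means all three directories exist and are writable for that user.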
  
==== Use the Depositor ====
  
  - Go to the Greenstone home page and click the "The Depositor" button
  - Sign in to the page
  - Select a collection from the collection list
  - Fill in the metadata fields
  - Click the "Select File" button
  - Select the file you want to deposit, then click the "Confirmation" button
  - Click the "Deposit Item" button and wait for the process to finish
  - Try out the newly built collection
  
Notes:
  * The Depositor uses the Dublin Core (DC) metadata set by default. If the target collection doesn't use DC, the newly added documents may not show up when previewing the collection, for example when the classifiers are built on other metadata sets. You will need to either configure the Depositor to use the same metadata fields as the other documents in your collection, or extend the classifier and format configurations to include Dublin Core metadata fields
  * If you want to upload more than one file at a time, zip them first. Don't forget to include ZIPPlug in your collection's ''collect.cfg''
  * You will see a "Collection built successfully" message, or error messages if something goes wrong
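For the multi-file case mentioned in the notes, the files can be bundled with any zip tool. As one possibility, here is a minimal Python sketch using the standard ''zipfile'' module; the function name ''bundle_for_deposit'' is hypothetical and is not part of Greenstone.

```python
import zipfile
from pathlib import Path


def bundle_for_deposit(paths, archive_path):
    """Write the given files into one zip archive and return the archive's path."""
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in map(Path, paths):
            # Store each file under its base name so the archive has no nested folders.
            archive.write(path, arcname=path.name)
    return archive_path
```

Because each file is stored under its base name only, two input files with the same name in different folders would end up as duplicate entries, so rename such files before bundling.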
  
==== Configure the Depositor ====
  
Modify ''depositor.dm''
  
To make the Depositor deposit the item in the collection without importing or building it, edit ''macros/depositor.dm'' and change <code>_laststep_ {bild}</code> to <code>_laststep_ {depositonly}</code>
  
=== Configure the Metadata Fields ===
  
By default, the Depositor uses three fields (Title, Creator and Description) from the Dublin Core metadata set, but you can customize this in the GLI Format panel (available from Greenstone version 2.81):
  
  - Launch GLI and open the collection you want to customize. Go to the Format panel and click "Depositor Metadata" in the left-hand section; a list of available metadata fields will appear in the right-hand section.
  - Select the fields that you want to be used in the Depositor. A drop-down list will appear next to each selected element, specifying the text input type for that element on the web page: "text" displays a single-line text input, whereas "textarea" displays a multi-line input area. Hovering the mouse over an element displays a tool-tip describing it.
    * It is recommended to select metadata fields that have been used to build classifiers, so that newly added items show up when previewing the collection.
    * At least one metadata element must be selected. If only one element is left selected in the list, attempting to deselect it will fail and a warning message will pop up.
legacy/manuals/en/user/making_greenstone_collections.txt · Last modified: 2023/03/13 01:46 by 127.0.0.1