en:user:the_collector
Differences
This shows you the differences between two versions of the page.
en:user:the_collector [2018/07/31 00:55] – [The Greenstone 2 Collector] kjdon | en:user:the_collector [2023/03/13 01:46] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | |||
+ | |||
+ | |||
====== The Greenstone 2 Collector ====== | ====== The Greenstone 2 Collector ====== | ||
//**The Collector is a deprecated facility for creating collections.**// | //**The Collector is a deprecated facility for creating collections.**// | ||
- | <!-- id:468 -->The Collector is a facility that helps you create new collections, | + | The Collector is a facility that helps you create new collections, |
- | <!-- id:470 -->To access the Collector, click the appropriate link on the digital library home page. | + | To access the Collector, click the appropriate link on the digital library home page. |
- | <!-- id:471 -->In Greenstone, the structure of a particular collection is determined when the collection is set up. This includes such things as the format of the source documents, how they should be displayed on the screen, the source of metadata, what browsing facilities should be provided, what full-text search indexes should be provided, and how the search results should be displayed. Once the collection is in place, it is easy to add new documents to it—so long as they have the same format as the existing documents, and the same type of metadata is provided, in exactly the same way. | + | In Greenstone, the structure of a particular collection is determined when the collection is set up. This includes such things as the format of the source documents, how they should be displayed on the screen, the source of metadata, what browsing facilities should be provided, what full-text search indexes should be provided, and how the search results should be displayed. Once the collection is in place, it is easy to add new documents to it—so long as they have the same format as the existing documents, and the same type of metadata is provided, in exactly the same way. |
- | <!-- id:472 -->The Collector has the following basic functions: | + | The Collector has the following basic functions: |
- | - <!-- id:473 -->create a new collection with the same structure as an existing one; | + | - create a new collection with the same structure as an existing one; |
- | - <!-- id:474 -->create a new collection with a different structure from existing ones; | + | - create a new collection with a different structure from existing ones; |
- | - <!-- id:475 -->add new material to an existing collection; | + | - add new material to an existing collection; |
- | - <!-- id:476 -->modify the structure of an existing collection; | + | - modify the structure of an existing collection; |
- | - <!-- id:477 -->delete a collection; and | + | - delete a collection; and |
- | - <!-- id:478 -->write an existing collection to a self-contained, | + | - write an existing collection to a self-contained, |
- | <!-- id:479 -->You must first decide whether to build a new collection or work with an existing one. The former case covers options 1 and 2 above; the latter covers options 3 to 6. | + | You must first decide whether to build a new collection or work with an existing one. The former case covers options 1 and 2 above; the latter covers options 3 to 6. |
===== Logging in ===== | ===== Logging in ===== | ||
- | <!-- id:483 -->Either way it is necessary to log in before proceeding. Note that in general, people use their web browser to access the collection-building facility on a remote computer, and build the collection on that server. Of course, we cannot allow arbitrary people to build collections (for reasons of propriety if nothing else), so Greenstone contains a security system which forces people who want to build collections to log in first. This allows a central system to offer a service to those wishing to build information collections and use that server to make them available to others. Alternatively, | + | Either way it is necessary to log in before proceeding. Note that in general, people use their web browser to access the collection-building facility on a remote computer, and build the collection on that server. Of course, we cannot allow arbitrary people to build collections (for reasons of propriety if nothing else), so Greenstone contains a security system which forces people who want to build collections to log in first. This allows a central system to offer a service to those wishing to build information collections and use that server to make them available to others. Alternatively, |
===== Dialog structure ===== | ===== Dialog structure ===== | ||
- | <!-- id:487 -->Upon completion of login, a page appears showing the sequence of steps that are involved in collection building. They are: | + | Upon completion of login, a page appears showing the sequence of steps that are involved in collection building. They are: |
- | - <!-- id:488 -->Collection information | + | - Collection information |
- | - <!-- id:489 -->Source data | + | - Source data |
- | - <!-- id:490 -->Configuring the collection | + | - Configuring the collection |
- | - <!-- id:491 -->Building the collection | + | - Building the collection |
- | - <!-- id:492 -->Viewing the collection. | + | - Viewing the collection. |
- | <!-- id:493 -->The first step is to specify the collection' | + | The first step is to specify the collection' |
- | <!-- id:494 -->These five steps are displayed as a linear sequence of gray buttons at the bottom of the screen, and at the bottom of all other pages generated by the Collector. This display helps users keep track of where they are in the process. The button that should be clicked to continue the sequence is shown in green. The gray buttons are inactive. The buttons change to yellow as you proceed through the sequence, and the user can return to an earlier step by clicking the corresponding yellow button in the diagram. This display is modeled after the “wizards” that are widely used in commercial software to guide users through the steps involved in installing new software. | + | These five steps are displayed as a linear sequence of gray buttons at the bottom of the screen, and at the bottom of all other pages generated by the Collector. This display helps users keep track of where they are in the process. The button that should be clicked to continue the sequence is shown in green. The gray buttons are inactive. The buttons change to yellow as you proceed through the sequence, and the user can return to an earlier step by clicking the corresponding yellow button in the diagram. This display is modeled after the “wizards” that are widely used in commercial software to guide users through the steps involved in installing new software. |
===== Collection information ===== | ===== Collection information ===== | ||
- | <!-- id:498 -->The next step in the sequence is collection information. When creating a new collection, it is necessary to enter some information about it: | + | The next step in the sequence is collection information. When creating a new collection, it is necessary to enter some information about it: |
- | * <!-- id:499 -->title, | + | * title, |
- | * <!-- id:500 -->contact E-mail address, and | + | * contact E-mail address, and |
- | * <!-- id:501 -->brief description. | + | * brief description. |
- | <!-- id:502 -->The collection title is a short phrase used throughout the digital library to identify the content of the collection. Example titles include //Food and Nutrition Library//, //World Environmental Library//, // | + | The collection title is a short phrase used throughout the digital library to identify the content of the collection. Example titles include //Food and Nutrition Library//, //World Environmental Library//, // |
===== Source data ===== | ===== Source data ===== | ||
- | <!-- id:507 -->Next, the user specifies the source text that comprises the collection. You may either base your collection on a default structure that is provided, or on the structure of an existing collection. | + | Next, the user specifies the source text that comprises the collection. You may either base your collection on a default structure that is provided, or on the structure of an existing collection. |
- | <!-- id:508 -->If you opt for the default structure, the new collection may contain HTML documents (files ending in //.htm, .html//), plain text documents (files ending in //.txt, .text//), Microsoft Word documents (files ending in //.doc//), PDF documents (files ending in //.pdf//) or E-mail documents (files ending in // | + | If you opt for the default structure, the new collection may contain HTML documents (files ending in //.htm, .html//), plain text documents (files ending in //.txt, .text//), Microsoft Word documents (files ending in //.doc//), PDF documents (files ending in //.pdf//) or E-mail documents (files ending in // |
- | <!-- id:509 -->If you base your new collection on an existing one, the files in the new collection must be exactly the same type as those used to build the existing one. Note that some collections use non-standard input file formats, while others use metadata specified in auxiliary files. If your new input lacks this information, | + | If you base your new collection on an existing one, the files in the new collection must be exactly the same type as those used to build the existing one. Note that some collections use non-standard input file formats, while others use metadata specified in auxiliary files. If your new input lacks this information, |
- | <!-- id:510 -->Boxes are provided to indicate where the source documents are located: up to three separate input sources can be specified. If you need more, just click the button marked “more sources.” | + | Boxes are provided to indicate where the source documents are located: up to three separate input sources can be specified. If you need more, just click the button marked “more sources.” |
- | <!-- id:511 -->There are three kinds of specification: | + | There are three kinds of specification: |
- | * <!-- id:512 -->a directory name on the Greenstone server system (beginning with “file: | + | * a directory name on the Greenstone server system (beginning with “file: |
- | * <!-- id:513 -->an address beginning with “http: | + | * an address beginning with “http: |
- | * <!-- id:514 -->an address beginning with “ftp: | + | * an address beginning with “ftp: |
- | <!-- id:515 -->If you use // | + | If you use // |
- | <!-- id:516 -->If you use // | + | If you use // |
- | <!-- id:517 -->If you use // | + | If you use // |
- | <!-- id:518 -->You can specify sources of more than one type, for instance, documents taken from both a local file system and a remote web site. | + | You can specify sources of more than one type, for instance, documents taken from both a local file system and a remote web site. |
- | <!-- id:520 -->When you click the //configure collection// | + | When you click the //configure collection// |
- | <!-- id:523 -->Sources might be unavailable because | + | Sources might be unavailable because |
- | * <!-- id:524 -->the file, FTP site or URL does not exist; | + | * the file, FTP site or URL does not exist; |
- | * <!-- id:525 -->you need to dial up your ISP first; | + | * you need to dial up your ISP first; |
- | * <!-- id:526 -->you are trying to access a URL from behind a firewall. | + | * you are trying to access a URL from behind a firewall. |
- | <!-- id:527 -->The last case is potentially the most mysterious. It occurs if you normally have to present a username and password to access the Internet. Sometimes you can see the page in your Web browser when you enter the URL, but the Collector claims that it is unavailable. The explanation is that the page in your browser may be coming from a locally cached copy. Unfortunately, | + | The last case is potentially the most mysterious. It occurs if you normally have to present a username and password to access the Internet. Sometimes you can see the page in your Web browser when you enter the URL, but the Collector claims that it is unavailable. The explanation is that the page in your browser may be coming from a locally cached copy. Unfortunately, |
===== Configuring the collection ===== | ===== Configuring the collection ===== | ||
- | <!-- id:531 -->The construction and presentation of all collections is controlled by specifications in a special collection configuration file (see below). Advanced users may use this page to alter the configuration settings. Most, however, will proceed directly to the final stage. Indeed, if both the //configure collection// | + | The construction and presentation of all collections is controlled by specifications in a special collection configuration file (see below). Advanced users may use this page to alter the configuration settings. Most, however, will proceed directly to the final stage. Indeed, if both the //configure collection// |
===== Building the collection ===== | ===== Building the collection ===== | ||
- | <!-- id:536 -->Up until now, the responses to the dialog have merely been recorded in a temporary file. The building stage is where the action takes place. | + | Up until now, the responses to the dialog have merely been recorded in a temporary file. The building stage is where the action takes place. |
- | <!-- id:537 -->During building, indexes for both browsing and searching are constructed according to instructions in the collection configuration file. The building process takes some time: minutes to hours, depending on the size of the collection and the speed of your computer. Some very large collections take a day or more to build. | + | During building, indexes for both browsing and searching are constructed according to instructions in the collection configuration file. The building process takes some time: minutes to hours, depending on the size of the collection and the speed of your computer. Some very large collections take a day or more to build. |
- | <!-- id:538 -->When you reach this stage in the interaction, | + | When you reach this stage in the interaction, |
- | <!-- id:539 -->Warnings are written if requested input files or URLs do not exist, if they exist but no plugin can process them, or if a plugin cannot find an associated file, such as an image file embedded in an HTML document. The intention is that you will monitor progress by keeping this window open in your browser. If any errors cause the process to terminate, they are recorded in this status area. | + | Warnings are written if requested input files or URLs do not exist, if they exist but no plugin can process them, or if a plugin cannot find an associated file, such as an image file embedded in an HTML document. The intention is that you will monitor progress by keeping this window open in your browser. If any errors cause the process to terminate, they are recorded in this status area. |
- | <!-- id:540 -->You can stop the building process at any time by clicking on the //stop building// button. If you leave the web page (and have not cancelled the building process with the //stop building// button), the building operation will continue, and the new collection will be installed when the operation completes. | + | You can stop the building process at any time by clicking on the //stop building// button. If you leave the web page (and have not cancelled the building process with the //stop building// button), the building operation will continue, and the new collection will be installed when the operation completes. |
===== Viewing the collection ===== | ===== Viewing the collection ===== | ||
- | <!-- id:542 -->When the collection is built and installed, the sequence of progress buttons appears, with the View collection button active. This takes the user directly to the newly built collection. | + | When the collection is built and installed, the sequence of progress buttons appears, with the View collection button active. This takes the user directly to the newly built collection. |
- | <!-- id:543 -->Finally, there is a facility for E-mail to be sent to the collection' | + | Finally, there is a facility for E-mail to be sent to the collection' |
===== Working with existing collections ===== | ===== Working with existing collections ===== | ||
- | <!-- id:545 -->When you enter the Collector you have to specify whether you want to create an entirely new collection or work with an existing one, adding data to it or deleting it. By creating all searching and browsing structures automatically from the documents themselves, Greenstone makes it easy to add new information to existing collections. Because no links are inserted by hand, new documents in the same format can be merged into the collection automatically when they become available. | + | When you enter the Collector you have to specify whether you want to create an entirely new collection or work with an existing one, adding data to it or deleting it. By creating all searching and browsing structures automatically from the documents themselves, Greenstone makes it easy to add new information to existing collections. Because no links are inserted by hand, new documents in the same format can be merged into the collection automatically when they become available. |
- | <!-- id:546 -->To work with an existing collection, you first select the collection from a list that is provided. Some collections are “write protected” and cannot be altered: these do not appear in the selection list. With the selected collection, you can | + | To work with an existing collection, you first select the collection from a list that is provided. Some collections are “write protected” and cannot be altered: these do not appear in the selection list. With the selected collection, you can |
- | * <!-- id:547 -->Add more data and rebuild the collection | + | * Add more data and rebuild the collection |
- | * <!-- id:548 -->Edit the collection configuration file | + | * Edit the collection configuration file |
- | * <!-- id:549 -->Delete the collection entirely | + | * Delete the collection entirely |
- | * <!-- id:550 -->Export the collection to CD-ROM. | + | * Export the collection to CD-ROM. |
==== Add new data ==== | ==== Add new data ==== | ||
- | <!-- id:552 -->The files that you specify will be added to the collection. Make sure that you do not re-specify files that are already in the collection—otherwise two copies will be included. Files are identified by their full pathname, web pages by their absolute web address. You specify directories and files just as you do when building a new collection. | + | The files that you specify will be added to the collection. Make sure that you do not re-specify files that are already in the collection—otherwise two copies will be included. Files are identified by their full pathname, web pages by their absolute web address. You specify directories and files just as you do when building a new collection. |
- | <!-- id:553 -->If you add data to a collection and for some reason the building process fails, the old version of the collection remains unchanged. | + | If you add data to a collection and for some reason the building process fails, the old version of the collection remains unchanged. |
==== Edit configuration file ==== | ==== Edit configuration file ==== | ||
- | <!-- id:555 -->Advanced users can edit the collection configuration file, just as they can when a new collection is built. | + | Advanced users can edit the collection configuration file, just as they can when a new collection is built. |
==== Delete the collection ==== | ==== Delete the collection ==== | ||
- | <!-- id:557 -->You will be asked to confirm whether you really want to delete the collection. Once the collection is deleted, Greenstone cannot bring it back! | + | You will be asked to confirm whether you really want to delete the collection. Once the collection is deleted, Greenstone cannot bring it back! |
==== Export the collection ==== | ==== Export the collection ==== | ||
- | <!-- id:559 -->You can export the collection in a form that allows it to be written to a self-contained, | + | You can export the collection in a form that allows it to be written to a self-contained, |
- | + | ||
- | <!-- id:560 -->When you export the collection, the dialogue informs you of the directory name in which the result has been placed. The entire contents of the directory should be written on to CD-ROM using a standard CD-writing utility. | + | |
- | <!-- id:561 -->The immense variety of different possible Windows configurations has made it difficult for us to test and debug the Greenstone installer under all possible conditions. Although | + | When you export |
+ | The immense variety of different possible Windows configurations has made it difficult for us to test and debug the Greenstone installer under all possible conditions. Although the installer produces CD-ROMs that operate on most Windows systems, it is still under development. If you experience problems and you possess a commercial installation package (e.g. InstallShield), | ||
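
The "Source data" section above distinguishes three kinds of source specification by their prefix: "file:", "http:" and "ftp:". As a rough illustration only, the Python sketch below sorts a list of specifications by URL scheme. It is not the Collector's own code: the function names, the labels in `SOURCE_KINDS` and the example addresses are all assumptions made for this example.

```python
from urllib.parse import urlsplit

# Hypothetical labels for the three kinds of specification named in the text.
# They are illustrative and do not come from Greenstone itself.
SOURCE_KINDS = {
    "file": "directory name on the Greenstone server system",
    "http": "address beginning with http:",
    "ftp": "address beginning with ftp:",
}


def classify_source(spec: str) -> str:
    """Return the scheme ("file", "http" or "ftp") of one source specification.

    Raises ValueError for anything else, mirroring the idea that a source
    of an unknown kind cannot be used.
    """
    scheme = urlsplit(spec.strip()).scheme.lower()
    if scheme not in SOURCE_KINDS:
        raise ValueError(f"unsupported source specification: {spec!r}")
    return scheme


def classify_all(specs):
    """Classify every non-empty specification, skipping boxes left blank."""
    return [(s, classify_source(s)) for s in specs if s.strip()]


if __name__ == "__main__":
    # Example input: sources of more than one type, plus one empty box.
    boxes = ["file:///home/library/docs", "http://example.org/reports/", ""]
    for spec, kind in classify_all(boxes):
        print(f"{spec}: {SOURCE_KINDS[kind]}")
```

Classifying by scheme rather than by string prefix keeps the check independent of upper or lower case and of surrounding whitespace in the input boxes.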
en/user/the_collector.1532998523.txt.gz · Last modified: 2018/07/31 00:55 (external edit)