Greenstone tutorial exercise
Downloading over OAI
The previous exercise did not obtain its data from an external OAI-PMH server. That missing step can be carried out either by running a command-line program or by using the Download panel in the Librarian Interface. This exercise shows both methods.
Downloading using the Librarian Interface
-
In the Librarian Interface, switch to the Download panel. Select OAI from the list of download types on the left-hand side.
-
In the url box, type in the following URL:
-
We want to download the documents as well as the metadata, so tick the get_doc checkbox.
-
If your computer is behind a firewall or proxy server, you will need to edit the proxy settings in the Librarian Interface. Click the <Preferences...> button. Switch on the Use proxy connection? checkbox. Enter the proxy server address and port number in the Proxy Host: and Proxy Port: boxes. Click <OK>.
-
Now click <Download>. If you have set proxy information in Preferences..., a popup will ask for your user name and password. Once the download has started, a progress bar in the lower half of the panel shows how far the download has progressed.
-
Downloaded files are stored in a top-level folder called Downloaded Files that appears on the left-hand side of the Gather panel. These files can then be added to a collection.
Downloading using the command line
For command-line downloading to work, your computer must have a direct connection to the Internet; a firewall may interfere with the download. If you are behind a firewall, use the Librarian Interface for downloading instead.
-
Close the Librarian Interface.
We will work with the OAI collection used in the exercise Open Archives Initiative (OAI) collection. You may have noticed that its internal name is oaiservi.
-
In a text editor (e.g. WordPad), open the collection's configuration file, which is in Greenstone → collect → oaiservi → etc → collect.cfg. Add the following line (all on one line):
acquire OAI -src http://rocky.dlib.vt.edu/~jcdlpix/cgi-bin/OAI/jcdlpix.pl -getdoc
Although the position of this line is not critical, we recommend that you place it near the beginning of the file, after the public and creator lines but before the index line. Save the file and quit the editor.
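If you would rather script this edit than make it by hand, the following is a minimal Python sketch of the placement described above. The function name insert_acquire_line and the sample configuration text are hypothetical, and the sketch assumes the index line is the first line whose keyword begins with "index"; check this against your own collect.cfg before relying on it.

```python
ACQUIRE = "acquire OAI -src http://rocky.dlib.vt.edu/~jcdlpix/cgi-bin/OAI/jcdlpix.pl -getdoc"

# Hypothetical minimal configuration, used only to illustrate the placement.
SAMPLE = "creator someone@example.org\npublic true\nindexes document:text\n"


def insert_acquire_line(config_text: str, acquire_line: str) -> str:
    """Return config_text with acquire_line placed just before the index line."""
    lines = config_text.splitlines()
    # Leave the text alone if the exact line is already present.
    if acquire_line in (line.strip() for line in lines):
        return config_text
    for position, line in enumerate(lines):
        words = line.split()
        # Assumption: the index line is the first whose keyword starts with "index".
        if words and words[0].startswith("index"):
            lines.insert(position, acquire_line)
            break
    else:
        # No index line found; the position is not critical, so append.
        lines.append(acquire_line)
    return "\n".join(lines) + "\n"


print(insert_acquire_line(SAMPLE, ACQUIRE))
```

The function works on text rather than on the file itself, so you can inspect the result before writing it back to collect.cfg.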
-
Delete the contents of the collection's import folder. This contains the canned version of the collection files, put there during the previous exercise. Now we want to witness the data arriving anew from the external OAI server.
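Deleting the files in a file manager is enough for this step. For readers who prefer to script it, here is a small Python sketch that empties a folder while keeping the folder itself. The helper name empty_folder is hypothetical, and the path in the comment assumes the example installation location used elsewhere in this exercise.

```python
import shutil
from pathlib import Path


def empty_folder(folder: Path) -> int:
    """Delete everything inside folder, keep the folder; return entries removed."""
    removed = 0
    for entry in folder.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a subfolder and everything in it
        else:
            entry.unlink()  # remove a single file (or a symbolic link)
        removed += 1
    return removed


# Example call (assumed path; adjust to where Greenstone is installed):
# empty_folder(Path(r"C:\Program Files\Greenstone\collect\oaiservi\import"))
```

Deletion is permanent, so double-check the path before running a call like the commented one.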
-
Open a DOS window to access the command-line prompt. This facility should be located somewhere within your Start → Programs menu, but details vary between different Windows systems. If you cannot locate it, select Start → Run and enter cmd in the popup window that appears.
-
In the DOS window, move to the home directory where you installed Greenstone. This is accomplished by something like:
cd C:\Program Files\Greenstone
-
Type:
setup.bat
to set up the ability to run Greenstone command-line programs.
-
Change directory into the folder containing the OAI Services Provider collection you built in the last exercise.
cd collect\oaiservi
Even though the collection name uses capital letters, the directory generated by the Librarian Interface is all lowercase.
-
Run:
perl -S importfrom.pl oaiservi
Greenstone will immediately set to work and generate a stream of diagnostic output. The importfrom.pl program connects to the OAI data provider specified in the collection configuration file (it does this for each "acquire" line in the file) and retrieves all the records from that site.
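To give a feel for what a harvesting request looks like, the Python sketch below reads the "acquire OAI" lines from configuration text and builds a standard OAI-PMH ListRecords URL for each source. It illustrates the protocol in general and is not a description of what importfrom.pl does internally. The function names are hypothetical, and the sketch assumes the oai_dc metadata prefix, which every OAI-PMH repository is required to support.

```python
from typing import List
from urllib.parse import urlencode


def oai_sources(config_text: str) -> List[str]:
    """Return the -src value of every 'acquire OAI' line in the text."""
    sources = []
    for line in config_text.splitlines():
        words = line.split()
        if len(words) >= 2 and words[0] == "acquire" and words[1] == "OAI":
            if "-src" in words:
                position = words.index("-src")
                if position + 1 < len(words):
                    sources.append(words[position + 1])
    return sources


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for base_url."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


config = (
    "public true\n"
    "acquire OAI -src http://rocky.dlib.vt.edu/~jcdlpix/cgi-bin/OAI/jcdlpix.pl -getdoc\n"
)
for source in oai_sources(config):
    print(list_records_url(source))
```

A real harvester also has to follow the resumption tokens that a server returns when the record list is split across several responses; the sketch leaves that out.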
-
The downloaded files are saved in the collection's import folder. Once the command is finished, everything is in place and the collection is ready to be built. Confirm you have successfully acquired the OAI records by rebuilding the collection.
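Before rebuilding, a quick check that the import folder is no longer empty can save a wasted build. The Python sketch below counts the files under a folder, including any subfolders. The helper name count_files is hypothetical, and the sketch makes no assumption about how the downloaded files are named or arranged.

```python
from pathlib import Path


def count_files(folder: Path) -> int:
    """Count regular files under folder, descending into subfolders."""
    return sum(1 for entry in folder.rglob("*") if entry.is_file())


# Example call, run from inside the collection folder (collect\oaiservi):
# print(count_files(Path("import")))
```

A count of zero after running importfrom.pl suggests that the download did not succeed.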