Advanced Tasks

How can I obtain a list of all subject headings in my database?
Diego Spano explains:

There is no direct way to do that, but there is a workaround. Run the following commands in your terminal:

1. To set up the Greenstone environment, type: . ./setup.bash

2. Then type: perl -S buildcol.pl -store_metadata_coverage your_collection_name

This will create a building folder inside your collection, and in it you will find a folder named text. Inside that folder is a file named "your_collection_name.gdb". This is a database file that contains all the metadata assigned to each document.

Next, export this file to plain text with the db2txt command:

3. Move to the folder that has the gdb file, by typing: cd /greenstone/collect/your_collection_name/building/text

4. Next, run: db2txt your_collection_name.gdb > meta.txt

Now you have a plain-text file named meta.txt in the same folder as your gdb file. Open it and take a look. You will see a list of metadata for each document.
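Before filtering, it can help to see which metadata element names actually occur in the export. The following is a minimal sketch, assuming each metadata line in meta.txt has the form <element>value; the function name list_elements is made up for this illustration.

```shell
#!/bin/sh
# Sketch: list the distinct metadata element names found in a db2txt export.
# Assumption: metadata lines look like <element>value.
# list_elements is a hypothetical helper name, not a Greenstone command.

list_elements() {
    # $1 = path to the exported text file (for example ./meta.txt)
    # sed prints only the name between the leading < and the first >,
    # then sort -u sorts the names and removes repeats.
    sed -n 's/^<\([^>]*\)>.*/\1/p' "$1" | sort -u
}

# Usage: list_elements ./meta.txt
```

The name of the element that holds your subject headings should appear in this list, and that is the name to search for in the next step.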

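5. Now you can filter it with the grep command (it is available on Linux; on Windows you can use Cygwin): grep "<your_subject_element>" ./meta.txt > onlysubject.txt

Replace your_subject_element with the name of the metadata element that holds your subject headings (for example, dc.Subject in a collection that uses Dublin Core). Note that an empty pattern, grep "", matches every line and would simply copy the whole file.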

Now you have a file named onlysubject.txt with all the subject values. The values are not unique, so a subject assigned to several documents appears more than once.
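If you want each subject heading listed only once, the filtering can be extended with sed and sort. This is a sketch rather than part of the original answer: it assumes metadata lines of the form <element>value, and both the function name unique_subjects and the element name dc.Subject in the usage line are illustrative, so substitute your own.

```shell
#!/bin/sh
# Sketch: extract the unique values of one metadata element from meta.txt.
# Assumption: metadata lines look like <element>value.
# unique_subjects is a hypothetical helper name, not a Greenstone command.

unique_subjects() {
    # $1 = path to the db2txt export, $2 = metadata element name
    # grep -F keeps only lines containing the literal tag <element>,
    # sed removes everything up to and including that tag,
    # sort -u sorts the remaining values and drops duplicates.
    grep -F "<$2>" "$1" | sed "s/^.*<$2>//" | sort -u
}

# Usage (dc.Subject is an assumed element name):
#   unique_subjects ./meta.txt dc.Subject > uniquesubjects.txt
```

Piping the values through sort | uniq -c instead of sort -u would also show how many times each heading is used.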