en:user:gs2_to_gs3

Differences

This shows you the differences between two versions of the page.

en:user:gs2_to_gs3 [2026/04/27 23:35] – [Part 1: apache rewrite] kjdon
en:user:gs2_to_gs3 [2026/04/27 23:57] (current) – [Part 1: apache rewrite] kjdon
Line 250:
   * If this apache is not on the same server as the running Greenstone 3, then change localhost:8383 to the public URL of the gs3 library.
  
 +==== Part 2: Get matching doc ids ====
  
 +If you are using HASH ids, these can change between Greenstone versions. Tip: if you want external links into your library, use an OID type that won't change, e.g. hash_on_full_filename, assigned, filename, dirname, or full_filename.
  
 +If the Greenstone 2 collection used hash ids, then rebuilding it in Greenstone 3 will change those ids, and the redirects won't work.
 +
 +Some options to try:
 +=== Don't rebuild ===
 +
 +Copy the collection over to Greenstone 3 and open it in GLI to convert the config files (format statements may need tweaking). Then see if it works in the library without rebuilding. The HASH ids won't have changed, but whether the collection works properly will depend on the gap between versions. The downside of this approach is that you can never rebuild the collection, since a rebuild would generate new ids.
 +
 +=== Use archives as import ===
 +If you have the archives folder available in the Greenstone 2 collection, you could use this as the import for the Greenstone 3 collection.
 +Set up the collection so that it looks right in Greenstone 3, either by starting from scratch or by copying over the old collection as above.
 +Copy all the HASHxx folders from the Greenstone 2 collection's archives folder into the Greenstone 3 collection's import folder. Leave out the archiveinf-* files, rss.items, and earliestDatestamp.
 +Then do a full rebuild.
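
The copy step above can be scripted. The following is a minimal sketch, not part of Greenstone: the function name copy_archive_docs and both paths are hypothetical, and it assumes every document folder in archives is a directory whose name starts with HASH, as described above.

```python
import shutil
from pathlib import Path
from typing import List


def copy_archive_docs(gs2_archives: Path, gs3_import: Path) -> List[str]:
    """Copy only the HASHxx document folders from archives into import.

    Returns the sorted names of the folders that were copied.
    """
    gs3_import.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(gs2_archives.iterdir()):
        # Assumption: document folders are directories named HASH...;
        # archiveinf-* files, rss.items and earliestDatestamp are skipped.
        if entry.is_dir() and entry.name.startswith("HASH"):
            shutil.copytree(entry, gs3_import / entry.name, dirs_exist_ok=True)
            copied.append(entry.name)
    return copied
```

It could be called as copy_archive_docs(Path("gs2/collect/demo/archives"), Path("gs3/collect/demo/import")), with both paths adjusted to your own installation. It needs Python 3.8 or later for the dirs_exist_ok argument.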
 +
 +=== Extract ids and add as metadata ===
 +
 +If you no longer have the archives folder in the Greenstone 2 collection, then your task is a bit more complicated.
 +What might work is:
 +1. Run db2txt.pl over the gdbm metadata database in the Greenstone 2 collection and save the output to a file, e.g. db2txt collect/demo/index/text/demo.gdb > db.out (if the database is a .jdb file, use jdb2txt.pl instead).
 +2. Write a script to extract filenames and identifiers from the output. A CSV file is probably the easiest format to write them to:
 +<code>
 +Filename,prev.Identifier
 +sample.pdf,HASH01e86960c45a06eaa801e869
 +</code>
 +3. Put the source documents into the Greenstone 3 collection together with the CSV file, and add CSVPlugin to the collection to process it. Then use the import options -OIDtype assigned and -OIDmetadata prev.Identifier so that these identifiers are used as the new doc ids.
 +
 +This approach relies on there being no subfolders in the import folder.
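
The extraction script in step 2 might look something like the sketch below. It rests on assumptions that you should check against your own db.out: that each record starts with its key in square brackets, that metadata follows as <name>value lines, that records are separated by a line of dashes, and that the original filename is stored in a field called Source. The function names parse_records and write_id_csv are hypothetical.

```python
import csv
import io
import re

KEY_LINE = re.compile(r"^\[(.+)\]$")        # assumed form: [HASH01e8...]
FIELD_LINE = re.compile(r"^<([^>]+)>(.*)$")  # assumed form: <Source>sample.pdf


def parse_records(text):
    """Yield (key, fields) for each record in the dumped database text."""
    key, fields = None, {}
    for line in text.splitlines():
        if line.startswith("-----"):  # assumed record separator
            if key is not None:
                yield key, fields
            key, fields = None, {}
            continue
        match = KEY_LINE.match(line)
        if match and key is None:
            key = match.group(1)
            continue
        match = FIELD_LINE.match(line)
        if match and key is not None:
            # Keep the first value if a field name repeats.
            fields.setdefault(match.group(1), match.group(2))
    if key is not None:
        yield key, fields


def write_id_csv(db_text, out, filename_field="Source"):
    """Write Filename,prev.Identifier rows to out; return the row count."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Filename", "prev.Identifier"])
    rows = 0
    for key, fields in parse_records(db_text):
        # Top-level documents only: skip section ids such as HASH...1
        # and non-document keys such as classifier nodes.
        if key.startswith("HASH") and "." not in key and filename_field in fields:
            writer.writerow([fields[filename_field], key])
            rows += 1
    return rows
```

To use it, read db.out into a string and pass it to write_id_csv along with an output file opened for writing. If your database stores the filename under a different field name, pass that name as filename_field.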
en/user/gs2_to_gs3.txt · Last modified: 2026/04/27 23:57 by kjdon