en:user_advanced:command_line_building

Differences

This shows you the differences between two versions of the page.

en:user_advanced:command_line_building [2016/06/24 05:06] anupama
en:user_advanced:command_line_building [2023/03/13 01:46] (current) – external edit 127.0.0.1
Line 1: Line 1:
-====== Command Line Building ====== 
-<!-- id:26 -->It is possible to create and build collections directly from the command line. This 
-page provides the basic information on building Greenstone collections on the command line. The full instructions 
-are provided for Windows users. If you are on a MacOS/Linux, the steps are the same, but some of the 
-commands themselves are slightly different. These differences are listed in the [[#MacOSX/Linux]] section.  
  
-This page goes through the complete steps for creating a new collection and building it on the command line. 
-However, in some cases you may like to create a collection using the Librarian Interface and just rebuild it on the command line. In this case, you'll need to open the terminal, set up the environment, and then go straight to the building step for your existing collection. 
  
-<TABAREA tabs="Greenstone3,Greenstone2"> 
-<TAB> 
  
-===== Windows ===== +====== Command Line Building ====== 
-==== Open a terminal ==== +It is possible to create and build collections directly from the command line. This
-On Windows, there are several different ways to open a DOS terminal (a black console screen known as the DOS Prompt). Do one of the following: +page provides the basic information on building Greenstone collections on the command line.
-  * ''Start -> All Programs -> Accessories -> Command Prompt'' +
-  * Under the Start menu, type ''cmd'' into the search box and press Enter +
-  * Hold down your keyboard's Windows key and press the key for letter r. (The Windows key is located between the Ctrl and Alt keys on your keyboard.) In the Run dialog that appears, type ''cmd'' in the textfield and press the OK button.  +
-  * In any Windows Explorer, hold down Shift and right click in an empty area in the window. Select ''Open command window here'' from the menu.+
  
 +The first section shows how to rebuild a collection that has been created and edited in GLI. GLI doesn't do proper incremental building, so for large collections, it may save time to set up a collection using GLI and build it on the command line. 
  
-==== Setup the Environment ====+The second part shows how to create, edit and build a collection entirely using the command line.
  
-In order to build collections in Greenstone (or run any other Greenstone scripts from the  +===== Using GLI to create a collection, then using command line for building =====
-command line), you must first setup the terminal's environment for Greenstone. To do this, first +
-change into the directory where Greenstone has been installed.  +
-Assuming Greenstone was installed in its default location (and your user name was "jsmith"), you can move there by typing:+
  
-<code> +If your collection will grow very large, it will save you time to build it using command line building tools. Initially, using GLI, you want to:
-cd C:\Users\jsmith\Greenstone3 +
-</code> +
-//**Note** if the path to your Greenstone installation includes spaces (e.g. Program Files), you **must** +
-put quotations around the path. For example: ''cd "C:\Program Files\Greenstone3"'' and ''cd "%GSDL3HOME%\sites\localsite\dlpeople"''.//+
  
-Next, at the prompt type: +  * Create a new collection
 +  * Add a few documents and metadata
 +  * Configure your collection. What indexes, plugin options, classifiers etc. do you need?
 +  * Build it in GLI and preview. Do you need to change configuration settings?
  
-<code> +Once you have the collection set up the way you want, you can start adding the bulk of your documents and their metadata. Both can be done using GLI.
-gs3-setup +
-</code>+
  
-This batch file (which you can read if you like) tells the system where to  +When it's time to build, you can either build in GLI, or on the command line. A command line build is useful if you want to schedule building overnight, for example, or if you want to build incrementally. The sections below detail full build and incremental build.
-look for Greenstone programs.+
  
-//Note: On Windows 95/98 systems running ''gs3-setup'' +==== Set up Greenstone environment ====
- may fail with an **Out of environment space** error.  +
-If this happens, you should edit your system's ''config.sys'' +
- file (normally found at ''C:\config.sys'') and add the line  +
-''shell=C:\command.com /e:4096 /p'' (where ''C:'' is your system  +
-drive letter) to expand the size of the environment table. You'll +
- need to reboot for this change to take effect, and then repeat the steps above for Greenstone.//+
  
-If, later on in your interactive session at the DOS prompt, +To begin, you will need to open a terminal window (see [[#opening_a_terminal_on_windows | below]]), and set up the Greenstone environment. In the terminal, change directory to the Greenstone top level folder.
-you wish to return to the top level Greenstone directory you can accomplish this by typing  +Run the following command to set up the environment:
-''cd %GSDL3SRCHOME%''+
  
-**//If you close your DOS window and start another one,  +^Greenstone version^Windows^Linux/Mac^ 
-you will need to invoke ''gs3-setup'' again.//**+|2|setup|source setup.bash| 
 +|3|gs3-setup|source gs3-setup.sh|
  
-<!-- id:33 -->Now you are in a position to make, build and rebuild collections. +Note: if you close your terminal window and start another one, you will need to invoke the setup command again.
  
-==== Create a collection ====+==== Build on the command line ====
  
-The first program we will look at is the Perl program ''mkcol.pl'',  +Now you can build the collection.
-whose name stands for “make a collection.” Typing ''perl —S mkcol.pl'' will provide  +
-the full list of options, which you can also view [[script_options#mkcol.pl|here]]+
  
-//(If your Windows environment is set up to associate the Perl application with  +The main command for rebuilding a collection is full-rebuild.pl.
-files ending in ''.pl'', you can leave off the ''perl -S'' for all of these scripts.)//+
  
-To create a new collection: +^Greenstone version^Windows^Linux/Mac^ 
-<code> +|2|perl -S full-rebuild.pl <collname>|full-rebuild.pl <collname>|
-perl -S mkcol.pl [options] collection-name +|3|perl -S full-rebuild.pl -site localsite <collname>|full-rebuild.pl -site localsite <collname>|
-</code>+
  
-<!-- id:34 -->For example, to create a collection named //dlpeople// in ''localsite'' +Notes:
-with the creator's email address of //me@cs.waikato.ac.nz//, type +  * Replace <collname> with the short collection identifier. This is the name of the collection's folder in the collect folder. You can also see it in GLI's title bar, in brackets after the collection title, e.g. "greenstone demo collection (demo)". In this case, the collname is demo.
 +  * If you have a custom site for Greenstone 3, replace 'localsite' with your site name.
 +  * There are options for full-rebuild.pl. View the list of options by running [perl -S] full-rebuild.pl -h
 +  * For Linux and MacOS, you can leave off the perl -S for all the perl commands on this page. If your Windows environment is set up to associate the Perl application with 
 +files ending in ''.pl'', you can leave off ''perl -S'' on Windows too.
  
-<code> +Running full-rebuild.pl will reimport and index all the documents. You will need to do this if you have changed plugin options, or other configuration options. If the configuration hasn't changed, and you just want to add new documents or update modified documents, then you should use incremental building.
-perl —S mkcol.pl -site localsite —creator me@cs.waikato.ac.nz dlpeople +
-</code> +
-\\ +
-//(Since Greenstone3 allows you to have multiple [[en:user:sites]], you must always specify in which site the  +
-collection is in. The default site is called ''localsite''.)//+
  
-<!-- id:36 -->To view the newly created files, move to the newly created  +==== Incremental building ====
-collection directory by typing+
  
-<code> +Incremental building is where you only process the new or changed documents each time you build, thereby speeding up the build process. New and modified documents will be processed, and deleted documents will be removed from the collection. If metadata has changed, then documents will be reprocessed.
-cd %GSDL3HOME%\sites\localsite\collect\dlpeople +
-</code>+
  
 +Important note for collection design: Greenstone can notice that metadata in a folder has been added/changed, but it is not smart enough to tell which documents in the folder the changed metadata
 +belongs to. Therefore, if metadata in a folder has changed (including new metadata being added), then all documents in that folder will be reimported. This means that if you have all your documents in the top level import folder, adding new metadata or changing any metadata for any document will result in all documents being reimported. If you intend to do incremental import, then please organize your documents into subfolders. That way modifying metadata for some documents won't result in all other documents being reimported.
  
-<!-- id:38 -->You can list the contents of this directory by typing ''dir''. +Note 2: An empty metadata file in an import folder (including the top level import folder) will trigger a full reimport of all documents in that folder. This is a bug in Greenstone 2.87, 3.08 and earlier. Empty metadata files will automatically get added by GLI. The solution is to add a piece of metadata to a document using the Enrich panel. Just one will do.
- There should be six subdirectories:  +
-  * //etc// +
-  * //images// +
-  * //import// +
-  * //script// +
-  * //style//+
  
-==== Add documents ====+The main command for incremental rebuild is incremental-rebuild.pl. You can use this in place of full-rebuild.pl.
  
-<!-- id:39 -->Now we must populate the collection with sample documents. To do this,  +^Greenstone version^Windows^Linux/Mac^
-we copy documents into the collections ''import'' folder. Assuming your documents are in the folder +|2|perl -S incremental-rebuild.pl <collname>|incremental-rebuild.pl <collname>|
-''C:\Users\jsmith\dldocuments'', you can either:+|3|perl -S incremental-rebuild.pl -site localsite <collname>|incremental-rebuild.pl -site localsite <collname>|
  
-<!-- id:40 -->select the contents of the ''dldocuments'' directory +Indexer Note: only the Lucene and Solr indexers can do incremental indexing. MG and MGPP cannot. If you do incremental-rebuild with MG or MGPP, indexing will be carried out over the entire collection. So we recommend Lucene or Solr if you will be doing incremental building.
-and drag them into the ''dlpeople'' collection's ''import'' directory.+
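
The two rebuild tables above follow one pattern, which the sketch below captures for Linux/Mac. It is only an illustration: ''rebuild_command'' is a hypothetical helper, not part of Greenstone, and it prints the command rather than running it. On Windows the printed command would be prefixed with ''perl -S''.

```shell
# Hypothetical helper (not part of Greenstone): prints the rebuild command
# for a collection without running it.
# Usage: rebuild_command <full|incremental> <greenstone-version> <collname> [site]
rebuild_command() {
    mode=$1                   # "full" or "incremental"
    version=$2                # 2 or 3
    collname=$3
    site=${4:-localsite}      # only used for Greenstone 3; localsite is the default site
    script="${mode}-rebuild.pl"
    if [ "$version" = "3" ]; then
        echo "$script -site $site $collname"
    else
        echo "$script $collname"
    fi
}

# Example: the incremental rebuild command for the demo collection in Greenstone 3
rebuild_command incremental 3 demo
```

To run the command for real, the Greenstone environment must already be set up in the same terminal.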
  
-<!-- id:41 -->Or, you can type the command+===== Finer control of the build process =====
  
-<code> +The build process actually consists of several stages: 
-xcopy /s C:\Users\jsmith\dldocuments\* import +  * **importing** the original documents into Greenstone's XML archive format
-</code> +  * **building** the collection, which includes **indexing** the archive documents and generating a **database** of metadata and classifier structures
 +  * **activating** the collection in the live library (if necessary)
  
-==== Edit the Config file ====+These stages can all be run separately. Note, the greenstone environment must be set up in any terminal window before you can run these commands.
  
-<!-- id:42 -->In the collection's ''etc'' directory there is a file called ''collectionConfig.xml''. +==== Importing collection ====
-Any modifications that you can make in the GLI, can also be achieved by manually editing the +
-''collectionConfig.xml'' file. Simply open it using your favorite text editor,  +
-e.g. Notepad or Wordpad, make changes and save it. You can learn more about the Collection configuration file [[en:user:configuration files#Collection configuration files|here]]. +
  
-==== Import the collection ====+This is the process of converting the original documents, which might be a mixture of file types, into a standardised XML based format - the Greenstone archive format. Original source documents live in the import folder of a collection, while the archive documents live in the archives folder.  
  
 +The command to import a collection is ''import.pl''. Type ''perl -S import.pl'' at the prompt to get a list of all the options for the import program, or view them [[script_options#import.pl|here]].
  
-<!-- id:43 -->Now you are ready to “import” the collection.  +^Greenstone version^Import command^ 
-This is the process of bringing the documents into the Greenstone system,  +|2|perl -S import.pl <collname>| 
-standardizing the document format, the way that metadata is specified,  +|3|perl -S import.pl -site localsite <collname>|
-and the file structure in which the documents are stored.  +
-Type ''perl -S import.pl'' at the prompt to get a list of all the options for the import program,  +
-or view them [[script_options#import.pl|here]].+
  
-<code> +As before, you need to put in your own collection name, and change the site name if you are using a custom greenstone3 site.
-perl —S import.pl -site localsite dlpeople +
-</code>+
  
-<!-- id:44 -->Don't worry about all the text that scrolls past—it's just reporting  +Don't worry about all the text that scrolls past—it's just reporting the progress of the import. Note that you do not have to be in either the //collect// or //dlpeople// directories when this command is entered; because the Greenstone environment has been set up, the Greenstone software can work out where 
-the progress of the import. Note that you do not have to be in either the  +
-//collect// or //dlpeople// directories when this command is entered; +
- because ''%GSDL3SRCHOME%'' is already set, the Greenstone software can work out where +
 the necessary files are.
  
-==== Build the collection ====+=== Incremental import ===
  
-<!-- id:49 -->The next phase is to “build” the collection,  +You can run just the import phase incrementally, using ''incremental-import.pl'' in place of ''import.pl''
-which creates all the indexes and files that make the collection work.  + 
-Type ''perl -S buildcol.pl'' at the command prompt for a list of +==== Building a collection ====
- collection-building options, which are also listed [[script_options#buildcol.pl|here]]. + 
 +The next phase is to “build” the collection, which creates all the indexes and databases that make the collection work.  
 +Type ''perl -S buildcol.pl'' at the command prompt for a list of collection-building options, which are also listed [[script_options#buildcol.pl|here]]. 
 For now, stick to the defaults by typing
  
-<code> +^Greenstone version^Build command^
-perl -S buildcol.pl -site localsite dlpeople +|2|perl -S buildcol.pl <collname>|
-</code>+|3|perl -S buildcol.pl -site localsite <collname>|
  
-<!-- id:50 -->Again, don't worry about the “progress report” text that scrolls past.+Again, don't worry about the “progress report” text that scrolls past.
  
 ==== Make the collection live ====
  
 Finally, we need to make the collection "live" by replacing the collection's old ''index'' folder
-with the contents of the ''building'' folder. We can do this in two ways:+with the contents of the ''building'' folder. For Greenstone 3, we also need to reload it in the library. We can do this in two ways:
  
-In an explorer window (i.e. outside of the terminal) simply select  +Running activate.pl:
-the contents of the //dlpeople// collection's ''building'' +
-directory and drag them into the ''index'' directory.+
  
-<!-- id:53 -->Alternatively, you can remove the ''index'' directory  +^Greenstone version^Activate command^ 
-(and all its contents) by typing the command +|2|perl -S activate.pl <collname>| 
-<code> +|3|perl -S activate.pl -site localsite <collname>|
-rd /s index            # on Windows NT/2000 +
-deltree /Y index       # on Windows 95/98 +
-</code>+
  
-<!-- id:54 -->and then change the name of the ''building'' directory to ''index'' with+Or manually: 
 +Delete the index folder, rename building to index, then restart the Greenstone3 server. 
 +Note, the collection lives in the following location:
  
-<code> +^Greenstone version^Collection location^
-ren building index +|2|path-to-greenstone2/collect/<collname>| 
-</code>+|3|path-to-greenstone3/web/sites/localsite/collect/<collname>|
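
The manual route described above (delete ''index'', rename ''building'' to ''index'') can be sketched for Linux/Mac as follows. ''activate_manually'' is a hypothetical helper name, and the check for a ''building'' folder is an added safety assumption, not something Greenstone requires.

```shell
# Hypothetical helper: make a freshly built collection live by hand.
# Usage: activate_manually <path-to-collection-folder>
activate_manually() {
    colldir=$1
    # Refuse to delete the live index when there is no new build to replace it.
    if [ ! -d "$colldir/building" ]; then
        echo "no building folder in $colldir" >&2
        return 1
    fi
    rm -rf "$colldir/index"                    # delete the old index folder
    mv "$colldir/building" "$colldir/index"    # rename building to index
}

# Example (path is a placeholder):
#   activate_manually path-to-greenstone2/collect/demo
```

For Greenstone 3, the server still needs to be restarted afterwards, as noted above.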
  
-It is important that these commands are issued from the correct directory  +==== Passing import/buildcol options to rebuild scripts ==== 
-(unlike the Greenstone commands ''mkcol.pl'', ''import.pl'' and ''buildcol.pl''). +Import or buildcol options can be passed to full-rebuild and incremental-rebuild. If the option is shared between import.pl and buildcol.pl then it can appear as is, such as -verbosity 5.  This value will be passed to both programs. If an option is specific to one of the programs in particular, then prefix it with 'import:' or 'buildcol:' respectively, as in '-import:OIDtype hash_on_full_filename'.
- If the current working directory is not //dlpeople//, type  +
-''cd %GSDL3HOME%\sites\localsite\collect\dlpeople'' before going through the  +
-''rd'', ''ren'' and ''mkdir'' sequence above.+
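
As an illustration of the prefixing rule for rebuild options, the sketch below assembles a Greenstone 3 command line that combines a shared option with an import-only one. ''prefix_option'' is a hypothetical helper; the command is printed, not run, and the collection name ''demo'' is a placeholder.

```shell
# Hypothetical helper: turn a program-specific option into its prefixed form.
# Usage: prefix_option <import|buildcol> <-option>
prefix_option() {
    program=$1
    option=${2#-}               # strip the leading dash from the option name
    echo "-${program}:${option}"
}

# -verbosity is shared, so it is written as is and reaches both programs.
# OIDtype is meant for import.pl only, so it gets the import: prefix.
echo "full-rebuild.pl -site localsite -verbosity 5 $(prefix_option import -OIDtype) hash_on_full_filename demo"
```

The printed line is ''full-rebuild.pl -site localsite -verbosity 5 -import:OIDtype hash_on_full_filename demo''.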
  
-<!-- id:57 -->If your Greenstone server is already running,  +===== Creating and Editing a Collection on the command line =====
-you should be able to access the newly built collection  +
-from your Greenstone homepage. You will have to reload the page  +
-if you already had it open in your browser, or perhaps even close  +
-the browser and restart it (to prevent caching problems).+
  
 +==== Create a collection ====
  
-<!-- id:59 -->In summary then, the commands typed to produce the //dlpeople// collection are:+To create the ''skeleton'' of a collection we use mkcol.pl. This creates all the folders the collection needs, and sets up a default configuration.  Typing ''perl -S mkcol.pl'' will provide  
 +the full list of options, which you can also view [[script_options#mkcol.pl|here]]. 
  
-<code> +To create a new collection:
-cd C:\Users\jsmith\Greenstone3 # assuming default location +
-gs3-setup +
-perl —S mkcol.pl -site localsite —creator [email protected] dlpeople +
-cd %GSDL3HOME%\sites\localsite\collect\dlpeople +
-xcopy /s C:\Users\jsmith\dldocuments\* import +
-perl —S import.pl -site localsite dlpeople +
-perl —S buildcol.pl -site localsite dlpeople +
-rd /s index           # on Windows NT/2000 +
-deltree /Y index      # on Windows 95/98 +
-ren building index +
-</code>+
  
-<!-- LINUX ###################################################################################-->+^Greenstone version^mkcol command^ 
 +|2|perl -S mkcol.pl [options] <collname>|
 +|3|perl -S mkcol.pl -site localsite [options] <collname>|
  
-=====MacOSX/Linux =====+For example, to create a collection named //dlpeople// in ''localsite'' 
 +with the creator's email address of //me@cs.waikato.ac.nz//, type
  
-Running Greenstone from the command line on MacOSX and Linux is very similar to doing it +^Greenstone version^mkcol command^
-on a Windows. Some of the commands are just a bit different. First change into the directory where Greenstone has been installed.  +|2|perl -S mkcol.pl -creator me@cs.waikato.ac.nz dlpeople|
-For example, if Greenstone is installed under its default name  +|3|perl -S mkcol.pl -site localsite -creator me@cs.waikato.ac.nz dlpeople|
-at the top level of your user account you can move there by typing+
  
-<code> 
-cd /home/jsmith/Greenstone3 
-</code> 
-\\ 
-To set up the Greenstone environment: 
-<code> 
-source ./gs3-setup.bash  
-</code> 
  
-\\ +//(Since Greenstone3 allows you to have multiple [[en:user:sites]], you must always specify in which site the collection is in. The default site is called ''localsite''.)//
-To create a collection: +
-<code> +
-mkcol.pl -site localsite —creator [email protected] dlpeople +
-</code> +
-\\ +
-To move to the newly created  +
-collection directory: +
-<code> +
-cd $GSDL3HOME/sites/localsite/collect/dlpeople +
-</code> +
-\\ +
-You can list the contents of this directory by typing ''ls''. In the collection's //etc// directory there is a file called //[[en:user:configuration files#collection configuration files|collect.cfg]]// +
-You can open and edit this using your favorite text editor — emacs is a popular editor on Linux+
  
 +To view the newly created files, move to the newly created 
 +collection directory by typing
  
-To copy the contents of the ''/home/documents/dldocuments''  +^Greenstone version^Windows^Linux/Mac^ 
-directory into the ''GSDL3HOME/sites/localsite/collect/dlpeople/import'' directory. To do this, type the command +|2|cd %GSDLHOME%\collect\dlpeople|cd $GSDLHOME/collect/dlpeople|
-<code> +|3|cd %GSDL3HOME%\sites\localsite\collect\dlpeople|cd $GSDL3HOME/sites/localsite/collect/dlpeople|
-cp —r /home/documents/dldocuments/*   import/ +
-</code> +
-\\+
  
  
-To “import” the collection: 
-<code> 
-import.pl -site localsite dlpeople 
-</code> 
-\\ 
- 
-Next, “build” the collection:  
-<code> 
-buildcol.pl -site localsite dlpeople 
-</code> 
-\\ 
- 
-Finally, make the collection “live” by putting  
-all the material that has just been put in the collection's 
-//building// directory into the //index// directory. First,  
-remove the old index: 
-<code> 
-rm —r index/* 
-</code> 
-//(assuming you are in the ''dlpeople'' directory)// 
- 
-\\ 
-And move the building directory to index: 
-<code> 
-mv building/* index/ 
-</code> 
- 
-\\ 
-In summary then, the commands typed to produced the //dlpeople// collection are: 
-<code> 
-cd /home/jsmith/Greenstone3 # assuming default Greenstone in user directory 
-source ./gs3-setup.bash  
-mkcol.pl —creator [email protected] dlpeople 
-cd $GSDL3HOME/collect/dlpeople 
-cp —r /home/documents/dldocuments/  import/ 
-import.pl -site localsite dlpeople 
-buildcol.pl -site localsite dlpeople 
-rm -r index/* 
-mv building/* index 
-</code> 
- 
- 
- 
-===== Additional Resources ===== 
- 
-While this page only goes through the basics of building collections, there 
-are many other scripts that can be run from the command line  
-(like [[en:user_advanced:command_line_download|downloading]] documents). You can take a look at 
- the [[script_options|scripts and their options]] to get an idea of what else is available. 
-</TAB> 
-<!-- ############################################################################################# 
-############################################################################################# 
-#############################################################################################--> 
-<TAB> 
- 
-===== Windows ===== 
-==== Open a terminal ==== 
-On Windows, there are several different ways to open a DOS terminal (a black console screen known as the DOS Prompt). Do one of the following: 
-  * ''Start -> All Programs -> Accessories -> Command Prompt'' 
-  * Under the Start menu, type ''cmd'' into the search box and press Enter 
-  * Hold down your keyboard's Windows key and press the key for letter r. (The Windows key is located between the Ctrl and Alt keys on your keyboard.) In the Run dialog that appears, type ''cmd'' in the textfield and press the OK button.  
-  * In any Windows Explorer, hold down Shift and right click in an empty area in the window. Select ''Open command window here'' from the menu. 
- 
- 
-==== Setup the Environment ==== 
- 
-In order to build collections in Greenstone (or run any other Greenstone scripts from the  
-command line), you must first setup the terminal's environment for Greenstone. To do this, first 
-change into the directory where Greenstone has been installed.  
-Assuming Greenstone was installed in its default location (and your username is "jsmith"), you can move there by typing: 
- 
-<code> 
-cd C:\Users\jsmith\Greenstone 
-</code> 
-//**Note** if the path to your Greenstone installation includes spaces (e.g. Program Files), you **must** 
-put quotations around the path. For example: ''cd "C:\Program Files\Greenstone"'' and ''cd "%GSDLHOME%\collect\dlpeople"''.// 
- 
-Next, at the prompt type: 
- 
-<code> 
-setup 
-</code> 
- 
-This batch file (which you can read if you like) tells the system where to look for Greenstone programs. 
- 
-//Note: On Windows 95/98 systems running ''setup.bat'' may fail with an **Out of environment space** error. If this happens, you should edit your system's ''config.sys'' file (normally found at ''C:\config.sys'') and add the line ''shell=C:\command.com /e:4096 /p'' (where ''C:'' is your system drive letter) to expand the size of the environment table. You'll need to reboot for this change to take effect, and then repeat the steps above for Greenstone.// 
- 
-If, later on in your interactive session at the DOS prompt,  
-you wish to return to the top level Greenstone directory you can accomplish this by typing  
-''cd %GSDLHOME%'' 
- 
-**//If you close your DOS window and start another one, you will need to invoke ''setup.bat'' again.//** 
- 
-<!-- id:33 -->Now you are in a position to make, build and rebuild collections.  
- 
-==== Create a collection ==== 
- 
-The first program we will look at is the Perl program ''mkcol.pl'',  
-whose name stands for “make a collection.” Typing ''perl —S mkcol.pl'' will provide  
-the full list of options, which you can also view [[script_options#mkcol.pl|here]].  
- 
-//(If your Windows environment is set up to associate the Perl application with  
-files ending in ''.pl'', you can leave off the ''perl -S'' for all of these scripts.)// 
- 
-To create a new collection: 
-<code> 
-perl -S mkcol.pl [options] collection-name 
-</code> 
- 
-<!-- id:34 -->For example, to create a collection named //dlpeople//  
-with the creator's email address of //[email protected]//, type 
- 
-<code> 
-perl —S mkcol.pl —creator [email protected] dlpeople 
-</code> 
-\\ 
-//Please substitute your email address for mine!// 
- 
-<!-- id:36 -->To view the newly created files, move to the newly created  
-collection directory by typing 
- 
-<code> 
-cd %GSDLHOME%\collect\dlpeople 
-</code> 
  
 +You can list the contents of this directory by typing ''dir'' (Windows) or ''ls'' (Linux/Mac).
 + There should be several subdirectories, which differ slightly between Greenstone 2 & 3:
  
-<!-- id:38 -->You can list the contents of this directory by typing //dir//. 
- There should be six subdirectories:  
   * //etc//
   * //images//
   * //import//
-  * //macros//+  * //macros// (Greenstone2 only)
   * //script//
   * //style//
  
-==== Add documents ====+==== Add documents and metadata ====
  
-<!-- id:39 -->Now we must populate the collection with sample documents. To do this, +To add documents into the collection, simply copy them into the import folder. You can manually add metadata by creating metadata.xml files or adding metadata databases. See [[en:user:metadata]] and [[en:user_advanced:metadata#specifying_filenames_manually_in_metadataxml]] for more details.
-we copy documents into the collections ''import'' folder. Assuming your documents are in the folder +
-''C:\Users\jsmith\dldocuments'', you can either: +
- +
-<!-- id:40 -->select the contents of the ''dldocuments'' directory +
-and drag them into the ''dlpeople'' collection's ''import'' directory. +
- +
-<!-- id:41 -->Or, you can type the command +
- +
-<code> +
-xcopy /s C:\Users\jsmith\dldocuments\* import +
-</code>+
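
Copying documents into the import folder can be sketched for Linux/Mac as follows. ''add_documents'' is a hypothetical helper name, and both paths in the usage comment are placeholders. Subfolders are copied as they are, which suits incremental building, where documents are best kept in subfolders.

```shell
# Hypothetical helper: copy a folder of documents into a collection's import folder.
# Usage: add_documents <source-folder> <path-to-collection-folder>
add_documents() {
    src=$1
    colldir=$2
    mkdir -p "$colldir/import"    # mkcol.pl normally creates this folder already
    # "src/." copies the folder's contents, subfolders included,
    # rather than the source folder itself.
    cp -r "$src/." "$colldir/import/"
}

# Example (paths are placeholders):
#   add_documents /home/jsmith/dldocuments "$GSDL3HOME/sites/localsite/collect/dlpeople"
```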
  
 ==== Edit the Config file ====
  
-<!-- id:42 -->In the collection's ''etc'' directory there is a file called ''collect.cfg''.  +In the collection's ''etc'' directory there is a configuration file. This is ''collect.cfg'' for Greenstone 2, ''collectionConfig.xml'' for Greenstone 3.
-Open it using your favorite text editor, e.g. Notepad or Wordpad. Any modifications that you  +Any modifications that you can make in the GLI can also be achieved by manually editing this file. Simply open it using your favorite text editor, 
-can make in the GLI, can also be achieved by manually editing this +e.g. Notepad or Wordpad, make changes and save it. You can learn more about the Collection configuration file [[en:user:configuration files#Collection configuration files|here]]. 
-collection configuration file. Simply open it using your favorite text editor,  +
-e.g. Notepad or Wordpad, make changes and save it. +
-You can learn more about the Collection configuration file [[configuration files#Collection configuration files|here]].+
  
-==== Build the collection ====+==== Build the Collection ====
  
-Building a collection consists of two main stages, importing and building. Importing is the process of bringing the documents into the Greenstone system, +Now you can build the collection using the rebuild commands, or using import/buildcol, as described in the earlier sections.
-standardizing the document format, the way that metadata is specified,  +
-and the file structure in which the documents are stored.  +
-The building stage generates the indexes, databases and other auxiliary files that are needed to make the collection work in Greenstone.+
  
-These processes can be run separately, or, in later Greenstone versions, a single script can be run which invokes both processes (see [[#build_the_collection_in_one_easy_step | below]]).+===== Additional information =====
  
-=== Importing === +==== Opening a terminal on Windows ====
-<!-- id:43 --> +
-Type //perl —S import.pl// at the prompt to get a list of all the options for the import program,  +
-or view them [[script_options#import.pl|here]].+
  
-<code> +On Windows, there are several different ways to open a DOS terminal (a black console screen known as the DOS Prompt). Do one of the following:
-perl —S import.pl dlpeople +  * ''Start -> All Programs -> Accessories -> Command Prompt''
-</code> +  * Under the Start menu, type ''cmd'' into the search box and press Enter
- +  * Hold down your keyboard's Windows key and press the key for letter r. (The Windows key is located between the Ctrl and Alt keys on your keyboard.) In the Run dialog that appears, type ''cmd'' in the textfield and press the OK button. 
-<!-- id:44 -->Don't worry about all the text that scrolls past—it's just reporting  +  * In any Windows Explorer, hold down Shift and right click in an empty area in the window. Select ''Open command window here'' from the menu.
-the progress of the import. Note that you do not have to be in either the  +
-//collect// or //dlpeople// directories when this command is entered; +
- because ''%GSDLHOME%'' is already set, the Greenstone software can work out where  +
-the necessary files are. +
- +
-=== Building === +
- +
-<!-- id:49 --> +
-Type ''perl —S buildcol.pl'' at the command prompt for a list of +
- collection-building options, which are also listed [[script_options#buildcol.pl|here]].  +
-For now, stick to the defaults by typing: +
- +
-<code> +
-perl —S buildcol.pl dlpeople +
-</code> +
- +
-<!-- id:50 -->Again, don't worry about the “progress report” text that scrolls past. +
- +
-=== Make the collection live === +
- +
-Finally, we need to make the collection "live" by replacing the collection's old ''index'' folder +
-with the contents of the ''building'' folder. We can do this in two ways: +
- +
-In an explorer window (i.e. outside of the terminal) simply select  +
-the contents of the //dlpeople// collection's ''building'' +
-directory and drag them into the ''index'' directory. +
- +
-<!-- id:53 -->Alternatively, you can remove the ''index'' directory  +
-(and all its contents) by typing the command +
-<code> +
-rd /s index            # on Windows NT/2000 +
-deltree /Y index       # on Windows 95/98 +
-</code> +
- +
-<!-- id:54 -->and then change the name of the ''building'' directory to ''index'' with +
- +
-<code> +
-ren building index +
-</code> +
- +
-It is important that these commands are issued from the correct directory  +
-(unlike the Greenstone commands ''mkcol.pl'', ''import.pl'' and ''buildcol.pl''). +
- If the current working directory is not //dlpeople//, type  +
-''cd %GSDLHOME%\collect\dlpeople'' before going through the  +
-''rd'' and ''ren'' sequence above. +
- +
-<!-- id:57 -->If your Greenstone server is already running, you should be able to access the newly built collection  +
-from your Greenstone homepage. You will have to reload the page  +
-if you already had it open in your browser, or perhaps even close  +
-the browser and restart it (to prevent caching problems). Alternatively, +
- if you are using the “local library” version of Greenstone you  +
-will have to restart the library program.  +
- +
-==== Build the collection in one easy step ==== +
- +
-An alternative to running import, then build, then deleting the old index and renaming building to index, is to run a single command, full-rebuild.pl. +
- +
-<code> +
-perl -S full-rebuild.pl dlpeople +
-</code> +
- +
-This will run import.pl, buildcol.pl and then remove the old indexes and copy the new ones into the index folder. +
- +
-Import or buildcol options can be passed to full-rebuild. If the option is shared between import.pl and buildcol.pl then it can appear as is, such as -verbosity 5.  This value will be passed to both programs. If an option is specific to one of the programs in particular, then prefix it with 'import:' or 'buildcol:' respectively, as in '-import:OIDtype hash_on_full_filename' +
-        +
-Remember, you can run 'perl -S import.pl' or 'perl -S buildcol.pl' from the command line with no arguments to see the specific options they take. +
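As a rough sketch of the option-passing rule described above, the shell function below only assembles and prints a full-rebuild.pl command line; it does not run Greenstone. The function name ''full_rebuild_cmd'' is made up for this illustration, and the two options used are the ones mentioned in the text. A buildcol-only option would take the 'buildcol:' prefix in the same way.

```shell
#!/bin/sh
# Hypothetical helper (not part of Greenstone): print, but do not run,
# a full-rebuild.pl command line.
# usage: full_rebuild_cmd COLLECTION [OPTION ...]
full_rebuild_cmd() {
    collection=$1
    shift
    # Options come first and the collection name last, as in the examples above.
    printf 'perl -S full-rebuild.pl'
    for opt in "$@"; do
        printf ' %s' "$opt"
    done
    printf ' %s\n' "$collection"
}

# -verbosity is shared by import.pl and buildcol.pl, so it is passed as is.
# OIDtype is meant for import.pl only, so it carries the import: prefix.
full_rebuild_cmd dlpeople -verbosity 5 -import:OIDtype hash_on_full_filename
```

Once the printed line looks right, it can be copied to the prompt and run after the Greenstone environment has been set up.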
- +
-==== Summary ==== +
-<!-- id:59 -->In summary then, the commands typed to produce the //dlpeople// collection are: +
- +
-To set up the collection: +
-<code> +
-cd C:\Users\jsmith\Greenstone # assuming default location +
-setup.bat +
-perl -S mkcol.pl -creator [email protected] dlpeople +
-cd %GSDLHOME%\collect\dlpeople +
-xcopy   /  d:\collect\dlpeople\*   import # assuming D drive +
-</code> +
- +
-To build the collection: +
-<code> +
-perl -S full-rebuild.pl dlpeople +
-</code> +
-or +
-<code> +
-perl -S import.pl dlpeople +
-perl -S buildcol.pl dlpeople +
-rd /s index           # on Windows NT/2000 +
-deltree /Y index      # on Windows 95/98 +
-ren building index +
-</code> +
- +
-=====MacOSX/Linux ===== +
- +
-Running Greenstone from the command line on MacOSX and Linux is very similar to doing it +
-on Windows. Some of the commands are just a bit different. Please read through the Windows section for more information about the steps mentioned here. +
- +
-First change into the directory where Greenstone has been installed.  +
-For example, if Greenstone is installed under its default name  +
-at the top level of your user account you can move there by typing +
- +
-<code> +
-cd /home/jsmith/Greenstone +
-</code> +
-\\ +
-To set up the Greenstone environment: +
-<code> +
-source ./setup.bash  +
-</code> +
-//If you are unsure of the shell type you are using, enter ''echo $0'' at your  +
-command-line prompt; it will print the name of your shell.  +
-If you are using a shell other than bash, contact your system administrator for advice.// +
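Because every later command relies on the environment that the setup script establishes, it can help to check for it before going further. The following is a minimal sketch, assuming only that the setup script sets ''GSDLHOME'' (as the rest of this page implies); the function name ''require_gsdl_env'' is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of Greenstone): succeed only if the
# Greenstone environment appears to have been set up in this shell.
require_gsdl_env() {
    if [ -z "${GSDLHOME:-}" ]; then
        echo "GSDLHOME is not set; run 'source ./setup.bash' in the Greenstone directory first" >&2
        return 1
    fi
    echo "Using Greenstone at $GSDLHOME"
}
```

Calling ''require_gsdl_env'' at the top of a build script stops it early, with a hint, when the terminal has not been set up yet.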
- +
-\\ +
-To create a collection: +
-<code> +
-mkcol.pl -creator [email protected] dlpeople +
-</code> +
-\\ +
-To move to the newly created  +
-collection directory: +
-<code> +
-cd $GSDLHOME/collect/dlpeople +
-</code> +
-\\ +
-You can list the contents of this directory by typing ''ls''. In the collection's //etc// directory there is a file called //[[configuration files#collection configuration files|collect.cfg]]//.  +
-You can open and edit this using your favorite text editor; emacs is a popular editor on Linux.  +
- +
- +
-Next, copy the contents of the ''/home/documents/dldocuments''  +
-directory into the ''$GSDLHOME/collect/dlpeople/import'' directory. To do this, type the command +
-<code> +
-cp -r /home/documents/dldocuments/  import/ +
-</code> +
-\\ +
- +
-To build the collection in one step: +
-<code> +
-full-rebuild.pl dlpeople +
-</code> +
- +
-Or, to build it step by step manually: +
- +
-To “import” the collection: +
-<code> +
-import.pl dlpeople +
-</code> +
-\\ +
- +
-Next, “build” the collection:  +
-<code> +
-buildcol.pl dlpeople +
-</code> +
-\\ +
- +
-Finally, make the collection “live” by putting  +
-all the material that has just been put in the collection's  +
-//building// directory into the //index// directory. First,  +
-remove the old index: +
-<code> +
-rm -r index/* +
-</code> +
-//(assuming you are in the ''dlpeople'' directory)// +
- +
-\\ +
-And move the building directory to index: +
-<code> +
-mv building/* index/ +
-</code> +
- +
-\\ +
-In summary then, the commands typed to produce the //dlpeople// collection are: +
-<code> +
-cd /home/jsmith/Greenstone # assuming default Greenstone in user directory +
-source ./setup.bash  +
-mkcol.pl -creator [email protected] dlpeople +
-cd $GSDLHOME/collect/dlpeople +
-cp -r /home/documents/dldocuments/  import/ +
-</code> +
-To build the collection: +
-<code> +
-full-rebuild.pl dlpeople +
-</code> +
-or +
-<code> +
-import.pl dlpeople +
-buildcol.pl dlpeople +
-rm -r index/* +
-mv building/* index +
-</code> +
- +
- +
-===== Incremental Building ===== +
- +
-Incremental building is where you only process the new or changed documents each time you build, thereby speeding up the build process. +
- +
-**Incremental importing**: New documents will be imported. Modified documents will be re-imported. Deleted documents will be removed from the collection. If metadata has changed, then documents will be reimported. +
- +
-Important note for collection design: Greenstone can notice that metadata in a folder has been added/changed, but it is not smart enough to tell which documents in the folder the changed metadata +
-belongs to. Therefore, if metadata in a folder has changed (including new metadata being added), then all documents in that folder will be reimported. This means that if you have all your documents in the top level import folder, adding new metadata or changing any metadata for any document will result in all documents being reimported. If you intend to do incremental import, then please organize your documents into subfolders. That way modifying metadata for some documents won't result in all other documents being reimported. +
- +
-**Incremental indexing**: Currently only the Lucene indexer (and Solr indexer included with Greenstone 3) can do incremental indexing. If you are using MG/MGPP then a full buildcol pass will be done, even if incremental-buildcol.pl is used. +
- +
-If collection design has changed, then you will need to do a full rebuild. Changes to plugin options, and some import options will necessitate a full import. Changes to search indexes, partition indexes, browsing classifiers will necessitate a full buildcol.  +
- +
-If you are doing incremental building, a full rebuild every now and then can be a good idea, in case something hasn't gone quite right in the incremental process. If you notice anything weird after an incremental build, then a full rebuild is a good idea then too. +
- +
-On the command line, you can run building/importing incrementally by using the scripts incremental-rebuild.pl, incremental-import.pl and incremental-buildcol.pl instead of full-rebuild.pl, import.pl and buildcol.pl, respectively. +
- +
-Note that running incremental-buildcol.pl when you are not using Lucene for your indexer will be the same as running buildcol.pl. Without any -builddir option, incremental-buildcol.pl will do the indexing into the existing index directory, so you don't need to rename building to index.+
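The script correspondence and the indexer caveat above can be summarised in a small sketch. Both function names are hypothetical, and the indexer names are passed in by hand as plain lowercase strings; nothing here reads a collection's configuration or runs Greenstone.

```shell
#!/bin/sh
# Hypothetical helper: map a full build script to its incremental counterpart.
incremental_name() {
    case $1 in
        full-rebuild.pl) echo incremental-rebuild.pl ;;
        import.pl)       echo incremental-import.pl ;;
        buildcol.pl)     echo incremental-buildcol.pl ;;
        *) echo "no incremental counterpart known for $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: succeed if the named indexer can index incrementally.
# Per the text above that is Lucene (and Solr in Greenstone 3); with MG/MGPP
# incremental-buildcol.pl falls back to a full buildcol pass.
indexes_incrementally() {
    case $1 in
        lucene|solr) return 0 ;;
        *)           return 1 ;;
    esac
}
```

For example, ''incremental_name import.pl'' prints ''incremental-import.pl'', and ''indexes_incrementally mgpp'' fails, which is a reminder that only the import stage is incremental for that indexer.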
  
===== Additional Resources =====
Line 642: Line 204:
(like [[en:user_advanced:command_line_download|downloading]] documents). You can take a look at
the [[script_options|scripts and their options]] to get an idea of what else is available.
-</TAB> 
-</TABAREA> 
- 
  
en/user_advanced/command_line_building.1466744773.txt.gz · Last modified: 2016/06/24 05:06 by anupama