Security in Greenstone Collections
Greenstone comes with a system for registering and administering users. Users can register with a login and password, and administrators can then assign them to groups. For user management information, see Greenstone 3 User Management or Greenstone 2 User Management.
Once groups have been set up, access to collections or sets of documents in a collection can be restricted to certain groups.
Collections can be protected at the collection level or at the document level. The simplest mechanism is to 'hide' the collection by not linking to it from the home page. Users who know of the collection can still type in its URL, and from there they have full access. A second collection-level protection is to make the whole collection accessible only to users in certain groups. Document-level protection sets a default for all documents (freely accessible or restricted) and then defines exceptions to it: restricted documents when the default is public, or freely accessible documents when the default is private.
Important Note
- Most of these protection mechanisms are not available in GLI, but require you to edit the collection's configuration file directly. Please make sure that GLI does not have your collection open before modifying the configuration file, otherwise GLI will overwrite your changes when it saves the file.
- In Greenstone 3 from v3.08 onward, you can use GLI's Edit > Edit collectionConfig.xml to make these edits without having to close GLI. From 3.10, this will also be functional in client-GLI.
Greenstone3
A collection's configuration file is called 'collectionConfig.xml' and is located in the collection's etc folder, at <greenstone home folder>/web/sites/localsite/collect/<collname>/etc/collectionConfig.xml.
Greenstone2
A collection's configuration file is called 'collect.cfg' and is located in the collection's etc folder, at <greenstone home folder>/collect/<collname>/etc/collect.cfg.
Hiding a collection
Collections can be hidden from general users by not including a link to them on the home page. We call this making a collection 'private' instead of 'public'. This can be done via GLI. Open the collection in GLI, and go to the 'General' page of the 'Format' panel. Deselecting "This collection should be publically accessible" will hide it from the home page of the library.
Greenstone3
To make this change in the configuration file directly, open up the collection's collectionConfig.xml file (see here), and set the 'public' metadata element value to 'false'.
<metadata name="public">false</metadata>
Greenstone2
To make this change in the configuration file directly, open up the collection's collect.cfg file (see here), and set the 'public' field to 'false'.
public false
Collection Level Protection
Collection level protection involves setting the collection to be private, and then defining which groups of users are allowed access. The collection will appear on the home page of the library, but when a user clicks on the link to the collection, they will be prompted for a username and password. To gain access, the user must have previously registered with a username and password, and an administrator must have assigned this user to one or more of the groups that are allowed to access this collection. The user can then log in with their username and password and will be allowed into the collection. Once in, they have free access to any of the documents in it.
Collection level protection cannot be done via GLI; it must be done by editing the collection's configuration file.
Greenstone3
Open up collectionConfig.xml (see here). Add a <security> element as a child element of <CollectionConfig>. It doesn't matter where in the file it goes, as long as it is a child of <CollectionConfig> and not inside any other element. The following is an example security block.
<security scope="collection" default_access="private">
  <exception>
    <group name="dl"/>
  </exception>
</security>
This restricts access to all users except those who are part of the dl group. There can be more than one group element in an exception element, and more than one exception in a security element.
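When several groups need access, it can help to generate the block rather than type it by hand. The following is a minimal Python sketch that builds a collection-level security block for a list of group names. The function name `collection_security_block` and the group name `staff` are made up for illustration; the output is meant to be pasted into collectionConfig.xml by hand, and the script does not touch the configuration file itself.

```python
import xml.etree.ElementTree as ET


def collection_security_block(groups):
    """Build a collection-level <security> block allowing the given groups."""
    security = ET.Element("security", scope="collection", default_access="private")
    exception = ET.SubElement(security, "exception")
    for name in groups:
        # One <group> element per group that should be allowed in.
        ET.SubElement(exception, "group", name=name)
    ET.indent(security)  # pretty-printing; requires Python 3.9 or later
    return ET.tostring(security, encoding="unicode")


block = collection_security_block(["dl", "staff"])
print(block)
```

The printed block has the same shape as the example above, with one group element per name.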
Greenstone2
Open up collect.cfg (see here). Add lines like the following:
authenticate collection
auth_groups dl
This restricts access to all users except those who are part of the dl group. You can have one or more groups in the auth_groups line, separated by spaces.
Document Level Protection
You can also restrict access to only certain documents in a collection. In this case, the collection as a whole is marked public/private, and then groups of documents are marked as exceptions to that rule. Again, group information is used to determine who has access to the private parts of the collection. Private documents will appear in the collection in search results and browsing lists, but access to the content will be restricted.
Individual documents are specified using their OIDs. These are the ex.Identifier values they are given when the collection is built. Depending on which method of assigning OIDs is used, these may change between builds, so you must use a stable identifier when protecting documents. Assigned identifiers are the best choice, because you then know what the identifier will be before the collection is built. See this page for more information about Greenstone identifiers and methods of assigning them.
Remember, you can't have the collection open in GLI while you are editing the configuration file. If you need to use GLI to find out document OIDs, then make a note of the OIDs while you have GLI open, then close the collection before writing the OIDs into the configuration file.
Greenstone3
Open up collectionConfig.xml (see here). Add a <security> block as a child of the <CollectionConfig> element. The security element will look like
<security scope="document" default_access="public|private">
If default_access is "public" then documents will be freely accessible unless covered by the exception rules, and if default_access is set to "private", then documents will be restricted by default.
Exceptions provide alternative rules for sets of documents. An exception looks like:
<exception>
  <documentSet name="X"/>
  <group name="dl"/>
</exception>
There can be more than one documentSet and more than one group per exception. Document sets define which documents match this exception, and groups define which groups of users can access these documents.
Document sets are defined using a <documentSet> element.
<documentSet name="X">
  <match>HASH1234</match>
  <match field="oid|meta elem" type="match|regex">match string</match>
</documentSet>
The name attribute names the set, and the exception uses it to refer to this set. Match elements give the rules for which documents belong to the set. A match element with no attributes is equivalent to <match field="oid" type="match"> and will match a document with the specified OID. The field attribute is either "oid" or a metadata element name (e.g. dc.Subject, Title). The type attribute is either "match" for an exact match, or "regex" for a regular expression match.
An exception with groups but no documentSets lists the groups that all documents are restricted to when default_access is private. To allow public access to a few documents, add a documentSet for those documents and an exception that uses that set, with a group whose name is empty (name="").
Note: unlike in Greenstone 2, links to the original file, e.g. the PDF version of the document, are covered by the security rules.
Note: exceptions with documentSets make no sense if the security scope is collection. In this case, they will be ignored.
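To make the interaction between default_access, exceptions and document sets concrete, here is a small Python model of the rules as described above. It is only one reading of this page, not Greenstone's actual implementation: the names `matches` and `can_access`, the dictionary layout, and the sample documents are all invented for illustration. Two assumptions are made explicit in the comments: a regex must match the whole value, and a document covered by an exception is governed only by the groups of the exceptions that cover it.

```python
import re


def matches(rule, doc):
    """Check one match rule against a document (dict with 'oid' and 'meta')."""
    field = rule.get("field", "oid")   # no field attribute means the OID
    kind = rule.get("type", "match")   # no type attribute means exact match
    values = [doc["oid"]] if field == "oid" else doc["meta"].get(field, [])
    if kind == "regex":
        # Assumption: the pattern has to match the whole metadata value.
        return any(re.fullmatch(rule["value"], v) for v in values)
    return rule["value"] in values


def can_access(doc, user_groups, default_access, exceptions, document_sets):
    """Decide access for scope="document" under the reading described above."""
    covering = [
        exc for exc in exceptions
        if any(matches(rule, doc)
               for set_name in exc["sets"]
               for rule in document_sets[set_name])
    ]
    if covering:
        # Assumption: only the groups of the covering exceptions count here.
        allowed = {g for exc in covering for g in exc["groups"]}
        return "" in allowed or bool(allowed & set(user_groups))
    if default_access == "public":
        return True
    # Private default: exceptions without document sets name the allowed groups.
    allowed = {g for exc in exceptions if not exc["sets"] for g in exc["groups"]}
    return bool(allowed & set(user_groups))


document_sets = {
    "egypt": [{"field": "dc.Title", "type": "regex", "value": "Egypt.*"}],
}
exceptions = [
    {"sets": [], "groups": ["XX"]},        # everything is restricted to XX ...
    {"sets": ["egypt"], "groups": [""]},   # ... except the 'egypt' set, open to all
]
egypt_doc = {"oid": "HASH1234", "meta": {"dc.Title": ["Egypt and the Nile"]}}
other_doc = {"oid": "HASH4567", "meta": {"dc.Title": ["Farming in Peru"]}}

print(can_access(egypt_doc, [], "private", exceptions, document_sets))      # True
print(can_access(other_doc, [], "private", exceptions, document_sets))      # False
print(can_access(other_doc, ["XX"], "private", exceptions, document_sets))  # True
```

In this model an anonymous user can read the Egypt document but nothing else, while a member of group XX can read the remaining documents.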
Some examples:
- All documents are freely accessible, except for the ones published by BOSTID and two specified documents, which are available to users in group X. (Note, while I have used pretend HASH ids for the specified documents, it is better to use a stable identifier, see here.)
<security scope="document" default_access="public">
  <exception>
    <documentSet name="bostid"/>
    <documentSet name="hidden-docs"/>
    <group name="X"/>
  </exception>
  <documentSet name="bostid">
    <match type="match" field="dc.Publisher">BOSTID</match>
  </documentSet>
  <documentSet name="hidden-docs">
    <match>HASH1234</match>
    <match>HASH4567</match>
  </documentSet>
</security>
- All documents are restricted to users in 'staff' or 'admin' groups.
<security scope="document" default_access="private">
  <exception>
    <group name="staff"/>
    <group name="admin"/>
  </exception>
</security>
- All documents are restricted to the 'XX' group, except for those whose dc.Title starts with 'Egypt', which are publicly accessible.
<security scope="document" default_access="private">
  <exception>
    <group name="XX"/>
  </exception>
  <exception>
    <documentSet name="egypt"/>
    <group name=""/> <!-- an empty group name means that anyone can access these docs -->
  </exception>
  <documentSet name="egypt">
    <match field="dc.Title" type="regex">Egypt.*</match>
  </documentSet>
</security>
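Because these blocks are edited by hand, an unclosed tag or a misspelled set name is easy to introduce. The following Python sketch checks a security block before it is pasted into collectionConfig.xml: parsing fails on malformed XML, and the function reports any documentSet name used in an exception that has no matching definition. The function name `undefined_document_sets` and the set name `typo-set` are invented for this example, and the check is an external convenience, not something Greenstone provides.

```python
import xml.etree.ElementTree as ET


def undefined_document_sets(security_xml):
    """Return set names referenced in exceptions but never defined."""
    # ET.fromstring raises ET.ParseError if a tag is unclosed or mismatched.
    security = ET.fromstring(security_xml)
    # Definitions are direct children of <security>.
    defined = {ds.get("name") for ds in security.findall("documentSet")}
    # References sit inside <exception> elements.
    used = {ds.get("name") for ds in security.findall("exception/documentSet")}
    return sorted(used - defined)


example = """
<security scope="document" default_access="private">
  <exception>
    <documentSet name="egypt"/>
    <documentSet name="typo-set"/>
    <group name=""/>
  </exception>
  <documentSet name="egypt">
    <match field="dc.Title" type="regex">Egypt.*</match>
  </documentSet>
</security>
"""

print(undefined_document_sets(example))  # ['typo-set']
```

An empty list means every set referenced by an exception is defined in the block.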
Greenstone2
Open up collect.cfg (see here). To make most of the documents freely accessible, with a few documents restricted, add lines like the following:
authenticate document
auth_groups dl
private_documents oid1 oid2...
To make most (or all) of the documents restricted, with a few (or none) freely accessible, add lines like the following:
authenticate document
auth_groups dl
public_documents oid1 oid2...
More than one group can be specified, separated by spaces. The list of documents is also separated by spaces.
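For long lists of OIDs, the lines can be generated from a script. This is a minimal Python sketch under two assumptions: that each directive goes on its own line in collect.cfg, and that at least one OID is listed (the case with no OIDs is left out because this page does not spell out how it should be written). The function name `gs2_document_auth` and the sample group and OID values are hypothetical.

```python
def gs2_document_auth(groups, oids, default_public=True):
    """Return collect.cfg lines for document-level protection in Greenstone 2."""
    if not groups or not oids:
        raise ValueError("this sketch needs at least one group and one OID")
    # With a public default the listed OIDs are the restricted ones,
    # with a private default they are the freely accessible ones.
    keyword = "private_documents" if default_public else "public_documents"
    lines = [
        "authenticate document",
        "auth_groups " + " ".join(groups),
        keyword + " " + " ".join(oids),
    ]
    return "\n".join(lines)


print(gs2_document_auth(["dl", "staff"], ["HASH1234", "HASH4567"]))
```

The result is text to paste into collect.cfg while the collection is closed in GLI.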
Note: in Greenstone 2, document level restrictions only work with the Greenstone version of the document, i.e. the page you get to using [link][/link]. Links to the original version (using [srclink][/srclink]), e.g. to the PDF file, are not covered by the security system. If you want to protect documents, you must not use [srclink] in search results or browsing classifier format statements. You can add a link to the original file from the document page if you want authorised users to have access to it.
Collection vs Document level protection
If you have a simple case where all documents in a collection are restricted to users in group X, then you have a choice about whether to protect the collection as a whole, or just protect the documents.
Protecting the documents will mean that non-authorised users can visit the collection, browse the classifiers, and search for documents. However, they will not be able to view the documents themselves.
Protecting the collection will mean that non-authorised users cannot even visit the collection.
Greenstone3
The <security> blocks for the two options are as follows.
- Restricting the entire collection to group X:
<security scope="collection" default_access="private">
  <exception>
    <group name="X"/>
  </exception>
</security>
- Restricting just the documents to group X:
<security scope="document" default_access="private">
  <exception>
    <group name="X"/>
  </exception>
</security>
Greenstone2
The configuration lines for the two options are as follows.
- Restricting the entire collection to group X:
authenticate collection
auth_groups X
- Restricting just the documents to group X:
authenticate document
auth_groups X
Additional Resources
- User administration and collection authentication document from Greenstone Support for South Asia. This is a more descriptive explanation, but only covers Greenstone 2 and may be slightly out of date.