en:plugin:unknownconverterplugin
Next revision | Previous revision | ||
en:plugin:unknownconverterplugin [2020/08/07 07:41] – created anupama | en:plugin:unknownconverterplugin [2023/03/13 01:46] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | |||
+ | |||
+ | |||
====== The UnknownConverterPlugin ====== | ====== The UnknownConverterPlugin ====== | ||
Line 9: | Line 12: | ||
Apache Tika is Apache' | Apache Tika is Apache' | ||
- | All that's necessary is to drop an Apache-Tika jar file into your '' | + | The steps involve putting a JRE 8 into your Greenstone 3, dropping an Apache-Tika jar file into your '' |
+ | |||
+ | ==== Steps for users of Greenstone 3 versions after 3.10 ==== | ||
+ | Your Greenstone 3, whether running on Windows or Unix systems, is ready to process docx files out of the box. | ||
+ | |||
+ | Run GLI and drag and drop docx files into your collection. After building, full text search will be available for your docx files. | ||
+ | |||
+ | ==== Steps for 3.10 users ==== | ||
+ | 1. [[http:// | ||
+ | |||
+ | Linux users can now start up GLI, drag and drop docx files into a collection. After building, your collection will have full text search for your docx files. | ||
+ | |||
+ | 2. **Extra step for Windows users: | ||
+ | Use a text editor to edit ''< | ||
+ | Locate the line that says: | ||
+ | < | ||
+ | and change it to say: | ||
+ | < | ||
+ | Save the '' | ||
+ | |||
+ | Now you can run GLI and drag and drop docx files into your collections. After building, you'll have full text search for your docx files. | ||
+ | |||
+ | |||
+ | ==== Steps for 3.09 users ==== | ||
**The UnknownConverterPlugin has been officially available since Greenstone 3.09, so that 3.09 users can also start using Tika with the plugin, by** | **The UnknownConverterPlugin has been officially available since Greenstone 3.09, so that 3.09 users can also start using Tika with the plugin, by** | ||
+ | |||
+ | |||
+ | 0. following the quick steps [[# | ||
1. creating a subfolder called " | 1. creating a subfolder called " | ||
Line 17: | Line 46: | ||
2. downloading the Apache-Tika binary jar file from https:// | 2. downloading the Apache-Tika binary jar file from https:// | ||
- | 3. and then configuring an UnknownConverterPlugin instance for any collection that needs docx processing as follows: | + | 3. and then configuring an UnknownConverterPlugin instance for any collection that needs docx processing as follows. Note that **Windows users** need to type '' |
- | {{ : | + | {{ : |
**All 3 of the above steps are already set up for you in the GS3 binaries generated every night** and available from http:// | **All 3 of the above steps are already set up for you in the GS3 binaries generated every night** and available from http:// | ||
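Before configuring the plugin, it can help to confirm that the downloaded Tika jar can actually read a docx file on its own. The following is a minimal sketch, not the command the plugin itself is configured with: it assumes the jar is the Tika "app" jar, whose command line accepts `--text` to print plain text, and that a `java` executable is on the PATH. The function names and the example file names are hypothetical.

```python
import subprocess


def tika_text_command(tika_jar, document, java="java"):
    """Build the command line that asks the Tika app jar for plain text.

    `java` is assumed to be on the PATH; pass a full path otherwise.
    """
    return [java, "-jar", str(tika_jar), "--text", str(document)]


def extract_text(tika_jar, document, java="java"):
    """Run Tika on one document and return the text it printed.

    Raises subprocess.CalledProcessError if Tika exits with an error.
    """
    result = subprocess.run(
        tika_text_command(tika_jar, document, java),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `extract_text("tika-app.jar", "report.docx")` with the real locations of the jar and a sample document should return the document's text if Tika and Java are working together.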
Line 28: | Line 57: | ||
For every doctype to be processed by UnknownConverterPlugin, | For every doctype to be processed by UnknownConverterPlugin, | ||
+ | |||
+ | |||
+ | ===== Download JRE 8 and install locally into your GS3 ===== | ||
+ | GS3 comes bundled with JRE 7, but the bundled '' | ||
+ | |||
+ | **For Windows users:** | ||
+ | |||
+ | a. Use a File Explorer to do the following on the file system: | ||
+ | * Rename ''< | ||
+ | * Create folder ''< | ||
+ | |||
+ | b. Visit: https:// | ||
+ | |||
+ | c. Click the " | ||
+ | |||
+ | //It has to be the 32-bit version. Don't get the 64-bit version, because the MG/MGPP indexers and GDBM will then not work unless you manually recompile Greenstone 3.// | ||
+ | |||
+ | d. Then run the JRE Windows installer and at the start of the installer, **ensure you tick " | ||
+ | * Set the destination folder to ''< | ||
+ | * Run through the installer | ||
+ | |||
+ | The above steps will have put a compatible JRE8 into ''< | ||
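After installing, one way to check that the Java your Greenstone will use really is version 8 is to look at what `java -version` reports. The sketch below is an illustration only: the function names are hypothetical, and you would pass the path of the `java` executable inside your own installation rather than relying on whichever `java` is on the PATH.

```python
import re
import subprocess


def java_major_version(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, for example "1.8.0_301".
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major(java="java"):
    """Ask a java executable for its version and return the major number."""
    # `java -version` writes its report to stderr rather than stdout.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return java_major_version(result.stderr + result.stdout)
```

A result of 8 from `installed_java_major(...)` indicates a Java 8 runtime; 7 would mean the bundled JRE 7 is still the one being picked up.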
+ | |||
+ | |||
+ | **For Linux users:** | ||
+ | |||
+ | a. Rename ''< | ||
+ | |||
+ | b. Visit: https:// | ||
+ | |||
+ | c. Click the "Linux x64" link, which is the Java 8 update 301 for Linux x64. | ||
+ | |||
+ | d. Put the downloaded tar.gz into the ''< | ||
+ | Decompress. | ||
+ | |||
+ | You may now have ended up with a decompressed folder like '' | ||
+ | |||
+ | Move any '' | ||
+ | |||
+ | Then rename the '' | ||
+ | |||
+ | You want to end up with this file structure: ''< | ||
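The Linux steps above (set the old JRE aside, decompress the download, rename the decompressed folder) can be sketched as a small script. This is a hedged illustration rather than an official installer: the function name and all four parameters are hypothetical, the folder names must be taken from the instructions above, and the step about moving individual files is not covered.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def replace_jre(tarball, parent_dir, target_name, backup_name):
    """Unpack a JRE tar.gz so that it ends up at parent_dir/target_name.

    Any existing folder at that location is renamed to backup_name first.
    All names are supplied by the caller; none are assumed here.
    """
    parent = Path(parent_dir)
    target = parent / target_name
    backup = parent / backup_name

    # Keep the existing JRE by renaming it rather than deleting it.
    if target.exists():
        if backup.exists():
            raise FileExistsError(f"backup folder already exists: {backup}")
        target.rename(backup)

    # Unpack into a scratch folder beside the target, so the final
    # rename stays on the same file system.
    scratch = Path(tempfile.mkdtemp(prefix="jre-unpack-", dir=parent))
    try:
        with tarfile.open(tarball, "r:gz") as archive:
            archive.extractall(scratch)
        top_level = [p for p in scratch.iterdir() if p.is_dir()]
        if len(top_level) != 1:
            raise RuntimeError("expected exactly one top-level folder in the archive")
        # Give the versioned folder from the archive its final name.
        top_level[0].rename(target)
    finally:
        shutil.rmtree(scratch, ignore_errors=True)
    return target
```

Renaming the old folder instead of deleting it means the bundled JRE can be restored by hand if the new one does not work.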
en/plugin/unknownconverterplugin.1596786109.txt.gz · Last modified: 2020/08/07 07:41 by anupama