en:user_advanced:ice_cite
Differences
This shows you the differences between two versions of the page.
en:user_advanced:ice_cite [2019/03/13 05:57] – anupama | en:user_advanced:ice_cite [2019/04/24 09:29] – anupama | ||
---|---|---|---|
Line 42: | Line 42: | ||
- Select the **UnknownConverterPlugin** in the list of plugins and keep pressing the **<Move Up>** button to shift it upwards, until it appears in the plugin pipeline above the existing **PDFPlugin**, | - Select the **UnknownConverterPlugin** in the list of plugins and keep pressing the **<Move Up>** button to shift it upwards, until it appears in the plugin pipeline above the existing **PDFPlugin**, | ||
- Move to the **Create** pane and build the collection. Once more, when the Icecite conversion utility is called by Greenstone' | - Move to the **Create** pane and build the collection. Once more, when the Icecite conversion utility is called by Greenstone' | ||
+ | |||
+ | |||
+ | <!-- | ||
+ | USING THE ICECITE TOOL TO CONVERT FROM PDF TO TXT | ||
+ | |||
+ | 1. Java 8 is needed to compile Icecite, and probably also to run it. | ||
+ | < | ||
+ | export JAVA_HOME=/ | ||
+ | export PATH=$JAVA_HOME/ | ||
+ | </ | ||
+ | |||
+ | 2. Get and compile icecite, following the instructions at https:// | ||
+ | < | ||
+ | git clone https:// | ||
+ | cd icecite | ||
+ | git pull --recurse-submodules | ||
+ | cd pdf-parent/ | ||
+ | mvn install | ||
+ | </ | ||
+ | |||
+ | 3. Run icecite, general instructions at https:// | ||
+ | < | ||
+ | cd ../../ | ||
+ | cd icecite/ | ||
+ | java -jar target/ | ||
+ | </ | ||
+ | Examples: | ||
+ | greenstone@bedrock: | ||
+ | |||
+ | greenstone@bedrock: | ||
+ | |||
+ | greenstone@bedrock: | ||
+ | |||
+ | (Also tried with input file pdf01.pdf from the Reports collection) | ||
+ | |||
+ | |||
+ | 4. If you see the exception | ||
+ | --- | ||
+ | Exception in thread " | ||
+ | at org.apache.pdfbox.pdmodel.encryption.PDEncryption.< | ||
+ | at org.apache.pdfbox.pdfparser.PDFParser.prepareDecryption(PDFParser.java: | ||
+ | at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java: | ||
+ | at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java: | ||
+ | at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java: | ||
+ | at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java: | ||
+ | at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java: | ||
+ | at parser.pdfbox.core.PdfStreamEngine.processFile(PdfStreamEngine.java: | ||
+ | at parser.pdfbox.PdfBoxParser.parse(PdfBoxParser.java: | ||
+ | at cli.PdfParserCommandLine.parse(PdfParserCommandLine.java: | ||
+ | at cli.PdfParserCommandLine.processFile(PdfParserCommandLine.java: | ||
+ | at cli.PdfParserCommandLine.process(PdfParserCommandLine.java: | ||
+ | at cli.PdfParserCommandLine.main(PdfParserCommandLine.java: | ||
+ | Caused by: java.lang.ClassNotFoundException: | ||
+ | at java.net.URLClassLoader$1.run(URLClassLoader.java: | ||
+ | at java.net.URLClassLoader$1.run(URLClassLoader.java: | ||
+ | at java.security.AccessController.doPrivileged(Native Method) | ||
+ | at java.net.URLClassLoader.findClass(URLClassLoader.java: | ||
+ | at java.lang.ClassLoader.loadClass(ClassLoader.java: | ||
+ | at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java: | ||
+ | at java.lang.ClassLoader.loadClass(ClassLoader.java: | ||
+ | ... 13 more | ||
+ | |||
+ | --- | ||
+ | |||
+ | Then: | ||
+ | a. Obtain bouncycastle (encryption? | ||
+ | |||
+ | Download both jar files listed under the " | ||
+ | |||
+ | b. Then see https:// | ||
for how to run a Java programme when you have multiple jar files on the classpath: the java launcher ignores -cp when -jar is given, so the two options cannot be combined. | ||
+ | |||
+ | greenstone@bedrock: | ||
+ | --> |
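Step 4b in the notes above depends on the `java` launcher ignoring `-cp` whenever `-jar` is given. The usual workaround is to put every jar on the classpath and name the main class explicitly. The following is a minimal sketch of that idea, not the exact command used on this page. The jar paths are hypothetical placeholders, because the real file names are elided in the notes. The entry point `cli.PdfParserCommandLine` is taken from the stack trace in step 4 and is assumed to be the class the jar would otherwise launch.

```shell
#!/bin/sh
# Join any number of jar paths into one classpath string.
# ':' is the separator on Linux/macOS (Windows uses ';').
build_classpath() {
    joined=""
    for jar in "$@"; do
        if [ -z "$joined" ]; then
            joined="$jar"
        else
            joined="$joined:$jar"
        fi
    done
    printf '%s\n' "$joined"
}

# Hypothetical placeholder paths: substitute the Icecite jar built by
# "mvn install" and the two downloaded Bouncy Castle jars.
ICECITE_JAR="path/to/icecite.jar"
BC_JAR_1="path/to/bouncycastle-1.jar"
BC_JAR_2="path/to/bouncycastle-2.jar"

CP=$(build_classpath "$ICECITE_JAR" "$BC_JAR_1" "$BC_JAR_2")

# Print the command instead of running it, so the sketch works without Java.
# Main class assumed from the stack trace (cli.PdfParserCommandLine.main).
echo java -cp "$CP" cli.PdfParserCommandLine
```

Once the real paths are filled in, drop the `echo` and append the same arguments that were previously passed after `java -jar`.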
en/user_advanced/ice_cite.txt · Last modified: 2023/03/13 01:46 by 127.0.0.1