nzdl:prescript

Differences

This shows you the differences between two versions of the page.

nzdl:prescript [2017/09/22 01:38] – created kjdon
nzdl:prescript [2017/09/25 00:53] kjdon
Line 4: Line 4:
  
 **PostScript conversion to plain ASCII or HTML.**\\ **PostScript conversion to plain ASCII or HTML.**\\
-PreScript is really a PostScript to plain text converter, but rudimentary HTML can also be produced. Tags are inserted to mark paragraphs (<p>), short lines (<br>), page breaks (<hr>), and headers and footers (italicized with <i>...</i>). +PreScript is really a PostScript to plain text converter, but rudimentary HTML can also be produced. Tags are inserted to mark paragraphs (<p>), short lines (<br>), page breaks (<hr>), and headers and footers (italicized with <i>...</i>). \\
 **Paragraph boundaries detection.**\\ **Paragraph boundaries detection.**\\
-PreScript determines the line spacing of a document and uses this (and also indentations) to determine paragraph boundaries. +PreScript determines the line spacing of a document and uses this (and also indentations) to determine paragraph boundaries. \\
 **Hyphenation removal.**\\ **Hyphenation removal.**\\
-Hyphenated words are de-hyphenated. +Hyphenated words are de-hyphenated. \\
 **Ligature translation.**\\ **Ligature translation.**\\
-Most ligatures used by TeX documents are detected. PreScript doesn't track font changes, making it impossible to reliably detect all ligatures. +Most ligatures used by TeX documents are detected. PreScript doesn't track font changes, making it impossible to reliably detect all ligatures. \\
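
The hyphenation and ligature features described above can be illustrated with a minimal Python sketch. This is not PreScript's own code: the function names, the regular expression, and the ligature table (Unicode presentation-form ligatures rather than TeX font encodings) are all assumptions made for illustration.

```python
import re

# Hypothetical ligature table (an assumption, not PreScript's):
# Unicode presentation-form ligatures mapped to plain ASCII letters.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
}


def expand_ligatures(text):
    """Replace each ligature character with its plain-letter spelling."""
    for ligature, plain in LIGATURES.items():
        text = text.replace(ligature, plain)
    return text


def dehyphenate(text):
    """Rejoin words split by a hyphen at the end of a line.

    Naive on purpose: a genuine compound broken at its hyphen
    (for example "well-" / "known") is joined as well.
    """
    return re.sub(r"([a-z])-\n([a-z])", r"\1\2", text)
```

A real converter needs more context than this, for instance a word list to decide whether a line-final hyphen belongs to the word.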
  
  
-Installing PreScript +=====Installing PreScript===== 
-PreScript is written in PostScript and Python. You will need Ghostscript (at least version 4.01) and the Python interpreter (at least version 1.4). +PreScript is written in PostScript and Python. You will need [[http://pages.cs.wisc.edu/~ghost/|Ghostscript]] (at least version 4.01) and the [[https://www.python.org/|Python]] interpreter (at least version 1.4). 
-The PreScript 0.1 distribution+====The PreScript 0.1 distribution====
 This distribution is the most stable - it is what you should use to do real work. This distribution is the most stable - it is what you should use to do real work.
  
-    Download the PreScript 0.1 distribution. +  *  Download the PreScript 0.1 distribution. 
-    Define the environment variable PRESCRIPT_DIR to the directory where PreScript is installed (or wherever you put prescript.ps). +  *  Define the environment variable PRESCRIPT_DIR to the directory where PreScript is installed (or wherever you put prescript.ps). 
-    Move prescript.py to a directory listed in your PATH environment variable. You may want to remove the .py suffix (prescript.py can be either a standalone program, or an imported library of another Python program). +  *  Move prescript.py to a directory listed in your PATH environment variable. You may want to remove the .py suffix (prescript.py can be either a standalone program, or an imported library of another Python program). 
-    Change #! /usr/local/bin/python in prescript.py to the location of your Python interpreter. +  *  Change #! /usr/local/bin/python in prescript.py to the location of your Python interpreter. 
  
-The PreScript 2 distribution +====The PreScript 2 distribution==== 
-This is a beta release of our latest version. This version is a lot cleaner and faster; it is also extensible (users can write their own renderers), better documented, and contains better prediction of line, paragraph, and page breaks. If you notice any bugs, want to request new features, or want to become a beta tester please email the New Zealand Digital Library administrator.+This is a beta release of our latest version. This version is a lot cleaner and faster; it is also extensible (users can write their own renderers), better documented, and contains better prediction of line, paragraph, and page breaks. 
  
-    Download a PreScript 2 distribution (the later versions are more stable).+  * Download a PreScript 2 distribution (the later versions are more stable).
  
-        PreScript 2.0 +    * [[http://www.nzdl.org/download/prescript/prescript-2.0.tar.gz|PreScript 2.0]] 
-        PreScript 2.1 +    * [[http://www.nzdl.org/download/prescript/prescript-2.1.tar.gz|PreScript 2.1]] 
-        PreScript 2.2 -- same as PreScript 2.1, but compatibility issues with Python 1.5 have been fixed +    * [[http://www.nzdl.org/download/prescript/prescript-2.2.tar.gz|PreScript 2.2]] -- same as PreScript 2.1, but compatibility issues with Python 1.5 have been fixed 
  
-    On unix systems 'make install' will install prescript to /usr/local/bin. It will also install the accompanying manual page (to install somewhere else simply edit the Makefile). +  * On unix systems 'make install' will install prescript to /usr/local/bin. It will also install the accompanying manual page (to install somewhere else simply edit the Makefile). 
-    If not installing with the make utility: +  If not installing with the make utility: 
-    It is easiest if all of the program scripts are kept in the same directory, which ideally should be listed in the PATH environment variable. If this is inconvenient, be sure that PRESCRIPT_DIR points to where prescript.ps is installed, and that PYTHONPATH points to where *.py are installed. +  It is easiest if all of the program scripts are kept in the same directory, which ideally should be listed in the PATH environment variable. If this is inconvenient, be sure that PRESCRIPT_DIR points to where prescript.ps is installed, and that PYTHONPATH points to where *.py are installed. 
-    Change #!/usr/local/bin/python in prescript to the location of your Python interpreter ('make install' does NOT do this for you). +  Change #!/usr/local/bin/python in prescript to the location of your Python interpreter ('make install' does NOT do this for you). 
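
For a manual install, the environment described above (PRESCRIPT_DIR, PYTHONPATH, and PATH all pointing at the script directory) could be prepared from a wrapper script. The sketch below is only an illustration: the function name `prescript_env` and the directory `/opt/prescript` are hypothetical, and it assumes all program scripts live in one directory.

```python
import os


def prescript_env(script_dir, base_env=None):
    """Return a copy of the environment set up for a manual PreScript install.

    Assumes prescript.ps and the *.py files all live in script_dir.
    """
    env = dict(os.environ if base_env is None else base_env)
    # PRESCRIPT_DIR points to where prescript.ps is installed.
    env["PRESCRIPT_DIR"] = script_dir
    # Put script_dir first on PYTHONPATH (for *.py) and PATH (for prescript).
    for var in ("PYTHONPATH", "PATH"):
        existing = env.get(var)
        if existing:
            env[var] = script_dir + os.pathsep + existing
        else:
            env[var] = script_dir
    return env


# Hypothetical install location, for illustration only.
example_env = prescript_env("/opt/prescript", {"PATH": "/usr/bin"})
```

The returned mapping could then be passed as the `env` argument of `subprocess.run`, for example with the command list `["prescript", "html", "paper.ps"]`.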
  
 +=====Running PreScript=====
  
-Running PreScript +<code>prescript format input [output] </code>
-Usage:+
  
-    prescript format input [output] +  *  format is either plain or html. 
 +  *  input is the input filename, a PostScript file. 
 +  *  output is the output filename. By default, the output filename is the input filename with the path removed and the suffix replaced with either .txt or .html. 
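
The default output name rule described in the list above can be sketched in Python as follows. The function name `default_output_name` and the suffix table are assumptions made for illustration, not taken from the PreScript source.

```python
import os

# Output suffix for each format accepted on the command line.
SUFFIXES = {"plain": ".txt", "html": ".html"}


def default_output_name(input_path, fmt):
    """Derive the output filename when none is given.

    The directory part of the input path is dropped and the
    suffix is replaced by the one matching the chosen format.
    """
    if fmt not in SUFFIXES:
        raise ValueError("format must be 'plain' or 'html'")
    base = os.path.basename(input_path)
    stem, _old_suffix = os.path.splitext(base)
    return stem + SUFFIXES[fmt]
```

Under this sketch, `reports/paper.ps` converted with format `html` would be written to `paper.html` in the current directory.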
  
-    format is either plain or html. 
-    input is the input filename, a PostScript file. 
-    output is the output filename. By default, the output filename is the input filename with the path removed and the suffix replaced with either .txt or .html.  
  
  
-Bugs +=====Notes===== 
-Please report bugs to the New Zealand Digital Library administrator.+PreScript is a port of a Perl program used by the New Zealand Digital Library project to convert computer science technical reports to HTML. The Perl version is deemed unfit for a public release because the code is quite messy (a consequence of Perl's cumbersome syntax for defining objects). The Python version is considerably easier to understand, maintain, and extend. The technical paper [[http://www.nzdl.org/download/prescript/prescript.ps.gz|prescript.ps.gz]] documents the algorithms and heuristics used in PreScript 0.1 - there is an update to this for PreScript 2 inside its distribution archive.
  
  
-Notes +=====Other Postscript Converters=====
-PreScript is a port of a Perl program used by the New Zealand Digital Library project to convert computer science technical reports to HTML. The Perl version is deemed unfit for a public release because the code is quite messy (a consequence of Perl's cumbersome syntax for defining objects). The Python version is considerably easier to understand, maintain, and extend. The technical paper prescript.ps.gz documents the algorithms and heuristics used in PreScript 0.1 - there is an update to this for PreScript 2 inside its distribution archive. +
- +
- +
-Other Postscript Converters+
 Here is a summary of other PostScript to text converters we found. Here is a summary of other PostScript to text converters we found.
  
-pstotext +**[[http://www.research.digital.com/SRC/virtualpaper/pstotext.html|pstotext]]**\\ 
-    From the DEC Virtual Paper research project. PostScript program and C program. Probably the best PostScript to text converter (after PreScript, of course).  +From the DEC Virtual Paper research project. PostScript program and C program. Probably the best PostScript to text converter (after PreScript, of course). \\ 
-ps2html, The Sequel +**[[http://stasi.bradley.edu/ftp/pub/ps2html/ps2html-v2.html|ps2html, The Sequel]]**\\ 
-    Developed at Johns Hopkins University to convert JHU journal articles to HTML. This converter attempts to preserve the formatting of the original PostScript document, but is tied to PostScript files generated with a specific package (QuarkXPress?). A table describing a number of parameters is used to aid conversion and can be modified for new formats. Uses a variation of Ghostscript's ps2ascii.ps.  +Developed at Johns Hopkins University to convert JHU journal articles to HTML. This converter attempts to preserve the formatting of the original PostScript document, but is tied to PostScript files generated with a specific package (QuarkXPress?). A table describing a number of parameters is used to aid conversion and can be modified for new formats. Uses a variation of Ghostscript's ps2ascii.ps. \\ 
-ps2ascii.ps +**ps2ascii.ps**\\ 
-    Part of the Ghostscript distribution. ps2ascii.ps is considerably less robust than PreScript.  +Part of the Ghostscript distribution. ps2ascii.ps is considerably less robust than PreScript. \\ 
-ps2a.sh +**[[ftp://ftp.mpce.mq.edu.au/pub/comp/src/ps2a.sh|ps2a.sh]]**\\ 
-    A PostScript program similar to Ghostscript's ps2ascii.ps.  +A PostScript program similar to Ghostscript's ps2ascii.ps. \\ 
-ps2ascii.shar +**[[ftp://apocalypse.engr.ucf.edu/usr/ssd/ps2ascii.shar|ps2ascii.shar]]**\\ 
-    A PostScript program and Perl script.  +A PostScript program and Perl script. \\ 
-ps2ascii.pl +**[[ftp://wilma.cs.brown.edu/pub/postscript/ps2ascii.pl|ps2ascii.pl]]**\\ 
-    A Perl script that extracts parenthesized text from a PostScript file.  +A Perl script that extracts parenthesized text from a PostScript file. \\ 
-ps2txt +**[[ftp://ftp.funet.fi/pub/archive/alt.sources/volume92/Feb/920223.01.gz|ps2txt]]**\\ 
-    A standalone C program that extracts parenthesized text. Some special code to deal with dvips generated files. +A standalone C program that extracts parenthesized text. Some special code to deal with dvips generated files. \\
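
Several of the converters listed above work by extracting parenthesized text, that is, the string literals of a PostScript file. A minimal Python sketch of the idea follows. It is not the code of any of these tools, and the function name is hypothetical. It handles nested parentheses and backslash-escaped characters, but it does not decode escapes such as `\n` or octal codes, and it ignores comments and hex strings.

```python
def extract_parenthesized(ps_source):
    """Return the contents of every top-level (...) string in ps_source."""
    strings = []
    current = []
    depth = 0
    i = 0
    n = len(ps_source)
    while i < n:
        ch = ps_source[i]
        if ch == "\\" and depth > 0 and i + 1 < n:
            # Inside a string: keep the escaped character literally.
            current.append(ps_source[i + 1])
            i += 2
            continue
        if ch == "(":
            if depth > 0:
                current.append(ch)  # nested parenthesis is part of the string
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                strings.append("".join(current))
                current = []
            else:
                current.append(ch)
        elif depth > 0:
            current.append(ch)
        i += 1
    return strings
```

This recovers the characters but not the layout: word spacing, line breaks, and paragraph structure are encoded in the positioning operators between the strings, which is the part PreScript analyses.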
nzdl/prescript.txt · Last modified: 2023/03/13 01:46 by 127.0.0.1