PostScript is a page description language. It was introduced by Adobe Systems in the mid-eighties and has become the standard device-independent file format for printing graphics files. This means that PostScript describes a graphics image without reference to specific device features (e.g. printer resolution), so that the same description (PostScript file) can be used on any PostScript-compatible printer.
An Encapsulated PostScript File (EPSF or EPS) is a PostScript file structured so that it can be included in another PostScript file, so that, for example, a diagram created with a graphics application can be inserted into a text document created with a word processor.
PDF is another page description language introduced by Adobe to replace PostScript; however, it isn’t yet in as widespread use as PostScript. For instance, it’s quite hard to find a printer that has a PDF interpreter implemented in hardware, i.e. you cannot send a PDF file directly to the printer but must first convert it to PostScript using display software such as Adobe Acrobat.
The Ghostscript software suite is an interpreter for the PostScript language, with the ability to convert PostScript language files to many other formats, display them, and print them on printers that don’t have PostScript language capability built in. Ghostscript also functions as an interpreter for Portable Document Format (PDF) files, with much the same capabilities. Finally, the suite contains a C subroutine library (the Ghostscript library) that implements the graphics capabilities that appear as primitive operations in the PostScript language.
There are actually two different versions of Ghostscript: the Aladdin and GNU distributions. The main difference between them seems to be the licensing terms, with GNU Ghostscript distributed under the GPL and the Aladdin version distributed under the Aladdin Free Public Licence. The only difference in the licensing terms appears to be that the Aladdin licence does not allow commercial distribution. If you are using Linux you almost certainly have GNU Ghostscript installed because of this licensing issue.
Further information on Ghostscript can be found at http://www.cs.wisc.edu/~ghost/.
Ghostview is a full-function X Windows interface for the Ghostscript PostScript interpreter. Ghostview and Ghostscript function as two cooperating programs: Ghostview creates the viewing window and Ghostscript draws in it. The GUI is fairly self-explanatory; however, the application ships with an extensive manual page (type man ghostview at the UNIX prompt).
GV is a version of Ghostview that was modified for VMS, given some enhancements, and then modified to run again under Unix. It is now replacing Ghostview as the standard desktop tool for viewing PostScript files, and is in fact the default viewer in most Linux distributions (i.e. if you type ghostview at a Linux prompt you’ll probably start the GV program instead). An example of GV in action can be seen in Figure 45. Further information on GV and Ghostview can be found at http://www.cs.wisc.edu/~ghost/.
Adobe Acrobat Reader allows you to view and print PDF files. While the viewer is free, the tools needed to create PDF content are not. More information is available at http://www.adobe.com/products/acrobat/readermain.html. The Acrobat Reader is distributed as part of the Starlink baseset software, and can be started by typing acroread.
psmerge
psmerge is a utility program for merging one or more Encapsulated PostScript files into a single PostScript file. The input files can be individually rotated, scaled and shifted. The output file can either be Encapsulated PostScript or “normal” PostScript suitable for sending to a printer. The psmerge utility is covered in detail in SUN/164.
epsutil
epsutil is a utility for manipulating Encapsulated PostScript files. For more information see the manual at http://www.math.utah.edu/~beebe/software/epsutil/epsutil.html.
prescript and pstotext
prescript extracts text from a PostScript file, storing it either as plain ASCII text or as HTML, according to the mandatory first command-line argument. Usage is:
prescript [ html | plain ] [ input.ps ]
The output file will be given the same base name as the input file, with its file extension set to .html or .txt according to the first command-line argument.
prescript uses a PostScript interpreter, normally gs, to execute the PostScript program, so that even text that is generated programmatically, rather than being explicitly present in PostScript strings, can be extracted. Particular attention is paid to heuristic recognition of word breaks, to reconstruction of words hyphenated at line breaks, to preservation of paragraph breaks, and to recognition of TeX ligatures.
The prescript program can be downloaded from http://www.nzdl.org/html/prescript.html.
A possible substitute for prescript is the pstotext utility. More information can be found at http://www.research.digital.com/SRC/virtualpaper/pstotext.html.
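PDF files can be generated from PostScript files with the gs utility using the following command. The -dBATCH option makes gs exit once the conversion is complete instead of waiting at its interactive prompt.
gs -q -dSAFER -dNOPAUSE -dBATCH -sPAPERSIZE=a4 -sDEVICE=pdfwrite \
-sOutputFile=output.pdf input.ps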
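When several files need converting, the same invocation can be wrapped in a short script. The following Python sketch assumes that gs is on the PATH. The function names pdfwrite_command and convert_to_pdf are illustrative only, and -dBATCH is included so that gs exits when the conversion is finished rather than waiting at its prompt.

```python
import subprocess
from pathlib import Path


def pdfwrite_command(ps_file, paper="a4"):
    """Build the gs argument list that converts ps_file to a PDF beside it."""
    ps_path = Path(ps_file)
    pdf_path = ps_path.with_suffix(".pdf")  # report.ps -> report.pdf
    return [
        "gs", "-q", "-dSAFER", "-dNOPAUSE", "-dBATCH",
        f"-sPAPERSIZE={paper}",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={pdf_path}",
        str(ps_path),
    ]


def convert_to_pdf(ps_file, paper="a4"):
    """Run gs on one file; raises CalledProcessError if gs reports a failure."""
    subprocess.run(pdfwrite_command(ps_file, paper), check=True)
```

Calling convert_to_pdf("input.ps") would then write input.pdf next to the original file.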
PSUtils, written by Angus Duggan, is a collection of useful utilities for manipulating PostScript documents. Programs included are psnup, for placing several logical pages on a single sheet of paper, psselect, for selecting pages from a document, pstops, for general imposition, psbook, for signature generation for booklet printing, and psresize, for adjusting page sizes.
psbook
The psbook program rearranges pages from a PostScript document into “signatures” for printing books or booklets, creating a new PostScript file.
Usage is:
psbook [ -q ] [ -ssignature ] [ infile [ outfile ] ]
Where -q suppresses the report of page numbers that psbook normally prints as it rearranges the pages, and -ssignature selects the size of signature which will be used. The signature size is the number of sides which will be folded and bound together; the number given should be a multiple of four. The default is to use one signature for the whole file. Extra blank sides will be added if the file does not contain a multiple of four pages.
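The resulting page order is easiest to see by computing it for a small document. The Python function below is an illustration only and is not taken from psbook itself; the name booklet_order is hypothetical. It assumes that within each signature the sides are emitted in the pattern last, first, second, second-to-last, and so on, which gives the order 4, 1, 2, 3 for a four-page document.

```python
def booklet_order(num_pages, signature=None):
    """Return 1-based input page numbers in output order; None is a blank side."""
    if num_pages < 1:
        raise ValueError("document must contain at least one page")
    padded = -(-num_pages // 4) * 4            # round up to a multiple of four
    if signature is None:
        signature = padded                     # one signature for the whole file
    if signature < 4 or signature % 4:
        raise ValueError("signature size must be a multiple of four")
    total = -(-num_pages // signature) * signature
    order = []
    for pos in range(total):
        base = pos - pos % signature           # first side of this signature
        half = (pos % signature) // 2
        if pos % 4 in (0, 3):
            page = base + signature - 1 - half  # taken from the back
        else:
            page = base + half                  # taken from the front
        order.append(page + 1 if page < num_pages else None)
    return order
```

For an eight-page document, booklet_order(8) gives one signature of eight sides, while booklet_order(8, 4) gives two signatures of four sides each.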
psnup
The psnup program puts multiple logical pages onto each physical sheet of paper. The potential use of this utility is varied but one particular use is in conjunction with psbook. For example, using groff to create a PostScript document and lpr as the UNIX print spooler, a typical command line might look like this:
groff -Tps -ms file | psbook | psnup -2 | lpr
Where file is a four-page document, this command produces a two-page document with two pages of file on each output page. The page order is rearranged so that pages 4 and 1 of the input appear on the first output page and pages 2 and 3 on the second.
Usage is:
psnup [ -wwidth ] [ -hheight ] [ -ppaper ] [ -Wwidth ] [ -Hheight ]
[ -Ppaper ] [ -l ] [ -r ] [ -f ] [ -c ] [ -mmargin ]
[ -bborder ] [ -dlwidth ] [ -sscale ] [ -nup ] [ -q ]
[ infile [ outfile ] ]
The -w option gives the paper width, and the -h option gives the paper height, normally specified in ‘cm’ or ‘in’ to convert PostScript’s points (1/72 of an inch) to centimetres or inches. The -p option can be used as an alternative, to set the paper size to A3, A4, A5, B5, letter, legal, tabloid, statement, executive, folio, quarto or 10x14. The default paper size is A4.
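The relationship between these units can be sketched in a few lines of Python. The helper name to_points is hypothetical and the function only illustrates the arithmetic (72 points per inch, 2.54 cm per inch); it is not how the PSUtils programs themselves parse their arguments.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def to_points(dimension):
    """Convert a string such as '21cm', '8.5in' or '595' to PostScript points."""
    text = dimension.strip()
    if text.endswith("cm"):
        return float(text[:-2]) * POINTS_PER_INCH / CM_PER_INCH
    if text.endswith("in"):
        return float(text[:-2]) * POINTS_PER_INCH
    return float(text)  # no unit given: the value is already in points
```

An A4 sheet of 21cm by 29.7cm therefore comes out at roughly 595 by 842 points.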
The -W, -H, and -P options set the input paper size, if it is different from the output size. This makes it easy to impose pages of one size on a different size of paper.
The -l option should be used for pages which are in landscape orientation (rotated 90 degrees anticlockwise). The -r option should be used for pages which are in seascape orientation (rotated 90 degrees clockwise), and the -f option should be used for pages which have the width and height interchanged, but are not rotated.
psnup normally uses “row-major” layout, where adjacent pages are placed in rows across the paper. The -c option changes the order to “column-major”, where successive pages are placed in columns down the paper.
A margin to leave around the whole page can be specified with the -m option. This is useful for sheets of “thumbnail” pages, because the normal page margins are reduced by putting multiple pages on a single sheet.
The -b option is used to specify an additional margin around each page on a sheet.
The -d option draws a line around the border of each page, of the specified width. If the lwidth parameter is omitted, a default linewidth of 1 point is assumed. The linewidth is relative to the original page dimensions, i.e. it is scaled down with the rest of the page.
The scale chosen by psnup can be overridden with the -s option. This is useful to merge pages which are already reduced.
The -nup option selects the number of logical pages to put on each sheet of paper. This can be any whole number; psnup tries to optimise the layout so that the minimum amount of space is wasted. If psnup cannot find a layout within its tolerance limit, it will abort with an error message. The alternative form -n nup can also be used, for compatibility with other n-up programs. psnup normally prints the page numbers of the pages re-arranged; the -q option suppresses this feature.
psselect
The psselect program selects pages from a PostScript document, creating a new PostScript file.
Usage is:
psselect [ -q ] [ -e ] [ -o ] [ -r ] [ -ppages ] [ pages ]
[ infile [ outfile ] ]
Where the -e option selects all of the even pages; it may be used in conjunction with the other page selection options to select the even pages from a range of pages. Similarly, the -o option selects all of the odd pages; it may also be used in conjunction with the other page selection options.
The -ppages option specifies the pages which are to be selected. pages is a comma-separated list of page ranges, each of which may be a page number, or a page range of the form first-last. If first is omitted, the first page is assumed, and if last is omitted, the last page is assumed. The prefix character “_” indicates that the page number is relative to the end of the document, counting backwards. If just this character with no page number is used, a blank page will be inserted in the output.
The -r option causes psselect to output the selected pages in reverse order.
psselect normally prints the page numbers of the pages rearranged; the -q option suppresses this. If any of the -r, -e, or -o options are specified, the page range must be given with the -p option.
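The page-list syntax described above can be illustrated with a short Python sketch. This is not psselect's own code. The names resolve and select_pages are hypothetical, “_1” is assumed to mean the last page, ranges are assumed to run from a lower to a higher page number, and inserted blank pages are assumed to be unaffected by the even and odd filters.

```python
def resolve(token, total):
    """Turn one page token into a page number; '_N' counts back from the end."""
    if token.startswith("_"):
        return total + 1 - int(token[1:])   # assumption: '_1' is the last page
    return int(token)


def select_pages(pages, total, even=False, odd=False, reverse=False):
    """Expand a page list; None stands for an inserted blank page."""
    selected = []
    for item in pages.split(","):
        if item == "_":                     # bare prefix: insert a blank page
            selected.append(None)
            continue
        first, dash, last = item.partition("-")
        if dash:                            # a range, either end may be omitted
            start = resolve(first, total) if first else 1
            end = resolve(last, total) if last else total
        else:                               # a single page number
            start = end = resolve(first, total)
        for page in range(start, end + 1):
            if (even and page % 2) or (odd and not page % 2):
                continue                    # filtered out by the even/odd flags
            selected.append(page)
    return selected[::-1] if reverse else selected
```

For a ten-page document, the list “1-3,_,_2-” would select pages 1 to 3, a blank page, and then the last two pages.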
pstops
The pstops program performs general page rearrangement and selection, creating a new PostScript file. pstops can be used to perform a large number of arbitrary re-arrangements of documents, including arranging for printing 2-up, 4-up, booklets, reversing, selecting front or back sides of documents, scaling, etc.
Usage is:
pstops [ -q ] [ -b ] [ -wwidth ] [ -hheight ] [ -ppaper ] [ -dlwidth ]
pagespecs [ infile [ outfile ] ]
where pagespecs follow the syntax:
pagespecs = [modulo:]specs
specs = spec[+specs][,specs]
spec = [-]pageno[L][R][U][@scale][(xoff,yoff)]
modulo is the number of pages in each block. The value of modulo should be greater than 0; the default value is 1. specs are the page specifications for the pages in each block. The value of the pageno in each spec should be between 0 (for the first page in the block) and modulo-1 (for the last page in each block) inclusive. The optional dimensions xoff and yoff shift the page by the specified amount. xoff and yoff are in PostScript’s points, but may be followed by the units ‘cm’ or ‘in’ to convert to centimetres or inches, or the flags ‘w’ or ‘h’ to specify as a multiple of the width or height. The optional flags L, R, and U rotate the page left, right, or upside-down. The optional scale parameter scales the page by the fraction specified. If the optional minus sign is specified, the page is relative to the end of the document, instead of the start. If page specs are separated by ‘+’ the pages will be merged into one page; if they are separated by ‘,’ they will be on separate pages. If there is only one page specification, with pageno zero, it may be omitted.
The shift, rotation, and scaling are performed in that order regardless of which order they appear on the command line.
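To make the structure of a pagespec concrete, the following Python sketch splits one into its parts. It is an illustration rather than the parser pstops uses. It assumes the grammar [modulo:]spec, with specs joined by ‘+’ or ‘,’ and each spec of the form [-]pageno[L][R][U][@scale][(xoff,yoff)]. The function names and the dictionary keys are hypothetical.

```python
import re

SPEC_RE = re.compile(
    r"(?P<minus>-)?(?P<pageno>\d+)?(?P<rotate>[LRU]*)"
    r"(?:@(?P<scale>\d*\.?\d+))?"
    r"(?:\((?P<xoff>[^,()]+),(?P<yoff>[^,()]+)\))?"
)


def split_top_level(text):
    """Split on '+' and ',' outside parentheses; yield (separator, piece)."""
    pieces, current, depth, sep = [], "", 0, ","
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch in "+," and depth == 0:
            pieces.append((sep, current))
            current, sep = "", ch
        else:
            current += ch
    pieces.append((sep, current))
    return pieces


def parse_pagespecs(pagespecs):
    """Return (modulo, sheets); each sheet lists the specs merged onto it."""
    head, colon, tail = pagespecs.partition(":")
    modulo, specs = (int(head), tail) if colon else (1, pagespecs)
    if modulo < 1:
        raise ValueError("modulo should be greater than 0")
    sheets = []
    for sep, text in split_top_level(specs):
        match = SPEC_RE.fullmatch(text)
        if match is None:
            raise ValueError(f"cannot parse page spec {text!r}")
        spec = {
            "pageno": int(match["pageno"] or 0),   # omitted pageno means 0
            "from_end": match["minus"] is not None,
            "rotate": match["rotate"],
            "scale": float(match["scale"]) if match["scale"] else 1.0,
            "offset": (match["xoff"] or "0", match["yoff"] or "0"),
        }
        if not 0 <= spec["pageno"] < modulo:
            raise ValueError("pageno must lie between 0 and modulo-1")
        if sep == "+" and sheets:
            sheets[-1].append(spec)                # merged onto the same page
        else:
            sheets.append([spec])                  # starts a new output page
    return modulo, sheets
```

A pagespec such as 2:0L@.7(21cm,0)+1L@.7(21cm,14.85cm), which places two rotated and reduced pages on one A4 sheet, parses into a single output page holding two specs.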
The -w option gives the width which is used by the ‘w’ dimension specifier, and the -h option gives the height which is used by the ‘h’ dimension specifier. These dimensions are also used (after scaling) to set the clipping path for each page. The -p option can be used as an alternative, to set the paper size to A3, A4, A5, B5, letter, legal, tabloid, statement, executive, folio, quarto or 10x14. The default paper size is A4.
The -b option prevents any bind operators in the PostScript prolog from binding. This may be needed in cases where complex multi-page re-arrangements are being done.
The -d option draws a line around the border of each page, of the specified width. If the lwidth parameter is omitted, a default linewidth of 1 point is assumed. The linewidth is relative to the original page dimensions, i.e. it is scaled up or down with the rest of the page.
pstops normally prints the page numbers of the pages re-arranged; the -q option suppresses this feature.
psresize
The psresize program rescales and centres a document on a different size of paper. Usage is:
psresize [ -wwidth ] [ -hheight ] [ -ppaper ] [ -Wwidth ] [ -Hheight ]
[ -Ppaper ] [ -q ] [ infile [ outfile ] ]
The -w option gives the output paper width, and the -h option gives the output paper height, normally specified in ‘cm’ or ‘in’ to convert PostScript’s points (1/72 of an inch) to centimetres or inches. The -p option can be used as an alternative, to set the output paper size to A3, A4, A5, B5, letter, legal, tabloid, statement, executive, folio, quarto or 10x14. The default output paper size is A4. The -W option gives the input paper width, and the -H option gives the input paper height. The -P option can be used as an alternative, to set the input paper size. psresize normally prints the page numbers of the pages output; the -q option suppresses this feature.
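The idea of rescaling and centring can be sketched as follows. This Python fragment is an assumption about the general approach, not a description of psresize internals: it picks the largest uniform scale at which the input page fits on the output paper and splits the leftover space equally, and it ignores any rotation the real program may apply. The names PAPER_SIZES and fit_and_centre are hypothetical.

```python
PAPER_SIZES = {              # (width, height) in PostScript points
    "a4": (595, 842),
    "letter": (612, 792),
}


def fit_and_centre(in_size, out_size):
    """Return (scale, xoff, yoff) fitting in_size inside out_size, centred."""
    in_w, in_h = in_size
    out_w, out_h = out_size
    scale = min(out_w / in_w, out_h / in_h)   # largest scale that still fits
    xoff = (out_w - in_w * scale) / 2         # equal margins left and right
    yoff = (out_h - in_h * scale) / 2         # equal margins top and bottom
    return scale, xoff, yoff
```

Fitting an A4 page onto letter paper in this way is limited by the height, so the page is reduced slightly and gains a small margin at each side.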
A common task is to take an image, for instance a GIF or JPEG, and generate a PS or EPS output figure for publication. Depending on which package is used for this task, there is a surprising difference in the size of the final PostScript file. Of the packages available, ImageMagick seems to produce the smallest PostScript output files, due to its use of vectorised PostScript rather than the bitmaps which other packages (such as xv) use. In extreme cases this can mean the difference between a 2 MB and a 50 kB final PostScript file. All the PostScript images in this cookbook were generated from the original GIF files using ImageMagick.