6 Exploding/concatenating files using "pdfcat"
6.1 Program behavior and defaults
The "pdfcat" program is a command line tool that lets you explode or concatenate PDF files on the server. Using "pdfcat" is indispensable if you have a multi-page document that is to be used in an EPSF-only or OPI workflow. The OPI server, for example, generates a layout file for the first page of a document only. If you want a layout of page 2, you have to extract that page into a new single-page PDF file.
The illustration below shows the three program modes, which are independent of one another and mutually exclusive: The "concatenate" mode merges the selected PDF files into a new one, the "append" mode appends the selected files to an existing one, and the "extract" mode writes the selected pages of an existing document into new single-page files.
Every PDF file contains a list of file infos, such as creator, creation date, and modification date, and optionally image profiles. Furthermore, a PDF file may contain security settings (optional), a table of contents (TOC), and annotations such as text fields and buttons.
The "pdfcat" program re-arranges PDF files and creates new ones. The file infos, profiles, security settings, TOC, and annotations are handled as follows:
- If "pdfcat" creates a new multi-page document - as shown in the first row in the above illustration - the new file will have its own creator and creation date. It will not have any profiles, even if the input files were tagged, and it will not have any security settings. TOC and annotations - if there were any in one of the input files - will not be copied to the new file.
- If "pdfcat" appends pages and/or documents to an existing PDF file - as shown in the second row in the above illustration - the output file (here: helios.pdf (new)) will contain all the information that was already included in the original file (here: helios.pdf). File infos, profiles, security settings, TOC, and annotations remain unchanged. Information that was included in the appended files (here: Doc2 and Doc3) is ignored and will not make it into the output file.
- If "pdfcat" explodes a document into several new ones - as shown in the last row in the above illustration - the new single-page documents inherit the file infos, profiles, and security settings from the original file. TOC and annotations, however, will not be copied to the output files. Please note that "pdfcat" cannot write comments into the Finder Info. Usually, if a PDF file contains profile information, the profiles are listed in the Comments text field of the Macintosh Finder Info (compare Fig. 11 in chapter 5 "Before getting started"). For PDF files created by exploding a document with "pdfcat", this Comments text field may be empty even though the new files contain profile information. In that case, you can use our Acrobat print plug-in or the EtherShare OPI "Tagger" application to make the profiles visible.
Automatic layout generation is not available for PDF files that have been created with "pdfcat". You have to use our "touch" application or the command line layout program on the server to generate layouts from the new PDF files.
The "pdfcat" program recognizes the following parameters:
pdfcat -o outDoc inDocs ...
concatenates existing PDF documents (inDocs ...) into a new PDF document (outDoc). With this parameter, you have to specify one output file name and one or more input file names. You can copy only selected pages from an input file to the output file by specifying page ranges. Valid ranges are listed in the paragraph "Page ranges" below.
Important: If a file with the name you choose for the output file already exists in the destination directory, the existing file will be replaced.
pdfcat -a outDoc inDocs ...
appends one or more PDF documents, or parts of them (inDocs ...), to another, already existing PDF document (outDoc).
pdfcat -e prefix inDoc
extracts either all pages or the pages you select explicitly from a given PDF document (inDoc) and creates new single-page PDF files. With this parameter, you have to specify a prefix for the output files. Page number <nnn> of the input file is then copied to a new PDF document named "<prefix><nnn>.pdf".
In addition, "pdfcat" has a parameter that opens the online help file.
Note: File name specifications may contain a complete UNIX path name that leads to another directory.
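The naming rule for extracted pages can be sketched in a few lines of shell. The helper name "extract_names" is hypothetical and not part of "pdfcat". The sketch assumes that the page number is appended to the prefix without zero padding, as the file names "page1.pdf" to "page3.pdf" used in this chapter suggest, and that the prefix may carry a directory path like any other file name specification.

```sh
#!/bin/sh
# Hypothetical helper (not part of pdfcat): list the file names that
# "pdfcat -e PREFIX inDoc,..." is expected to create.
# Assumption: no zero padding, i.e. <prefix><nnn>.pdf with nnn = page number.
# Usage: extract_names PREFIX PAGE...
extract_names() {
    prefix=$1
    shift
    for n in "$@"; do
        echo "${prefix}${n}.pdf"
    done
}
```

For the OPI case described at the start of this chapter, "extract_names page 2" prints "page2.pdf", which is the single-page file you would expect from extracting page 2 with the prefix "page".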
Page ranges
Each inDoc specification can be followed by a comma-separated list of range specifications, so that only the specified pages of the input document are used.
Valid range specifications are shown in the examples below:
- 2 (uses page 2 of inDoc only)
- -11 (uses pages 1 to 11)
- 11- (uses pages 11 to last page)
- 2-11 (uses pages 2 to 11 - takes page 2 first)
- 11-2 (also uses pages 2 to 11 - but in reverse order)
- \$-11 (uses the pages from the last page back to page 11, in reverse order. The character "$" stands for the last page of a document; in a shell you have to enter "\$".)
You can specify several comma-separated ranges at a time (e.g. inDoc.pdf,3-6,12-). Do not use blanks within such a specification.
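To make the range rules concrete, the following shell sketch expands a range specification into the page numbers it selects. It is only an illustration of the rules listed above, not the way "pdfcat" itself is implemented. The helper names "expand_range" and "expand_spec" are hypothetical, the page count has to be passed in by the caller, and the sketch does not check whether a page number exceeds the page count.

```sh
#!/bin/sh
# Hypothetical helpers (not part of pdfcat) that mirror the range rules.

# expand_range RANGE PAGE_COUNT
# Prints the page numbers selected by one range, in the order they are used.
expand_range() {
    range=$1
    last=$2
    case $range in
        *-*) from=${range%%-*}; to=${range#*-} ;;   # "a-b", "-b", "a-"
        *)   from=$range; to=$range ;;              # single page
    esac
    [ -z "$from" ] && from=1          # "-11" starts at page 1
    [ -z "$to" ] && to=$last          # "11-" runs to the last page
    [ "$from" = '$' ] && from=$last   # "$" stands for the last page
    [ "$to" = '$' ] && to=$last
    out=
    i=$from
    if [ "$from" -le "$to" ]; then
        # ascending range, e.g. 2-11
        while [ "$i" -le "$to" ]; do out="$out $i"; i=$((i + 1)); done
    else
        # descending range, e.g. 11-2 (reverse order)
        while [ "$i" -ge "$to" ]; do out="$out $i"; i=$((i - 1)); done
    fi
    echo "${out# }"
}

# expand_spec RANGES PAGE_COUNT
# RANGES is the comma-separated list that follows the file name.
expand_spec() {
    spec=$1
    count=$2
    pages=
    old_ifs=$IFS
    IFS=,
    for r in $spec; do
        pages="$pages $(expand_range "$r" "$count")"
    done
    IFS=$old_ifs
    echo "${pages# }"
}
```

For a 20-page document, "expand_spec 2,5-7 20" prints "2 5 6 7". For a 4-page document, "expand_range '$-1' 4" prints "4 3 2 1".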
pdfcat -o new.pdf doc1.pdf doc2.pdf,2,5-7
writes all pages of document "doc1.pdf" and the pages 2, 5, 6, 7 of document "doc2.pdf" to a new document called "new.pdf".
pdfcat -o new.pdf doc1.pdf,\$-1
writes all pages of document "doc1.pdf" in reverse order to a new document called "new.pdf".
pdfcat -a tmp.pdf doc1.pdf,9-6
appends the pages 9, 8, 7, 6 (in this order) of document "doc1.pdf" to an existing document called "tmp.pdf".
pdfcat -e page doc1.pdf,-3
writes the pages 1, 2, 3 of document "doc1.pdf" to new single-page documents called "page1.pdf", "page2.pdf", and "page3.pdf".
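Because the "-o" parameter replaces an existing output file without asking, a small wrapper can check for the file first. The following is a minimal sketch under stated assumptions: "safe_concat" and the "DRY_RUN" variable are hypothetical names introduced here, and only the "pdfcat -o outDoc inDocs ..." call itself comes from this chapter.

```sh
#!/bin/sh
# Hypothetical wrapper (not part of pdfcat): refuse to run "pdfcat -o"
# when the output file already exists.
# If DRY_RUN is set to a non-empty value, the command is printed, not run.
# Usage: safe_concat outDoc inDocs ...
safe_concat() {
    out=$1
    shift
    if [ -e "$out" ]; then
        echo "refusing to overwrite existing file: $out" >&2
        return 1
    fi
    if [ -n "${DRY_RUN:-}" ]; then
        echo "pdfcat -o $out $*"
    else
        pdfcat -o "$out" "$@"
    fi
}
```

With DRY_RUN set and no file "new.pdf" in the current directory, "safe_concat new.pdf doc1.pdf doc2.pdf,2,5-7" prints "pdfcat -o new.pdf doc1.pdf doc2.pdf,2,5-7". If "new.pdf" exists, the function prints a message to standard error and returns a non-zero status. The check and the call are separate steps, so the wrapper does not protect against a file that is created in between.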