I have a bunch of PDF documents and all of them contain a title page that I want to remove.
Is there a way to programmatically remove them?
Most of the PDF utilities I found can only combine documents, not remove pages. In the print dialog I can choose pages 2 to the end and then print to a file, but I can't find any way to access this function programmatically.
Open the PDF in Acrobat. Choose the Organize Pages tool from the right pane. The Organize Pages toolset is displayed in the secondary toolbar, and the page thumbnails are displayed in the Document area. Select a page thumbnail you want to delete and click the Delete icon to delete the page.
To delete a page from a PDF in Preview on a Mac: choose View > Thumbnails or View > Contact Sheet, select the page or pages to delete, then press the Delete key on your keyboard (or choose Edit > Delete). When you delete a page from a PDF, all the annotations on the page are removed as well.
Use pdftk.
To remove page 8:
pdftk in.pdf cat 1-7 9-end output out.pdf
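For the original question (dropping the title page from a whole batch of files), the same pdftk syntax can be wrapped in a loop. The following is only a sketch: the function name strip_title_pages and the directory arguments are made up for illustration, and it assumes every input has at least two pages (pdftk will most likely report an error for a one-page file, since the range 2-end does not exist there).

```shell
# Hypothetical helper: write a copy of every PDF in SRC, minus its first page, to DEST.
# Usage: strip_title_pages SRC DEST
strip_title_pages() {
  src="$1"
  dest="$2"
  mkdir -p "$dest" || return 1
  for f in "$src"/*.pdf; do
    # If the glob matched nothing, "$f" is the literal pattern; skip it.
    [ -e "$f" ] || continue
    # "2-end" keeps everything from page 2 onward, i.e. drops the title page.
    pdftk "$f" cat 2-end output "$dest/$(basename "$f")" \
      || echo "failed: $f" >&2
  done
}
```

Calling strip_title_pages ./originals ./stripped leaves the source files untouched and writes the shortened copies under the same file names in the second directory.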
Just for the record: you can also use Ghostscript:
gs \
  -o removed-page-1-from-input.pdf \
  -sDEVICE=pdfwrite \
  -dFirstPage=2 \
  /path/to/input.pdf
However, pdftk is the better tool for that job (and was already recommended to you).
Also, this Ghostscript command line could change some of the properties of your input.pdf because it essentially re-distills it. This may or may not be what you want. Controlling individual aspects of this behavior (or suppressing some of them) requires a more complicated command line with more parameters.
pdftk will re-use the original PDF objects for each page as-is.
Ghostscript has the additional parameter of -dLastPage too. Together with -dFirstPage this allows for the extraction of page ranges.
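As a rough illustration of combining the two parameters, a contiguous range can be extracted with a small wrapper. The function name gs_extract_range and its argument order are invented for this sketch; only the gs options themselves come from the commands in this answer.

```shell
# Hypothetical helper: extract pages FIRST..LAST of INPUT into OUTPUT with Ghostscript.
# Usage: gs_extract_range FIRST LAST INPUT OUTPUT
gs_extract_range() {
  first="$1"
  last="$2"
  input="$3"
  output="$4"
  # -o sets the output file; pdfwrite re-writes the selected pages as a new PDF.
  gs \
    -o "$output" \
    -sDEVICE=pdfwrite \
    -dFirstPage="$first" \
    -dLastPage="$last" \
    "$input"
}
```

For example, gs_extract_range 3 7 in.pdf pages-3-to-7.pdf would ask Ghostscript for pages 3 through 7 only. The same re-distilling caveat mentioned above applies to the result.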
The newest versions sport a new parameter, -sPageList. This could be used like this:
-sPageList="1, 5-10, 12-"
to extract pages 1, 5-10 and 12-last from the input document. However, I've not (yet) personally tested this new feature and I'm not sure how reliably it works.
For older versions of Ghostscript (as well as the most recent one), it should work to feed the same input PDF several times, each time with different parameters, to a single gs call in order to extract non-contiguous page selections from a document. You could even combine pages from different documents this way:
gs \
  -o selected-pages.pdf \
  -sDEVICE=pdfwrite     \
  -dFirstPage=2         \
  -dLastPage=2          \
   in1.pdf              \
                        \
  -dFirstPage=10        \
  -dLastPage=15         \
   in1.pdf              \
                        \
  -dFirstPage=1         \
  -dLastPage=1          \
   in1.pdf              \
                        \
  -dFirstPage=4         \
  -dLastPage=6          \
   in2.pdf
Caveats: Combining pages from different documents which use non-embedded fonts or identical font names but different encodings and/or different subsets (with identical fontname-prefixes) may lead to a faulty PDF in the result.