
PyPDF2 paper size manipulation

Tags: python, pypdf

I am using PyPDF2 (now pypdf) to take an input PDF of any paper size and convert it to an A4 PDF, with the input page scaled to fit and centred on the output page.

Here's an example of an input, which can be of any dimensions (created with ImageMagick: convert image.png input.pdf): [image: input]

And the expected output is: [image: output]

I'm not a developer and my knowledge of Python is basic. I have been trying to figure this out from the documentation but haven't had much success.

My latest attempt is as follows:

from pypdf import PdfReader, PdfWriter, Transformation, PageObject
from pypdf import PaperSize

pdf_reader = PdfReader("input.pdf")
page = pdf_reader.pages[0]
writer = PdfWriter()

A4_w = PaperSize.A4.width
A4_h = PaperSize.A4.height


# resize page to fit *inside* A4
h = float(page.mediabox.height)
w = float(page.mediabox.width)
print(A4_h, h, A4_w, w)
scale_factor = min(A4_h / h, A4_w / w)
print(scale_factor)

transform = Transformation().scale(scale_factor, scale_factor).translate(0, A4_h / 3)
print(transform.ctm)

# page.scale_by(scale_factor)
page.add_transformation(transform)

# merge the pages to fit inside A4

# prepare A4 blank page
page_A4 = PageObject.create_blank_page(width=A4_w, height=A4_h)
page_A4.merge_page(page)
print(page_A4.mediabox)

writer.add_page(page_A4)
writer.write("output.pdf")

This gives the following output: [image]

I could be completely off track with my approach, and it may be an inefficient way of doing it.

I was hoping the package would have a simple function where I can define the output paper size and the scaling factor, similar to this.

Zain Khaishagi asked Oct 29 '25 01:10


1 Answer

You almost got it!

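Transformations are applied only to the content, not to the boxes (mediabox/trimbox/cropbox/artbox/bleedbox), so the page keeps its original boxes and the moved content can be cut off at the old cropbox when the page is merged.

You need to adjust the cropbox: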

from pypdf.generic import RectangleObject
page.cropbox = RectangleObject((0, 0, A4_w, A4_h))

Full script

from pypdf import PdfReader, PdfWriter, Transformation, PageObject, PaperSize
from pypdf.generic import RectangleObject

reader = PdfReader("input.pdf")
page = reader.pages[0]
writer = PdfWriter()

A4_w = PaperSize.A4.width
A4_h = PaperSize.A4.height

# scale factor that makes the page fit *inside* A4
h = float(page.mediabox.height)
w = float(page.mediabox.width)
scale_factor = min(A4_h / h, A4_w / w)

# scale the content, then shift it up; the A4_h / 3 offset suits this
# particular input and is not a general centring formula
transform = Transformation().scale(scale_factor, scale_factor).translate(0, A4_h / 3)
page.add_transformation(transform)

# the boxes are not transformed, so enlarge the cropbox to A4
page.cropbox = RectangleObject((0, 0, A4_w, A4_h))

# merge the transformed page onto a blank A4 page
page_A4 = PageObject.create_blank_page(width=A4_w, height=A4_h)
page.mediabox = page_A4.mediabox
page_A4.merge_page(page)

writer.add_page(page_A4)
writer.write("output.pdf")

Martin Thoma answered Oct 30 '25 18:10
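
The question asks for the input to be centred for any page size, while the script in the answer uses a fixed vertical offset. A minimal sketch of one way to generalise it follows, assuming the page's mediabox describes the visible content and ignoring any /Rotate entry. The names fit_and_centre and page_to_paper are hypothetical helpers made up for this illustration, not part of pypdf; the pypdf calls are the same ones used in the answer.

```python
def fit_and_centre(src_w, src_h, dst_w, dst_h, src_left=0.0, src_bottom=0.0):
    """Return (scale, tx, ty) that fits a source box inside a target box, centred.

    Hypothetical helper: the transform is "scale first, then translate",
    with tx and ty expressed in target-page units.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    # After scaling, the source's lower-left corner sits at
    # (src_left * scale, src_bottom * scale). Shift it so the scaled box
    # has equal margins on opposite sides of the target box.
    tx = (dst_w - src_w * scale) / 2 - src_left * scale
    ty = (dst_h - src_h * scale) / 2 - src_bottom * scale
    return scale, tx, ty


def page_to_paper(in_path, out_path, paper_w, paper_h):
    """Scale every page of in_path to fit paper_w x paper_h, centred (sketch)."""
    # pypdf is imported here so fit_and_centre stays usable without it
    from pypdf import PageObject, PdfReader, PdfWriter, Transformation
    from pypdf.generic import RectangleObject

    reader = PdfReader(in_path)
    writer = PdfWriter()
    for page in reader.pages:
        box = page.mediabox
        scale, tx, ty = fit_and_centre(
            float(box.width), float(box.height), paper_w, paper_h,
            float(box.left), float(box.bottom),
        )
        page.add_transformation(
            Transformation().scale(scale, scale).translate(tx, ty)
        )
        # Boxes are not transformed, so set them to the target size,
        # as the answer above does for the cropbox and mediabox
        page.cropbox = RectangleObject((0, 0, paper_w, paper_h))
        page.mediabox = RectangleObject((0, 0, paper_w, paper_h))
        target = PageObject.create_blank_page(width=paper_w, height=paper_h)
        target.merge_page(page)
        writer.add_page(target)
    writer.write(out_path)
```

Under these assumptions, calling page_to_paper("input.pdf", "output.pdf", PaperSize.A4.width, PaperSize.A4.height) would play the role of the full script above. Keeping the arithmetic in a separate function makes it easy to check by hand: a 400 x 300 input on a 595 x 842 page gets a scale of 1.4875, no horizontal shift, and a vertical shift of 197.875.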


