PDFLib in PHP hogging resources and not flushing to file

Tags: php, pdf, pdflib

I just inherited a PHP project that generates large PDF files and usually chokes after a few thousand pages and several gigs of server memory. The project was using PDFLib to generate these files 'in memory'.

I was tasked with fixing this, so the first thing I did was to send PDFLib output to a file instead of building it in memory. The problem is, it still seems to be building the PDFs in memory, and much of that memory never seems to be returned to the OS. Eventually, the whole thing chokes and dies.

When I task the program with building only snippets of the large PDFs, the data does not seem to be fully flushed to the file on end_document(). I get no errors, yet the PDF is not readable, and opening it in a hex editor makes it obvious that the stream is incomplete.

I'm hoping that someone has experienced similar difficulties.

Jonathan Hawkes asked Sep 04 '25 01:09



2 Answers

Solved! I needed to call PDF_delete_textflow() on each textflow. Textflows are given document scope and are not released until the document is closed, which never happened because all available memory was exhausted before that point.
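For anyone hitting the same thing, here is a minimal sketch of that fix using PDFlib's object-oriented PHP binding. It assumes `$p` is a PDFlib object with a document already open; the function name `writeTextPages`, the page size, the fit box and the font options are illustrative choices, not taken from the original project. It also assumes each text chunk fits in a single box, so it does not handle the overflow case where fit_textflow() has to be called again on a new page.

```php
<?php
// Sketch only: $p is assumed to be a PDFlib object on which begin_document()
// has already been called. Names and option values here are hypothetical.
function writeTextPages($p, array $chunks): int
{
    $deleted = 0;
    foreach ($chunks as $text) {
        $p->begin_page_ext(595, 842, "");   // A4 in points (assumed page size)

        // Placeholder font options; adjust to whatever the project uses.
        $tf = $p->create_textflow($text, "fontname=Helvetica fontsize=10 encoding=unicode");
        if ($tf == 0) {
            throw new RuntimeException("create_textflow failed");
        }

        // Assumes the whole chunk fits in this one box.
        $p->fit_textflow($tf, 50, 50, 545, 792, "");

        // Textflows have document scope, so release each one explicitly
        // instead of waiting for end_document().
        $p->delete_textflow($tf);
        $deleted++;

        $p->end_page_ext("");
    }
    return $deleted;
}
```

The point is only the placement of delete_textflow(): once per textflow, inside the loop, as soon as the text has been placed.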

Jonathan Hawkes answered Sep 07 '25 03:09



You have to make sure that you are closing each page as well as closing the document. This is done by calling end_page_ext() at the end of every page you write.

Additionally, if you are importing pages from another PDF, you have to call close_pdi_page() after each imported page and close_pdi_document() when you're done with each imported document.
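As a rough sketch of how those calls pair up, assuming `$p` is a PDFlib object with an output document already open and that PDI is available in your PDFlib license: the function name `importAllPages` is hypothetical, and the dummy 10x10 page size relies on the `adjustpage` option resizing the page to match the imported one.

```php
<?php
// Sketch only: $p is assumed to be a PDFlib object with an open output
// document. importAllPages() and $sourcePath are illustrative names.
function importAllPages($p, string $sourcePath): int
{
    $doc = $p->open_pdi_document($sourcePath, "");
    if ($doc == 0) {
        throw new RuntimeException("Could not open $sourcePath");
    }

    $pageCount = (int) $p->pcos_get_number($doc, "length:pages");

    for ($n = 1; $n <= $pageCount; $n++) {
        $page = $p->open_pdi_page($doc, $n, "");
        if ($page == 0) {
            continue;   // skip pages that cannot be opened
        }

        $p->begin_page_ext(10, 10, "");                 // dummy size
        $p->fit_pdi_page($page, 0, 0, "adjustpage");    // page takes the imported size
        $p->close_pdi_page($page);                      // release the imported page
        $p->end_page_ext("");                           // close the output page
    }

    $p->close_pdi_document($doc);                       // release the source document
    return $pageCount;
}
```

Every open_pdi_page() has a matching close_pdi_page() in the same loop iteration, and the source document is closed once after the loop.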

Lee Hesselden answered Sep 07 '25 01:09
