Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ASP.Net Converting and Merging documents into single PDF

I need to have the ability to convert and merge various documents into a single Pdf.

The documents could be of varying types, such as Word, Open Office, Images, Text, Web pages (by URL) and the PDF would usually consist of 2-3 documents.

At the moment, we are using BCL Technologies easyPDF with Microsoft Office installed onto the Server. This handles most documents but we haven't had it doing Open Office ones yet.

We currently produce around 100-1000 of these PDF's per day.

The reason I am asking the question is that performance is a key issue. The PDF is generated for users on the fly and so the waiting times we are currently getting of 30-60 seconds is becoming unacceptable.

We have done some caching around documents when they are intially uploaded so the main tasks that happens when a User requests a Pdf is merging a number of already generated Pdf's.

Does anyone else have any other tools they have used that work reliably for most common document types and above all, quickly? When put like that, it seems like I'm asking a lot!

Edit: Thanks for all the great advice, I'll look into some of these and compare performance.

Just to add to all this, money is not really an object. We're more than happy to pay for different applications to perform each task as well as looking into various hardware options to distribute the load as much as possible.

like image 704
Robin Day Avatar asked Dec 13 '25 19:12

Robin Day


2 Answers

Merging multiple PDF documents is normally simple enough (as long as they don't need to be merged on the same page) - you could compare your merge performance with something like iTextSharp (.NET version of iText) to be sure it isn't a bottleneck - otherwise the conversion from other formats to PDF is likely the bottleneck.

In almost all cases, the method used to convert X to PDF is to execute the applications print command, targeted at a software PDF printer, to create a temporary PDF file.

This means:

  • The target application (for example Office) is opened and closed
  • The document has to travel through the printing service

In your situation, are you converting arbitrary documents submitted by the users, or do the documents come from a stored library of files? If it's a library, you could make a PDF copy of each file as it is added to the library (instead of when the user makes a request), and then only merge the PDF files.

like image 157
David Avatar answered Dec 16 '25 10:12

David


We use ABC Pdf. I don't know if it will be fast enough for your needs, but it seems to work for our use.

like image 30
kemiller2002 Avatar answered Dec 16 '25 11:12

kemiller2002



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!