
Merging PDFs with Python pypdf and deleting the merged files

I'm trying to write a Python program that takes a PDF file and appends other PDFs to it in a fixed order: first any PDF whose name includes a fruit (Mango, Orange or Apple), then the PDFs whose names include an animal (Zebra, Monkey, Dog), and finally any remaining PDFs. This is the code I have:

import os
from PyPDF2 import PdfFileReader, PdfFileMerger

originalFile="C:/originalFile.pdf"

merger = PdfFileMerger()
merger.append(PdfFileReader(file(originalFile, 'rb')))
os.remove(originalFile)

for filename in os.listdir('C:/'):
    if "Mango" in filename or "Apple" in filename or "Orange" in filename:
        if ".pdf" in filename:
            merger.append(PdfFileReader(file('C:/'+filename, 'rb')))
            os.remove("C:/"+filename)

for filename in os.listdir('C:/'):
    if "Zebra" in filename or "Monkey" in filename or "Dog" in filename:
        if ".pdf" in filename:
            merger.append(PdfFileReader(file('C:/'+filename, 'rb')))
            os.remove("C:/"+filename)

for filename in os.listdir('C:/'):
    if ".pdf" in filename:
        merger.append(PdfFileReader(file('C:/TRIAL/'+filename, 'rb')))
        os.remove("C:/TRIAL/"+filename)

merger.write(originalFile)

When I run this program I get the following error:

os.remove(originalFile) WindowsError: [Error 32] The process cannot access the file because it is being used by another process: 'C:/originalFile.pdf'

Could anyone explain how to close the file after I've added it to my merger?

user2617248 asked Dec 04 '25 10:12


2 Answers

You should close the file explicitly.

fd = open('C:/' + filename, 'rb')
merger.append(PdfFileReader(fd))
fd.close()  # release the handle so Windows allows the delete
os.remove('C:/' + filename)

A safer version:

fd = None
try:
    fd = open('C:/' + filename, 'rb')
    merger.append(PdfFileReader(fd))
finally:
    # runs even if append() raises, so the handle is never leaked
    if fd:
        fd.close()
if os.path.exists('C:/' + filename):
    os.remove('C:/' + filename)

In Python 2.6+ (or Python 2.5 with from __future__ import with_statement) this can be simplified to:

with open('C:/' + filename, 'rb') as fd:
    merger.append(PdfFileReader(fd))
if os.path.exists('C:/' + filename):
    os.remove('C:/' + filename)

The with statement makes Python close the file automatically when the block exits.
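
Since the question repeats the same open, append, delete steps in three loops, the pattern could be wrapped in a small helper. This is only a sketch under a few assumptions: append_and_remove is a hypothetical name, open() is used in place of the Python 2-only file(), and the merger is assumed to finish reading the stream before append() returns, which is what the fix above relies on. The merger argument can be any object with an append(stream) method.

```python
import os


def append_and_remove(merger, path):
    """Append the PDF at `path` to `merger`, then delete the file.

    Hypothetical helper: `merger` is any object with an append(stream)
    method, assumed to read the whole stream before returning.
    """
    # The with block closes the handle even if append() raises.
    with open(path, 'rb') as fd:
        merger.append(fd)
    # Reached only if append() succeeded. The handle is closed by now,
    # so the operating system no longer treats the file as in use.
    os.remove(path)
```

Each loop body in the question would then shrink to a single call such as append_and_remove(merger, 'C:/' + filename). If append() fails, the file is left on disk, which seems the safer default for a script that deletes its inputs.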

SpliFF answered Dec 06 '25 23:12


To make sure the file gets closed, open it with a with statement, which closes the file no matter what happens inside the with block:

with open(originalFile,'rb') as pdf:
    merger.append(PdfFileReader(pdf))
os.remove(originalFile)

This works for me.

Just a reminder: you can close the file at this point because the PDF has already been added to the merger. If you had only created PdfFileReader(pdf) and not done anything with it yet, you could not close or delete the file, because the PdfFileReader object would no longer be able to read it. PdfFileReader only actually reads the file when you call a read method on it, such as getPage.
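
The point about reading before closing can be illustrated without PyPDF2. In the sketch below, LazyReader is a toy stand-in written for this example (it is not a PyPDF2 class) that, like the reader described above, keeps only a reference to the stream and touches the file when asked. The temporary file and its contents are made up for the demonstration.

```python
import os
import tempfile


class LazyReader(object):
    """Toy stand-in for a reader that defers I/O (not part of PyPDF2)."""

    def __init__(self, stream):
        self.stream = stream  # nothing is read here

    def read_all(self):
        self.stream.seek(0)
        return self.stream.read()


# Create a throwaway file to experiment with.
handle, path = tempfile.mkstemp(suffix='.pdf')
os.write(handle, b'%PDF-1.4 dummy')
os.close(handle)

with open(path, 'rb') as fd:
    reader = LazyReader(fd)
    data = reader.read_all()  # works: the file is still open

try:
    reader.read_all()         # the with block has already closed fd
    read_after_close = True
except ValueError:            # raised for I/O on a closed file
    read_after_close = False

os.remove(path)               # no open handle remains, so this succeeds
```

Here data holds the file contents, while the second read_all() call fails and leaves read_after_close set to False. The same ordering applies to the merger: do all the reading (the append) inside the with block, and delete the file only after the block has ended.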

justhalf answered Dec 07 '25 01:12


