Delete text from PDF using PyMuPDF

Question

I need to remove the text "DRAFT" from a PDF document using Python. I can find the text box containing the text, but I can't find an example of how to edit the PDF text element using PyMuPDF.

In the example below, the draft object holds the coordinates of each DRAFT text element found on the page.

import fitz

fname = r"original.pdf"
doc = fitz.open(fname)
page = doc.load_page(0)

draft = page.search_for("DRAFT")

# insert code here to delete the DRAFT text or replace it with an empty string

out_fname = r"final.pdf"
doc.save(out_fname)

Added 4/28/2022: I found a way to delete the text, but unfortunately it also deletes any overlapping text underneath the box around DRAFT. I really just want to delete the DRAFT letters without modifying the underlying layers.

# redact the first DRAFT hit; this also removes other text overlapping the hit area
rl = page.search_for("DRAFT", quads=True)
page.add_redact_annot(rl[0])

page.apply_redactions()

xiaoxu · Accepted Answer

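You can try replacing the string directly in each page's content stream. This only works if the text appears in the stream as those literal bytes.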

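import fitz

doc = fitz.open("xxxx")

for page in doc:
    # each page can have one or more content streams, identified by xref
    for xref in page.get_contents():
        # read the decompressed stream and cut out the unwanted bytes
        stream = doc.xref_stream(xref).replace(b'The string to delete', b'')
        # write the modified stream back into the document
        doc.update_stream(xref, stream)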
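
A plain bytes replace can also hit matching bytes that are not displayed text. As a rough sketch of a narrower variant, the helper below only empties a literal string operand that is directly followed by the Tj (show text) operator. The names strip_literal_text and remove_text are made up for illustration, and the approach assumes the PDF stores DRAFT as a plain literal string; hex strings, TJ arrays and custom font encodings would not be matched.

```python
import re


def strip_literal_text(stream: bytes, target: bytes) -> bytes:
    """Empty every `(target) Tj` operand in a decompressed content stream.

    Assumes the text is stored as a plain literal string; hex strings,
    TJ arrays and custom font encodings are not matched.
    """
    pattern = re.compile(rb"\(" + re.escape(target) + rb"\)(\s*Tj)")
    # Keep the Tj operator, but show an empty string instead of the target.
    return pattern.sub(rb"()\1", stream)


def remove_text(src: str, dst: str, target: bytes) -> None:
    """Apply strip_literal_text to every content stream and save a copy."""
    import fitz  # PyMuPDF; imported here so the helper above has no dependency

    doc = fitz.open(src)
    for page in doc:
        for xref in page.get_contents():
            old = doc.xref_stream(xref)
            new = strip_literal_text(old, target)
            if new != old:  # only touch streams that actually changed
                doc.update_stream(xref, new)
    doc.save(dst)
    doc.close()
```

With the file names from the question, the call would be remove_text("original.pdf", "final.pdf", b"DRAFT"). If nothing changes, the text is probably encoded in a form this pattern does not cover.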
