I am trying to take the text from word file and highlighted the required text and aging want to save the text into new word file.
I am able to highlight the text using ANSI escape sequences but I am unable add it back to the word file.
from docx import Document
doc = Document('t.docx')
##string present in t.docx '''gnjdkgdf helloworld dnvjk dsfgdzfh jsdfKSf klasdfdf sdfvgzjcv'''
if 'helloworld' in doc.paragraphs[0].text:    
    high=doc.paragraphs[0].text.replace('helloworld', '\033[43m{}\033[m'.format('helloworld'))
doc.add_paragraph(high)
doc.save('t1.docx')
getting this error.
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters
Instead of using ANSI escape sequences, you could use python-docx's built-in Font highlight color:

from docx import Document
from docx.enum.text import WD_COLOR_INDEX
doc = Document('t.docx')
##string present in t.docx '''gnjdkgdf helloworld dnvjk dsfgdzfh jsdfKSf klasdfdf sdfvgzjcv'''
# Get the first paragraph's text
p1_text = doc.paragraphs[0].text
# Create a new paragraph with "helloworld" highlighted
p2 = doc.add_paragraph()
substrings = p1_text.split('helloworld')
for substring in substrings[:-1]:
    p2.add_run(substring)
    font = p2.add_run('helloworld').font
    font.highlight_color = WD_COLOR_INDEX.YELLOW
p2.add_run(substrings[-1])
# Save document under new name
doc.save('t1.docx')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With