
RTF Output Shows "{ Apple0}" Instead of Bold "Apple" in LibreOffice on Ubuntu

I'm writing a Python script to generate an RTF file containing text excerpts with a search term highlighted in bold. The script successfully generates the excerpts with the keyword, but the term isn’t bolded as intended.

Search Term: "Apple"

Expected Output in LibreOffice: Apple (bolded)

Actual Output in LibreOffice: "Apple0" (plain text with "0" appended)

Raw RTF Text: { Apple0} (viewed in a text editor)

I expected the raw RTF to be {\b Apple\b0} rather than { Apple0}, where \b starts bold and \b0 ends it, per RTF syntax.

Here’s a simplified version of my Python code:

import re

TERM = "Apple"
RTF_HEADER = r"{\rtf1\ansi\ansicpg1252\deff0\nouicompat\deflang1033{\fonttbl{\f0\fswiss\fcharset0 Calibri;}}\f0\fs22\par"
RTF_FOOTER = r"}"
BOLD_START = r"{\b "
BOLD_END = r"\b0}"

excerpt = "This is an Apple test."
term_pattern = re.compile(rf"\b{TERM}\b", re.IGNORECASE)
bolded_term = BOLD_START + TERM + BOLD_END  # Intended: {\b Apple\b0}
excerpt_bolded = term_pattern.sub(bolded_term, excerpt)

with open("output.rtf", "w", encoding="utf-8") as f:
    f.write(RTF_HEADER + excerpt_bolded + RTF_FOOTER)
asked Oct 17 '25 13:10 by robbin olsson


1 Answer

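Backslashes have special meaning in regular expressions, and that includes the replacement string passed to sub(), not only the pattern. You use \b correctly as a word boundary in re.compile, but the RTF pieces you put together for the replacement also contain backslashes for the RTF control words. In a replacement string, \b is processed as an escape and becomes a backspace character (0x08), which is invisible in a text editor; that is why the file shows { Apple0}. Escape each of those backslashes with another backslash so that sub() emits a literal backslash: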

  BOLD_START = r"{\\b "
  BOLD_END = r"\\b0}"
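
As an alternative sketch, not something from the original script: passing a function as the replacement avoids escape processing entirely, because sub() inserts the function's return value as-is. The names bold and escaped_template below are made up for illustration, and re.escape(TERM) is an added precaution in case the search term ever contains regex metacharacters.

```python
import re

TERM = "Apple"
BOLD_START = r"{\b "   # single backslash: the literal RTF control word
BOLD_END = r"\b0}"

excerpt = "This is an Apple test."
term_pattern = re.compile(rf"\b{re.escape(TERM)}\b", re.IGNORECASE)

# Original approach: the replacement string is escape-processed,
# so each "\b" turns into a backspace character (\x08).
broken = term_pattern.sub(BOLD_START + TERM + BOLD_END, excerpt)

# Fix 1: double every backslash in the replacement string.
escaped_template = (BOLD_START + TERM + BOLD_END).replace("\\", "\\\\")
fixed_by_escaping = term_pattern.sub(escaped_template, excerpt)

# Fix 2 (hypothetical helper): a callable replacement is not escape-processed.
# Using match.group(0) also keeps the matched text's original casing.
def bold(match):
    return BOLD_START + match.group(0) + BOLD_END

fixed_by_callable = term_pattern.sub(bold, excerpt)

print(repr(broken))
print(fixed_by_escaping)
print(fixed_by_callable)
```

Both fixes write {\b Apple\b0} into the excerpt. The callable version has the side benefit that a case-insensitive match such as "apple" keeps its own spelling instead of being replaced by TERM.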
answered Oct 19 '25 02:10 by Robert


