RTF Output Shows "{ Apple0}" Instead of Bold "Apple" in LibreOffice on Ubuntu

Question

I'm writing a Python script to generate an RTF file containing text excerpts with a search term highlighted in bold. The script successfully generates the excerpts with the keyword, but the term isn’t bolded as intended.

Search Term: "Apple"

Expected Output in LibreOffice: Apple (bolded)

Actual Output in LibreOffice: "Apple0" (plain text with "0" appended)

Raw RTF Text: { Apple0} (viewed in a text editor)

I expect { Apple0} to be {\b Apple\b0}, where \b starts bold and \b0 ends it, per RTF syntax.

Here’s a simplified version of my Python code:

import re

TERM = "Apple"
RTF_HEADER = r"{\rtf1\ansi\ansicpg1252\deff0\nouicompat\deflang1033{\fonttbl{\f0\fswiss\fcharset0 Calibri;}}\f0\fs22\par"
RTF_FOOTER = r"}"
BOLD_START = r"{\b "
BOLD_END = r"\b0}"

excerpt = "This is an Apple test."
term_pattern = re.compile(rf"\b{TERM}\b", re.IGNORECASE)
bolded_term = BOLD_START + TERM + BOLD_END  # Intended: {\b Apple\b0}
excerpt_bolded = term_pattern.sub(bolded_term, excerpt)

with open("output.rtf", "w", encoding="utf-8") as f:
    f.write(RTF_HEADER + excerpt_bolded + RTF_FOOTER)

Robert · Accepted Answer

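Backslashes have special meaning in regular expressions, and that applies to the replacement string passed to sub() as well as to the pattern. You use \b for a word boundary in re.compile, but the RTF pieces you put together also contain backslashes for the RTF commands. In the replacement string, \b is processed as a backspace character (0x08) rather than as a literal backslash followed by "b", which is why the raw file shows { Apple0}. You need to escape each of those backslashes with another backslash so that they are written out literally.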

  BOLD_START = r"{\\b "  # re.sub turns \\ into one literal backslash
  BOLD_END = r"\\b0}"
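
If you would rather keep the RTF markers with a single backslash, one possible alternative is to pass a function as the replacement instead of a string. The return value of a callable is inserted as-is, without escape processing. The sketch below is only an illustration: the helper name bold_term is made up, and the use of re.escape and of the matched text (m.group(0)) in place of TERM are assumptions that go slightly beyond the original script.

```python
import re

TERM = "Apple"
BOLD_START = r"{\b "  # single backslash: the literal RTF we want in the file
BOLD_END = r"\b0}"


def bold_term(excerpt: str, term: str = TERM) -> str:
    """Wrap each whole-word, case-insensitive match of term in RTF bold markers."""
    # re.escape guards against search terms that contain regex metacharacters.
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    # The string returned by a callable replacement is not scanned for
    # backslash escapes, so \b stays a backslash followed by "b".
    return pattern.sub(lambda m: BOLD_START + m.group(0) + BOLD_END, excerpt)


print(bold_term("This is an Apple test."))
# This is an {\b Apple\b0} test.
```

Using m.group(0) also keeps the capitalization found in the excerpt, so "apple" in the text stays "apple" instead of being replaced by the search term's spelling.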

Tags: python, libreoffice, rtf

robbin olsson