I'm using doxygen to generate rtf output, but I need the rtf converted to docx in an automated way using python to run on a build system.
Input: example.rtf
Output: example.docx
I don't want to change any of the styling, formatting, or content. Just do a direct conversion. The same way it would be done manually by opening the .rtf in word and then doing SaveAs .docx
I spent a lot of time and energy trying to figure this out, so I thought I would post the question and solution for the community. It was actually very simple in the end, but took a long time to find the correct information to accomplish this.
This solution requires the following:
#convert rtf to docx and embed all pictures in the final document
def ConvertRtfToDocx(rootDir, file):
word = win32com.client.Dispatch("Word.Application")
wdFormatDocumentDefault = 16
wdHeaderFooterPrimary = 1
doc = word.Documents.Open(rootDir + "\\" + file)
for pic in doc.InlineShapes:
pic.LinkFormat.SavePictureWithDocument = True
for hPic in doc.sections(1).headers(wdHeaderFooterPrimary).Range.InlineShapes:
hPic.LinkFormat.SavePictureWithDocument = True
doc.SaveAs(str(rootDir + "\\refman.docx"), FileFormat=wdFormatDocumentDefault)
doc.Close()
word.Quit()
As rtf can't have embedded images, this also takes any images from RTF and embeds them into the resulting word docx, so there are no external image reference dependencies.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With