Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to convert RTF to Docx in Python

I'm using doxygen to generate rtf output, but I need the rtf converted to docx in an automated way using python to run on a build system.

  • Input: example.rtf

  • Output: example.docx

I don't want to change any of the styling, formatting, or content. Just do a direct conversion. The same way it would be done manually by opening the .rtf in word and then doing SaveAs .docx

like image 238
ls6777 Avatar asked Oct 17 '25 13:10

ls6777


1 Answers

I spent a lot of time and energy trying to figure this out, so I thought I would post the question and solution for the community. It was actually very simple in the end, but took a long time to find the correct information to accomplish this.

This solution requires the following:

  • Python 3.x
  • PyWin32 module
  • Windows 10 environment (haven't tried it with other flavors of Windows)
#convert rtf to docx and embed all pictures in the final document
def ConvertRtfToDocx(rootDir, file):
    word = win32com.client.Dispatch("Word.Application")
    wdFormatDocumentDefault = 16
    wdHeaderFooterPrimary = 1
    doc = word.Documents.Open(rootDir + "\\" + file)
    for pic in doc.InlineShapes:
        pic.LinkFormat.SavePictureWithDocument = True
    for hPic in doc.sections(1).headers(wdHeaderFooterPrimary).Range.InlineShapes:
        hPic.LinkFormat.SavePictureWithDocument = True
    doc.SaveAs(str(rootDir + "\\refman.docx"), FileFormat=wdFormatDocumentDefault)
    doc.Close()
    word.Quit()

As rtf can't have embedded images, this also takes any images from RTF and embeds them into the resulting word docx, so there are no external image reference dependencies.

like image 163
ls6777 Avatar answered Oct 21 '25 04:10

ls6777



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!