 

How to get text from a downloadable .doc file without saving it?

I'm trying to download a .doc file using requests.get() (I've heard about other methods, but they all seem to require saving the file too).

Is there a way to extract the text from it (or convert it to a .txt file, for example) directly, without saving it to a file first?

I've tried passing request.raw into various converters (docx2txt.process(), for example), but I assume they all work with files, not with streams.

schoolboychik asked Dec 07 '25 02:12



1 Answer

While the script is running, memory allocation is handled by the Python interpreter, but if you save the content to a file, the memory is allocated differently. This article may be helpful to you.

Link: article
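
To illustrate keeping the download in memory, here is a minimal sketch that wraps the downloaded bytes in io.BytesIO so they behave like a file without touching the disk. It assumes the document is actually a .docx (a ZIP archive containing word/document.xml); the legacy binary .doc format would need a different parser. The helper name docx_bytes_to_text and the example URL are made up for illustration, and only the standard library is used for the extraction step.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_bytes_to_text(data: bytes) -> str:
    """Extract paragraph text from .docx bytes held in memory (hypothetical helper)."""
    # BytesIO gives the bytes a file-like interface, so nothing is written to disk.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each w:t element holds one run of text; join the runs of a paragraph.
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)


# Assumed usage with requests (the URL is a placeholder, not a real file):
#   import requests
#   response = requests.get("https://example.com/file.docx")
#   response.raise_for_status()
#   text = docx_bytes_to_text(response.content)
```

The same idea should carry over to any converter that accepts a file-like object: pass it io.BytesIO(response.content) in place of a path. Whether a given converter accepts file-like objects is something to check in its documentation.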

Franz Gastring answered Dec 08 '25 16:12
