I'm trying to open a word document and just extract all the text that is in the document and display it to the user using Win32::OLE
#usr/bin/perl
#OLEWord.pl
#Use string and print warnings
use strict;use warnings;
#Using OLE + OLE constants for Variants and OLE enumeration for Enumerations
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Word';
$Win32::OLE::Warn = 3;
#set the file to be opened
my $file = '/work/Test.docx';
#Create a new instance of Win32::OLE for the Word application, die if could not open the application
my $MSWord = Win32::OLE->new('Word.Application','Quit') and "Opened Word" or die "Unable to open document ", Win32::OLE->LastError();
#Set the screen to Visible, so that you can see what is going on
$MSWord->{'Visible'} = 1;
#open the request file or die and print warning message
my $Doc = $MSWord->Documents->Open('C:\work\Test.docx') or die "Could not open ", $file, " Error:", Win32::OLE->LastError();
#$MSWord->ActiveDocument->SaveAs({Filename => 'AlteredTest.docx',
#FileFormat => wdFormatDocument});
sub ShowObjs {
my $obj = shift;
foreach (sort keys %$obj) {
print "Keys: $_ - $obj->{$_}\n"; }
}
my $paragraphs = $Doc->Paragraphs;
ShowObjs($paragraphs);
# Get and print the Text inside the opened file
my $paragraphs = $Doc->Paragraphs;
my $txt = $Doc->Range->Text;
print $txt;
$MSWord->ActiveDocument->Close;
$MSWord->Quit;
I'm getting this error code:
OLE exception from "Microsoft Word":
Command Failed
Win32::OLE(0.1709) error ox800a1066 in METHOD/PROPERTYGET "Open" at OLEWord.pl line 20
Update: I can open up the Word application fine, it's just when I try to open up the file that is the problem
I have few scripts to convert DOC to various output format using Win32::OLE. They usually start like this:
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Word';
my $wr = Win32::OLE->new('Word.Application')
or die "Failure - word. \n";
$wr->{DisplayAlerts} = wdAlertsNone;
$wr->{Visible} = 0;
my $Doc = $wr->Documents->Open({
FileName => $input_file_path,
ConfirmConversions => 0,
AddToRecentFiles => 0,
Revert => 0,
ReadOnly => 1,
OpenAndRepair => 0,
}) or exit;
...
Please note that $input_file_path has to contain absolute path to your file. You can also enable Visible and DisplayAlerts to see any error Word might give you.
Edit: You can traverse paragraphs using in enumerator:
use Win32::OLE qw(in);
...
my $paragraphs = $Doc->Paragraphs;
for my $par (in $paragraphs) {
print $par->Range->Text();
}
Or you can use Word's own exporting method and save document as one of supported formats:
$Doc->SaveAs({
FileName => 'c:\\work\\Test.txt',
FileFormat => wdFormatEncodedText,
});
The advantage of latter method is that formatting is retained if possible, which produces better results for bullets, numbering and such.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With