I have a number of text files and need to extract the first instance of some single lines, some consecutive lines, and some text between lines:
Document 1
Title of the document
(TOD)Release 3
Version 2Authors
Thomas E. Thomas, John L. John,
Fred A. Fred, Sandra K. SandraCompany A Address
More Authors
Page 3
From this example I need "Title of the Document (TOD)", 3, 2, and all the text between Authors and Page 3, not inclusive. I'm slowly learning, so I have some code snippets, but they don't go far enough. I can get a match, but I need only the first instance, and also the matching line together with the line after it:
File.open("sample.txt").each do |line|
  if line[/Document/]
    puts line
  end
end
I've tried to get intervening text but it's not quite right:
File.open("sample.txt").each do |line|
while gets
print if [/Authors/../Page/]
end
If you feel this is too much help to ask for I'd appreciate hints/pointers.
Rather than read the file line by line I think it would be easier to read in the whole thing then search through it with regex. Something like:
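text = File.read("sample.txt")
# everything between the first Document and the first Authors
# (.*? is lazy, so it stops at the first Authors rather than the last)
m1 = text.match(/Document(.*?)Authors/m)
# everything between the first Authors and the first Page
m2 = text.match(/Authors(.*?)Page/m)
# match returns nil when nothing is found, so guard before using the capture
puts m1[1] if m1
puts m2[1] if m2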
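To get the individual pieces the question asks for (title, release, version, and the author block), the same whole-file idea can be narrowed with capture groups. This is only a rough sketch that assumes every file follows the layout of the sample in the question; the method name extract_fields and the exact patterns are hypothetical and would likely need adjusting for real files:

```ruby
# Sketch only: extract_fields and its patterns are assumptions based on
# the single sample shown in the question, not a general-purpose parser.
def extract_fields(text)
  fields = {}

  # First line starting with "Document", then the next line (the title)
  # and the parenthesised abbreviation that opens the line after it.
  # String#match only ever returns the first match.
  if (m = text.match(/^Document.*\n(.*)\n(\(.*?\))/))
    fields[:title] = "#{m[1]} #{m[2]}"
  end

  # String#[] with a regex and a group index returns that capture of the
  # first match, or nil if there is no match.
  fields[:release] = text[/Release\s+(\d+)/, 1]
  fields[:version] = text[/Version\s+(\d+)/, 1]

  # Lazy .*? with the /m flag spans lines and stops at the first "Page 3"
  # after the first "Authors"; neither marker is included in the capture.
  fields[:authors] = text[/Authors(.*?)Page 3/m, 1]&.strip

  fields
end

sample = <<~TEXT
  Document 1
  Title of the document
  (TOD)Release 3
  Version 2Authors
  Thomas E. Thomas, John L. John,
  Fred A. Fred, Sandra K. SandraCompany A Address
  More Authors
  Page 3
TEXT

extract_fields(sample).each { |key, value| puts "#{key}: #{value.inspect}" }
```

For a real file you would pass in File.read("sample.txt") instead of the inline sample. Any field whose pattern does not match comes back as nil, so it is worth checking for that before using the values.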