Extracting various lines from file using ruby

Tags:

ruby

I have a number of text files and need to extract the first instance of some single lines, some consecutive lines, and some text between lines:

Document 1

Title of the document
(TOD)

Release 3
Version 2

Authors

Thomas E. Thomas, John L. John,
Fred A. Fred, Sandra K. Sandra

Company A Address

More Authors

Page 3

From this example I need "Title of the document (TOD)", 3, 2, and all the text between Authors and Page 3, not inclusive. I'm slowly learning, so I have some code snippets, but they don't go far enough. I can get a match, but I need only the first instance, and in some cases the matching line plus the line after it:

File.foreach("sample.txt") do |line|
  # prints every line containing "Document", not just the first
  puts line if line[/Document/]
end

I've tried to get intervening text but it's not quite right:

File.open("sample.txt").each do |line|
while gets
  print if [/Authors/../Page/]
end

If you feel this is too much help to ask for I'd appreciate hints/pointers.

chuckfinley asked Apr 28 '26 19:04


1 Answer

Rather than read the file line by line, I think it would be easier to read in the whole thing and then search through it with a regex. Something like:

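text = File.read("sample.txt")

# Everything between the first "Document" and the first "Authors".
# .*? is non-greedy; a greedy .* would run on to "More Authors".
m1 = text.match(/Document(.*?)Authors/m)

# Everything between the first "Authors" and the first "Page" after it.
m2 = text.match(/Authors(.*?)Page/m)

puts m2[1].strip if m2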
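
As a rough sketch of how the same whole-file idea could pull out all four pieces the question asks for: the helper name extract_fields and the hash keys are made up for illustration, and the sample is embedded inline only so the snippet runs on its own (with real files you would pass in File.read("sample.txt") instead). It assumes the title is always the two non-blank lines after the first "Document" line, which may not hold for every file.

```ruby
# Sample text copied from the question, inlined so the sketch is self-contained.
text = <<~DOC
  Document 1

  Title of the document
  (TOD)

  Release 3
  Version 2

  Authors

  Thomas E. Thomas, John L. John,
  Fred A. Fred, Sandra K. Sandra

  Company A Address

  More Authors

  Page 3
DOC

# Hypothetical helper (not from the original answer): returns the first
# match for each field, or nil when a field is missing.
def extract_fields(text)
  # First "Document" line, then the next two non-blank lines as the title.
  # Assumption: the title always spans exactly two lines.
  title_match = text.match(/^Document[^\n]*\n+(.+)\n(.+)$/)

  {
    title:   title_match && "#{title_match[1]} #{title_match[2]}",
    # String#[] with a regex and a group index returns the first match only.
    release: text[/^Release\s+(\d+)/, 1],
    version: text[/^Version\s+(\d+)/, 1],
    # Anchoring on ^Authors skips "More Authors"; .*? stops at the first
    # line that starts with "Page", so neither marker is included.
    authors: text[/^Authors\s*\n(.*?)^Page\b/m, 1]&.strip
  }
end

fields = extract_fields(text)
puts fields[:title]
puts fields[:release]
puts fields[:version]
puts fields[:authors]
```

Because match and String#[] both stop at the first hit, this gives the "first instance only" behaviour without any extra bookkeeping. If the files are very large, a line-by-line pass may be the better trade-off.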

Dty answered Apr 30 '26 07:04