
How to grep lines between two patterns in a big file with python

Tags: python, grep, lines

I have a very big file, like this:

[PATTERN1]
line1
line2
line3 
...
...
[END PATTERN]
[PATTERN2]
line1 
line2
...
...
[END PATTERN]

I need to extract, into another file, the lines between a variable start pattern (such as [PATTERN1]) and a fixed end pattern ([END PATTERN]), but only for certain start patterns.
For example:

[PATTERN2]
line1 
line2
...
...
[END PATTERN]

I already do the same thing, with a smaller file, using this code:

FILE=open('myfile').readlines()

newfile=[]
for n in name_list:
    A = FILE[[s for s,name in enumerate(FILE) if n in name][0]:]
    B = A[:[e+1 for e,end in enumerate(A) if 'END PATTERN' in end][0]]
    newfile.append(B)

Here name_list is a list of the specific start patterns that I need.

It works, but I suppose there is a better way to do this for big files, without calling .readlines().
Can anyone help me?

Thanks a lot!

user1474510 asked Dec 20 '25 11:12


1 Answer

Consider:

# hi
# there
# begin
# need
# this
# stuff
# end
# skip
# this

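with open(__file__) as fp:
    # Skip everything up to and including the '# begin' line.
    for line in iter(fp.readline, '# begin\n'):
        pass
    # Print each line until the '# end' line is reached.
    # Caveat: if a marker line is missing, the loop never sees it and does not end.
    for line in iter(fp.readline, '# end\n'):
        print(line, end='')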

This prints the three lines between the markers: "# need", "# this" and "# stuff".
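
The two-argument form iter(callable, sentinel) calls the callable repeatedly and stops as soon as it returns the sentinel. As a minimal sketch of the same idea in reusable form, the generator below uses the empty string (which readline() returns at end of file) as the sentinel, so it also stops when a marker is missing. The name lines_between and the sample text are hypothetical, and marker lines are assumed to match exactly, including the trailing newline.

```python
import io

def lines_between(fp, begin, end):
    """Yield the lines strictly between the first `begin` line and the next `end` line."""
    # Skip up to and including the begin marker; '' signals end of file.
    for line in iter(fp.readline, ''):
        if line == begin:
            break
    # Yield lines until the end marker (or end of file) is reached.
    for line in iter(fp.readline, ''):
        if line == end:
            return
        yield line

sample = io.StringIO("hi\nbegin\nneed\nthis\nstuff\nend\nskip\n")
print(list(lines_between(sample, "begin\n", "end\n")))
# prints ['need\n', 'this\n', 'stuff\n']
```

Because only one line is held in memory at a time, the same function should work on a file of any size when given a real file object instead of the StringIO sample.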

A more flexible approach (for example, one that allows regular-expression matching) is to use itertools.dropwhile and itertools.takewhile:

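import itertools

with open(__file__) as fp:
    # Keeps the first line containing 'begin'; stops before the next line containing 'end'.
    result = list(itertools.takewhile(lambda x: 'end' not in x,
        itertools.dropwhile(lambda x: 'begin' not in x, fp)))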
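
One possible reading of the regular-expression remark, sketched here rather than taken from the answer: compile a pattern for the start line and one for the end line, and use them as the predicates. The patterns and the sample text are assumptions modelled on the question's file layout.

```python
import io
import itertools
import re

# Hypothetical patterns: a line that is exactly "[PATTERN2]" or "[END PATTERN]".
begin_re = re.compile(r"^\[PATTERN2\]\s*$")
end_re = re.compile(r"^\[END PATTERN\]\s*$")

text = io.StringIO("[PATTERN1]\na\n[END PATTERN]\n[PATTERN2]\nb\nc\n[END PATTERN]\n")

# Drop lines until begin_re matches, then take lines until end_re matches.
block = list(itertools.takewhile(
    lambda line: not end_re.match(line),
    itertools.dropwhile(lambda line: not begin_re.match(line), text)))
print(block)
# prints ['[PATTERN2]\n', 'b\n', 'c\n']
```

Note that the start line is kept and the end line is excluded, and that only the first matching block is returned.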

georg answered Dec 21 '25 23:12
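
Applied to the question's file layout, the streaming idea can be sketched as a single pass that copies every wanted block to an output file object. This is an illustration under stated assumptions, not the answerer's code: the function name extract_blocks is hypothetical, start patterns are matched by substring as in the question's own code, and each copied block includes its start and end lines.

```python
import io

def extract_blocks(src, dst, starts, end_marker="END PATTERN"):
    """Copy each block whose first line contains one of `starts` from src to dst."""
    copying = False
    for line in src:  # file objects iterate lazily, one line at a time
        if not copying and any(s in line for s in starts):
            copying = True  # a wanted block begins on this line
        if copying:
            dst.write(line)
            if end_marker in line:
                copying = False  # block finished, resume scanning

src = io.StringIO(
    "[PATTERN1]\nline1\n[END PATTERN]\n"
    "[PATTERN2]\nline1\nline2\n[END PATTERN]\n")
dst = io.StringIO()
extract_blocks(src, dst, ["[PATTERN2]"])
print(dst.getvalue(), end="")
```

With real files, src and dst would be the objects returned by open() for the input and output paths. Unlike the question's code, this reads the input only once regardless of how many start patterns are given, and it copies every matching block in file order rather than only the first match per pattern.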


