
Grab multiple non-contiguous lines with python

Tags: python, text

I have a text file in which I want to find every instance of "Acct Nbr:" and, for each one, grab the preceding "Queue Number:" and "Queue Description:" lines, plus the application line (under the "Appl Nbr" heading), which is normally the 5th line after "Queue Number:". It won't always be the 5th line if other apps are listed before it. This line always starts with 3 columns of numbers, for example "5024 17 110" or "964 16 100".

As you can see, there are sometimes multiple "Acct Nbr:" entries per queue, but usually there is only one. If possible, it would be nice to list multiple "Acct Nbr:" entries under a single "Queue Number:" line. The file contains thousands of entries, and I only need the errors identified by "Acct Nbr:", but for each error I need the queue number, queue description and appl nbr. I hope I'm being clear.

I marked the desired lines with "->". I would like to use Python, but I'm open to other scripting languages such as PowerShell, or to GNU utilities such as grep.

Thank you for your time and attention.

Sample of text to parse:

->Queue Number: 87125  Queue Effective Date:  09-05-2012  Queue Scheduled Date:  09-05-2012

->Queue Description: **Posting File** Processing  Queue Type Description: PM File Load Copy

  Appl QSub  Seq  Appl                   Appl      Return   Start     Stop   Time of Run

  Nbr  Nbr   Nbr  Description            Name      Code     Time      Time   (In Minutes)

->386  2     0    Inclearing Processing  AH_CLEAR  0      12:07:21  12:07:56       0.583

  Procedure Complete
  ***************  Batch Application Errors  ***************

->Acct Nbr: 0000000000      Batch Actv Msg: Bank Check Not Processed - Validation Error

->Batch Oracle Msg:

->Acct Nbr: 0000000000      Batch Actv Msg: Bank Check Not Processed - Validation Error

->Batch Oracle Msg:

Desired output:

Queue Number: 87125  Queue Effective Date:  09-05-2012  Queue Scheduled Date:  09-05-2012

Queue Description: **Posting File** Processing  Queue Type Description:  PM File Load Copy

386    2      0  Inclearing Processing     AH_CLEAR  0      12:07:21  12:07:56       0.583

Acct Nbr: 0000000000      Batch Actv Msg: Bank Check Not Processed - Validation Error
Batch Oracle Msg:

Acct Nbr: 0000000000      Batch Actv Msg: Bank Check Not Processed - Validation Error
Batch Oracle Msg:
asked Mar 23 '26 21:03 by user1669104


1 Answer

Iterate over the lines with a small state machine that remembers the most recent Queue Number and Queue Description lines. When you reach an Acct Nbr line, use the stored values to generate your output.

Quick untested example, please adapt to your own needs:

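class State(object):
    """Remembers the most recent queue header lines while scanning."""

    def __init__(self):
        self.qnumb = None  # latest "Queue Number" line
        self.qdesc = None  # latest "Queue Description" line

    def feed(self, line):
        # Ignore surrounding whitespace so indented lines still match.
        line = line.strip()
        if line.startswith('Queue Number'):
            self.qnumb = line
        elif line.startswith('Queue Description'):
            self.qdesc = line
        elif line.startswith('Acct Nbr'):
            # An error line: return it with the stored queue context.
            return line, self.qnumb, self.qdesc
        return None


def parse(lines):
    """Return a list of (acct_line, queue_number_line, queue_description_line)."""
    results = []
    s = State()
    for line in lines:
        entry = s.feed(line)
        if entry:
            results.append(entry)
    return results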
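
The question also asks for the Appl Nbr line and for several Acct Nbr entries grouped under one queue, so here is a rough sketch of how the same state-machine idea could be extended. It rests on assumptions: the real file has no "->" markers, an application row is any line that starts with three integer columns, and errors belong to the most recent application row. The names `APPL_ROW`, `parse_errors`, `format_records` and `SAMPLE` are made up for illustration.

```python
import re

# Assumption: an application row starts with three integer columns
# (Appl Nbr, QSub Nbr, Seq Nbr), e.g. "386  2  0 ..." or "5024 17 110".
APPL_ROW = re.compile(r"^\d+\s+\d+\s+\d+(?=\s|$)")

SAMPLE = """\
Queue Number: 87125  Queue Effective Date:  09-05-2012  Queue Scheduled Date:  09-05-2012
Queue Description: **Posting File** Processing  Queue Type Description: PM File Load Copy
  Appl QSub  Seq  Appl                   Appl      Return   Start     Stop   Time of Run
  Nbr  Nbr   Nbr  Description            Name      Code     Time      Time   (In Minutes)
386  2     0    Inclearing Processing  AH_CLEAR  0      12:07:21  12:07:56       0.583
  Procedure Complete
  ***************  Batch Application Errors  ***************
Acct Nbr: 0000000000      Batch Actv Msg: Bank Check Not Processed - Validation Error
Batch Oracle Msg:
Acct Nbr: 0000000000      Batch Actv Msg: Bank Check Not Processed - Validation Error
Batch Oracle Msg:
"""


def parse_errors(lines):
    """Return one record per application row that has 'Acct Nbr:' errors."""
    records = []
    qnumb = qdesc = appl = None
    current = None  # record that is currently collecting errors, if any
    for raw in lines:
        line = raw.strip()
        if line.startswith("Queue Number:"):
            # A new queue starts: forget everything from the previous one.
            qnumb, qdesc, appl, current = line, None, None, None
        elif line.startswith("Queue Description:"):
            qdesc = line
        elif APPL_ROW.match(line):
            # Remember the latest application row; later errors belong to it.
            appl, current = line, None
        elif line.startswith("Acct Nbr:"):
            if current is None:
                current = {"queue_number": qnumb, "queue_description": qdesc,
                           "appl": appl, "errors": []}
                records.append(current)
            current["errors"].append([line])
        elif line.startswith("Batch Oracle Msg:") and current is not None:
            # Attach the Oracle message to the preceding Acct Nbr line.
            current["errors"][-1].append(line)
    return records


def format_records(records):
    """Render records roughly in the layout of the desired output."""
    out = []
    for rec in records:
        header = (rec["queue_number"], rec["queue_description"], rec["appl"])
        out.extend(h for h in header if h)  # skip context lines never seen
        for err in rec["errors"]:
            out.append("")
            out.extend(err)
        out.append("")
    return "\n".join(out)


if __name__ == "__main__":
    print(format_records(parse_errors(SAMPLE.splitlines())))
```

For a real file you could pass the open file object straight to `parse_errors`, since it only iterates over lines. Queues without any Acct Nbr line produce no record at all, which keeps the output limited to the errors.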
answered Mar 26 '26 10:03 by Ali Afshar


