
lxml: Obtain the current line number of the file when calling etree.iterparse(f)

Since no one answered or commented on this post, I decided to rewrite it.

Consider the following Python code using lxml:

treeIter = etree.iterparse(fObj)
for event, ele in treeIter:
    if ele.tag == 'logRoot':
        try:
            somefunction(ele)
        except InternalException as e:
            e.handle(*args)
    ele.clear()

InternalException is a user-defined exception that wraps every exception raised by somefunction() except lxml.etree.XMLSyntaxError. It has a well-defined handler method, .handle().

fObj has "trueRoot" as its top-level tag and many "logRoot" elements as second-level leaves.

My question is: is there a way to record the current line number of the file when handling the exception e? *args can be replaced by any available arguments.

Any suggestions are much appreciated.

Patrick the Cat asked Sep 02 '25 10:09



1 Answer

import lxml.etree as ET
import io

def div(x):
    return 1/x

# io.BytesIO requires bytes on Python 3, so the XML literal is a bytes string
content = b'''\
    <trueRoot>
      <logRoot a1="x1"> 2 </logRoot>
      <logRoot a1="x1"> 1 </logRoot>
      <logRoot a1="x1"> 0 </logRoot>
    </trueRoot>
    '''
for event, elem in ET.iterparse(io.BytesIO(content), events=('end', ), tag='logRoot'):
    num = int(elem.text)
    print('Calling div({})'.format(num))
    try:
        div(num)
    except ZeroDivisionError as e:
        print('Ack! ZeroDivisionError on line {}'.format(elem.sourceline))

prints

Calling div(2)
Calling div(1)
Calling div(0)
Ack! ZeroDivisionError on line 4
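
The same idea can be carried over to the loop in the question: each element yielded by iterparse has a sourceline attribute, which you can pass to the handler before clearing the element. Below is a rough sketch of that. Note that InternalException, its handle(line) signature, somefunction() and process() are stand-ins I made up for illustration, since the question does not show the real ones.

```python
import io
import lxml.etree as ET


class InternalException(Exception):
    """Hypothetical stand-in for the user-defined exception in the question."""

    def handle(self, line):
        # Assumed signature: the question leaves the arguments of handle() open.
        return 'InternalException on line {}: {}'.format(line, self)


def somefunction(elem):
    # Placeholder for the real processing; wraps any failure in InternalException.
    try:
        return 1 / int(elem.text)
    except Exception as exc:
        raise InternalException(str(exc))


def process(f_obj):
    """Parse f_obj and collect one handler message per failing logRoot element."""
    messages = []
    for event, elem in ET.iterparse(f_obj, events=('end',), tag='logRoot'):
        try:
            somefunction(elem)
        except InternalException as e:
            # sourceline is the line number the parser recorded for this element
            messages.append(e.handle(elem.sourceline))
        finally:
            # Free the element's content once it has been handled
            elem.clear()
    return messages


content = b'''\
<trueRoot>
  <logRoot a1="x1"> 2 </logRoot>
  <logRoot a1="x1"> 0 </logRoot>
</trueRoot>
'''
print(process(io.BytesIO(content)))
```

Read elem.sourceline inside the except block while you still hold the element; the line number refers to the element being processed, not to how far the parser has read ahead in the file.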
unutbu answered Sep 04 '25 22:09
