
Trouble with data types after scraping a website with lxml and xpath

Tags: python, xpath, lxml

I'm scraping a website and pulling out numbers. The problem is that when I try to run logic on the data in Python, its type comes back as

class 'lxml.etree._ElementStringResult'

My question is: can I typecast this data into a string or an int so that my logic statements work?

Here is the code:

callType = item.xpath('.//span[contains(@id, "lblSignal")]')[0].text_content()

print(callType)

Here is the output:

76

When I use the value in control statements, nothing happens. I think it's because I'm applying logic to the wrong type.

callType = item.xpath('.//span[contains(@id, "lblSignal")]')[0].text_content()
print(type(callType))
print(callType)

This is my output:

<class 'lxml.etree._ElementStringResult'>
76

So instead of an int, my control statements are working with a different type. I've tried typecasting the variable, but it keeps the same datatype. Hope this helps...

Asked by Dunnage, Oct 15 '25 13:10


1 Answer

xpath() (and text_content(), as in your code) may return _ElementStringResult objects rather than plain Python strings. They are still strings, since the class is a string subclass. The difference is that, unlike a plain str, each one remembers its parent element, which is available through its getparent() method.

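You can convert one to a plain string or an integer by passing it to str() or int(). The conversion returns a new object and leaves the original untouched, so assign the result to a variable. Until you do, a comparison against a number will not match, because the string "76" is never equal to the integer 76.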

# Convert each matching span's text to an int before using it in comparisons
for span in item.xpath('.//span[contains(@id, "lblSignal")]'):
    callType = int(span.text_content())
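To illustrate why the comparisons in the question quietly do nothing and how the conversion fixes them, here is a minimal Python 3 sketch. It deliberately avoids lxml so it runs anywhere: SmartString is a hypothetical stand-in for the string subclass lxml returns, and to_int is an illustrative helper. Neither name is part of lxml.

```python
# Hypothetical stand-in for lxml's string result type: a str subclass that
# also remembers a parent. It is NOT lxml's class, only an illustration.
class SmartString(str):
    def __new__(cls, value, parent=None):
        obj = super().__new__(cls, value)
        obj._parent = parent
        return obj

    def getparent(self):
        return self._parent


def to_int(text, default=None):
    """Convert scraped text to an int; return `default` if it is not numeric."""
    try:
        return int(text.strip())
    except ValueError:
        return default


# Pretend this value came back from the scrape.
call_type = SmartString(" 76 ", parent="<span id='lblSignal'>")

print(type(call_type).__name__)   # the str subclass, not int
print(call_type == 76)            # False: a string never equals an int

# The conversion returns a new object, so keep the result.
signal = to_int(call_type)
print(type(signal).__name__, signal > 50)
```

With the converted value in hand, ordinary control statements such as if signal > 50 behave as expected. Returning a default for non-numeric text is one possible design choice; letting the ValueError propagate is equally reasonable if bad data should stop the scrape.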

Answered by unutbu, Oct 18 '25 22:10


