
Sorting a list of BeautifulSoup tags using Python

I used the script below to extract a list of URLs:

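import urllib2
from bs4 import BeautifulSoup  # assumed import; the original post does not show it

request = urllib2.Request("http://www.dummyurl.com")
pub_lv1 = urllib2.urlopen(request)
pub_lv1_parse = BeautifulSoup(pub_lv1)
# Narrow to the year table, then collect every link inside it
pub_lv1_parse = pub_lv1_parse.body.find('table', attrs={"class":"proxy-archive-content-year-list"})
pub_lv1_parse = pub_lv1_parse.findAll('a')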

The output is as below:

[<a href="/content/by/year/2011">2011</a>,
 <a href="/content/by/year/2012">2012</a>,
 <a href="/content/by/year/2013">2013</a>,
 <a href="/content/by/year/2000">2000</a>,
 <a href="/content/by/year/2001">2001</a>,
 <a href="/content/by/year/2002">2002</a>,
 <a href="/content/by/year/2003">2003</a>,
 <a href="/content/by/year/2004">2004</a>,
 <a href="/content/by/year/2005">2005</a>]

As you can see, the years are not in order, and I want to sort them. I know how to sort a list of strings using sort, but how do I sort the output from BeautifulSoup?

lokheart asked Jun 07 '26 16:06



1 Answer

Sort by element text:

sorted(pub_lv1_parse, key=lambda elem: elem.text)
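
The key compares the link text as strings, which gives the right order here because every year has four digits. If the text lengths could differ, converting to int in the key is a safer variant. The sketch below is only an illustration: Link is a hypothetical stand-in for a BeautifulSoup tag (it has the same .text attribute), so it runs without fetching a page.

```python
from collections import namedtuple

# Hypothetical stand-in for a BeautifulSoup Tag; only .text is used by the sort key.
Link = namedtuple("Link", ["href", "text"])

years = ["2011", "2012", "2013", "2000", "2001", "2002", "2003", "2004", "2005"]
links = [Link("/content/by/year/" + y, y) for y in years]

# Same key as above: compares the link text as strings.
by_text = sorted(links, key=lambda elem: elem.text)

# Variant: compares the link text as integers, so "999" would sort before "2000".
by_year = sorted(links, key=lambda elem: int(elem.text))

print([link.text for link in by_year])
```

With the real result, the same key functions can be passed to sorted(pub_lv1_parse, ...) directly. sorted returns a new list and leaves the original untouched.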
falsetru answered Jun 10 '26 05:06



