
Selenium get next tr given condition holds for previous tr

Set-up

I'm trying to scrape the infoboxes of French regions on Wikipedia.

Specifically, I need the population of each region, which is stated in the infobox on that region's wiki page; see, e.g., https://en.wikipedia.org/wiki/Mayotte.


HTML

For the example page, the part of the infobox HTML I'm interested in looks as follows:

<tr class="mergedtoprow">
   <th colspan="2" style="text-align:center;text-align:left">Area
       <div style="font-weight:normal;display:inline;"></div></th></tr>
<tr class="mergedrow">
   <th scope="row">&nbsp;•&nbsp;Total</th> 
       <td>374&nbsp;km<sup>2</sup> (144&nbsp;sq&nbsp;mi)</td></tr>
<tr class="mergedtoprow">
   <th colspan="2" style="text-align:center;text-align:left">
       Population 
       <div style="font-weight:normal;display:inline;">
            (2017)
            <sup id="cite_ref-census_1-0" class="reference">
                 <a href="#cite_note-census-1">[1]</a>
            </sup>
       </div>
   </th>
</tr>
<tr class="mergedrow">
   <th scope="row">&nbsp;•&nbsp;Total</th>
   <td>256,518</td>
</tr>

I need to get the population number 256,518.


Code

My plan is to select the tr containing the string 'Population' and then tell Selenium to select the tr that follows it.

The following code selects the tr containing the string 'Population':

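# find_element (singular) returns one element; find_elements would return a list
info_box = browser.find_element_by_css_selector('.infobox').find_element_by_xpath('tbody')

# Walk the direct tr children of the infobox body
for row in info_box.find_elements_by_xpath('./tr'):
    if 'Population' in row.text:
        print(row)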

Now, how do I tell Selenium to select the tr after the selected tr?

LucSpan asked Nov 19 '25 08:11


2 Answers

There's no need to iterate over all rows. You can select the required row directly with XPath.

Try this line to get the required output:

population = driver.find_element_by_xpath('//tr[contains(th, "Population")]/following-sibling::tr/td').text
print(population)
#  256,518
Andersson answered Nov 20 '25 21:11
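
For readers on a recent Selenium release: the find_element_by_* helpers used in this thread were deprecated in Selenium 4 and removed in 4.3, so the XPath answer above needs the generic find_element(by, value) call instead. Below is a minimal sketch of that, not a tested scraper. The names POPULATION_XPATH and get_population are made up for illustration, and the [1] index (take only the immediately following row) is my addition to the XPath from the answer. The locator strategy is passed as the plain string "xpath", which is the value of By.XPATH, so the snippet itself does not import Selenium.

```python
# XPath from the answer above, with [1] added (an assumption on my part)
# so that only the row directly after the "Population" header row matches.
POPULATION_XPATH = (
    '//tr[contains(th, "Population")]'
    '/following-sibling::tr[1]/td'
)


def get_population(driver):
    """Return the text of the cell in the row after the 'Population' header.

    `driver` is expected to be a Selenium 4 WebDriver that has already
    loaded the region's page. "xpath" is the value of
    selenium.webdriver.common.by.By.XPATH.
    """
    return driver.find_element("xpath", POPULATION_XPATH).text
```

Assuming driver has already loaded the region page with driver.get(...), get_population(driver) should return the cell text as a string. The live Wikipedia markup may differ from the HTML quoted in the question, so the XPath may need adjusting.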


I think this should be good enough:

info_box = browser.find_element_by_css_selector('.infobox').find_element_by_xpath('tbody')
tr_data = info_box.find_elements_by_xpath('./tr')
# Stop one short of the end so that tr_data[row + 1] always exists
for row in range(len(tr_data) - 1):
    if 'Population' in tr_data[row].text:
        print(tr_data[row + 1].text)
        break
Nihal answered Nov 20 '25 22:11
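
If you want to check the "find the matching row, then take the next one" logic without starting a browser, here is a rough offline sketch. It swaps Selenium for Python's standard html.parser and runs against a trimmed copy of the HTML quoted in the question, not the live page. RowCollector, table_rows and value_after are hypothetical names invented for this example.

```python
from html.parser import HTMLParser

# Trimmed copy of the infobox rows quoted in the question.
SAMPLE_HTML = """
<table class="infobox"><tbody>
<tr class="mergedtoprow"><th colspan="2">Area<div></div></th></tr>
<tr class="mergedrow"><th scope="row">&nbsp;•&nbsp;Total</th>
    <td>374&nbsp;km<sup>2</sup> (144&nbsp;sq&nbsp;mi)</td></tr>
<tr class="mergedtoprow"><th colspan="2">Population
    <div>(2017)<sup class="reference"><a href="#cite_note-census-1">[1]</a></sup></div></th></tr>
<tr class="mergedrow"><th scope="row">&nbsp;•&nbsp;Total</th><td>256,518</td></tr>
</tbody></table>
"""


class RowCollector(HTMLParser):
    """Collect every <tr> as a list of its <th>/<td> cell texts."""

    def __init__(self):
        super().__init__()
        self.rows = []     # one list of cell strings per <tr>
        self._cell = None  # text chunks of the cell being read, else None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("th", "td") and self.rows:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            # Join the chunks and collapse whitespace (including &nbsp;).
            self.rows[-1].append(" ".join("".join(self._cell).split()))
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def table_rows(html):
    """Parse html and return its rows as lists of cell strings."""
    parser = RowCollector()
    parser.feed(html)
    return parser.rows


def value_after(rows, label):
    """Return the last cell of the row that follows the first row mentioning label."""
    # zip pairs each row with its successor, so the last row is never indexed past.
    for current, following in zip(rows, rows[1:]):
        if any(label in cell for cell in current):
            return following[-1] if following else None
    return None


print(value_after(table_rows(SAMPLE_HTML), "Population"))  # prints 256,518
```

Pairing each row with its successor via zip avoids the index-out-of-range case you would hit with row + 1 when the matching row happens to be the last one.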


