
Scraping from dropdown option value Python BeautifulSoup

I am trying to scrape data with BeautifulSoup from a web page that has a dropdown input.

This is the dropdown markup:

<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>

This is what I tried:

soup = BeautifulSoup(url, 'html.parser')
soup['selected'] = 'G1'
data = soup.findAll("table", {"style": "font-size:14px"})
print(data)

Each time a dropdown option is submitted, the page returns its data in a <table> tag.

However, my code only returns the <table> from the main page. How do I get the data for each dropdown option?

Ilham Riski asked Sep 02 '25 13:09


2 Answers

Try an attribute CSS selector

soup.select('option[value]')

The [] is an attribute selector. Here it matches option elements that have a value attribute. If the dropdown has a parent class or id, adding it to the selector helps in case the page contains more than one dropdown.

items = soup.select('option[value]')
values = [item.get('value') for item in items]
textValues = [item.text for item in items]
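To make that concrete, here is a minimal sketch that runs the selector against the markup from the question. The `html` string is copied from the question (including its `<selected>` spelling), and `html.parser` is assumed as the parser.

```python
from bs4 import BeautifulSoup

# Markup copied from the question (the tag is spelled <selected> there).
html = """
<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>
"""

soup = BeautifulSoup(html, "html.parser")

# Every <option> element that carries a value attribute.
items = soup.select("option[value]")
values = [item.get("value") for item in items]
text_values = [item.text for item in items]

print(values)       # ['G1', 'G2']
print(text_values)  # ['1', '2']
```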

To limit the match to one dropdown, put the parent's name attribute in front, joined by a descendant combinator (a space). You will need to test whether this is specific enough for your page:

items = soup.select('[name=try] option[value]')
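The effect of the name filter is easier to see on a page with more than one dropdown. The sketch below uses a made-up page with a second dropdown named `other` (hypothetical, not from the question) and assumes the standard `<select>` tag; the selector itself only depends on the name attribute.

```python
from bs4 import BeautifulSoup

# Hypothetical page with two dropdowns; only the first is named "try".
html = """
<select name="try">
  <option value="G1">1</option>
  <option value="G2">2</option>
</select>
<select name="other">
  <option value="X1">A</option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")

# Unscoped: options from every dropdown on the page.
all_values = [o["value"] for o in soup.select("option[value]")]

# Scoped: only options that are descendants of the element named "try".
scoped_values = [o["value"] for o in soup.select('[name="try"] option[value]')]

print(all_values)     # ['G1', 'G2', 'X1']
print(scoped_values)  # ['G1', 'G2']
```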
QHarr answered Sep 05 '25 03:09


You can still use find() and findAll() to do this.

from bs4 import BeautifulSoup

html = """
<table style="font-size:14px">
<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>
</table>
"""

soup = BeautifulSoup(html, "lxml")

# Reach the options through the dropdown's name attribute...
option = soup.find("selected", {"name": "try"}).findAll("option")
# ...or through the enclosing table's style attribute.
option_ = soup.find("table", {"style": "font-size:14px"}).findAll("option")
print(option)
print(option_)
# [<option value="G1">1</option>, <option value="G2">2</option>]
# [<option value="G1">1</option>, <option value="G2">2</option>]
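Listing the options does not by itself load the table for each one, because BeautifulSoup only parses HTML it is given and never submits the form. One possible way to go further, as a rough sketch: collect the option values, request the page once per value, and parse each response. The helper names `option_values` and `tables_per_option`, the `fetch` callable, and the `fake_fetch` stand-in are all hypothetical; how the real site expects the value to be submitted is an assumption you would have to check in the browser's network tab.

```python
from bs4 import BeautifulSoup


def option_values(html, select_name):
    """Return the value of every option inside the element with this name."""
    soup = BeautifulSoup(html, "html.parser")
    return [o["value"] for o in soup.select(f'[name="{select_name}"] option[value]')]


def tables_per_option(main_html, select_name, fetch):
    """Map each option value to the tables found on the page fetched for it.

    fetch is a caller-supplied function: it takes one option value and
    returns the HTML the site responds with for that value.
    """
    result = {}
    for value in option_values(main_html, select_name):
        page = BeautifulSoup(fetch(value), "html.parser")
        result[value] = page.find_all("table", {"style": "font-size:14px"})
    return result


# Main page markup as given in the question.
main_html = """
<selected name="try">
<option value="G1">1</option>
<option value="G2">2</option>
</selected>
"""


def fake_fetch(value):
    # Stand-in for a real HTTP request, so the sketch runs offline.
    return f'<table style="font-size:14px"><tr><td>data for {value}</td></tr></table>'


result = tables_per_option(main_html, "try", fake_fetch)
for value, tables in result.items():
    print(value, [t.get_text(strip=True) for t in tables])
```

Against a real site, `fetch` could be something like `lambda v: requests.post(url, data={"try": v}).text`, assuming the form is submitted as a POST with the dropdown's name as the field name. If the page fills the table with JavaScript instead, a plain request will not be enough.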
KC. answered Sep 05 '25 01:09