BeautifulSoup: a lot of whitespace in extracted text

Question

I am scraping a web page with BeautifulSoup. The page has the following source:

<td>
<a href="http://aaa.com">Charles</a>
                         (hello)
                            </td>,
<td>
<a href="http://bbb.com">Diane</a>
                           (hi)
                            </td>,
<td>
<a href="http://ccc.com">Kevin</a>
                           (how are you doing)
                            </td>

I use the following code to print the two values, and it works fine:

for item in soup.find_all("td"):
    print item.find('a').text
    print item.find('a').next_sibling

The problem is that when I save the output to a CSV file, the second column appears to have no value. This seems to be because the text is surrounded by a lot of whitespace. Any suggestions? Thanks in advance.

alecxe · Accepted Answer

Find all the next text siblings, join them and strip:

"".join(item.find('a').find_next_siblings(text=True)).strip()
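The text after each link is a single text node padded with newlines and spaces, which is why the raw next_sibling value looks empty in a spreadsheet. Below is a minimal, self-contained sketch of the approach above, written for Python 3 rather than the Python 2.7 of the question. The HTML sample is copied from the question and wrapped in a table; the helper name extract_rows and the in-memory CSV buffer are illustrative assumptions, not part of the original answer. It uses string=True, which is the spelling recent Beautiful Soup releases use for text=True.

```python
import csv
import io

from bs4 import BeautifulSoup

# Sample markup from the question, wrapped in a table for completeness.
HTML = """
<table><tr>
<td>
<a href="http://aaa.com">Charles</a>
                         (hello)
                            </td>
<td>
<a href="http://bbb.com">Diane</a>
                           (hi)
                            </td>
<td>
<a href="http://ccc.com">Kevin</a>
                           (how are you doing)
                            </td>
</tr></table>
"""


def extract_rows(html):
    """Return (link text, trailing text) pairs, one per <td> (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.find_all("td"):
        link = item.find("a")
        if link is None:
            continue  # skip cells without a link
        # Collect every text node that follows the link, then trim the padding.
        trailing = "".join(link.find_next_siblings(string=True)).strip()
        rows.append((link.get_text(strip=True), trailing))
    return rows


rows = extract_rows(HTML)

# Write to an in-memory buffer here; a real script would open a file instead.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

Stripping before writing means the second column holds only the parenthesised text, so it shows up in the CSV as expected.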

Tags:

python

html-parsing

beautifulsoup

python-2.7

kevin
