Log in to a website through a web-scraping tool in Python

I am using Selenium webdriver in Python for a web-scraping project.

I would like to log in by entering the login details and then clicking the submit button.

I am able to enter the username and password, but I am not able to click the submit button.

The "submit" button is an <input> element of type "image":

<input type="image" src="/images/buttons/loginnow.gif" tabindex="3">

Here is the Python code where I try to click the button:

submitButton=driver.find_element_by_xpath("//input[@type='image'][@src='/images/buttons/loginnow.gif']")
driver.click(submitButton)

I get the following error:

AttributeError: 'WebDriver' object has no attribute 'click'

Any idea how to fix it, or is there an alternative way to log in to a website in Python?

Thanks

Kiran asked Nov 29 '11 at 01:11



2 Answers

I had good luck using mechanize. It's pretty straightforward and simple to use.

Here's a stripped-down version of a script I made:

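# Python 2-era imports: BeautifulSoup 3 and cookielib
from BeautifulSoup import BeautifulSoup
from tidylib import tidy_document

import mechanize
import cookielib

if __name__ == '__main__':
  browser = mechanize.Browser()

  # Keep cookies so the session persists after logging in
  cookiejar = cookielib.LWPCookieJar()
  browser.set_cookiejar(cookiejar)

  # Handle http-equiv headers, redirects, and referers; ignore robots.txt
  browser.set_handle_equiv(True)
  browser.set_handle_redirect(True)
  browser.set_handle_referer(True)
  browser.set_handle_robots(False)

  browser.open('https://www.example.com/')

  # Select the login form by its name attribute and fill in the fields
  browser.select_form(name='loginform')
  browser['username'] = 'foo'
  browser['password'] = 'bar'

  # Submit the form directly; no click on the image button is needed
  browser.submit()

  # Follow a link on the page that loads after logging in
  browser.open(browser.click_link(text='Link text'))

  # Clean up the HTML with tidylib, then parse it
  soup = BeautifulSoup(tidy_document(browser.response().read())[0])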

You don't really need to click on the image. You just need to fill out the appropriate form fields and call submit() on the form.

Also, if you won't be parsing anything, just get rid of the BeautifulSoup and tidylib dependencies.
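
The same "fill in the form and submit it" idea should also work if you stay with Selenium. Here is a minimal sketch, assuming the login fields are named "username" and "password" (these names are borrowed from the mechanize example above and are not taken from the asker's page). The helper name log_in is made up for illustration.

```python
def log_in(driver, username, password):
    """Fill in the login form and submit it without clicking the image button.

    `driver` is expected to be a Selenium WebDriver. The field names
    "username" and "password" are assumptions; adjust them to the real form.
    """
    # "name" is the value of selenium.webdriver.common.by.By.NAME
    user_field = driver.find_element("name", "username")
    pass_field = driver.find_element("name", "password")

    user_field.send_keys(username)
    pass_field.send_keys(password)

    # Calling submit() on an element inside a <form> submits that form
    pass_field.submit()
```

With a real browser session this would be called as log_in(driver, 'foo', 'bar') after driver.get() has loaded the login page.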

Blender answered Sep 20 '22 at 14:09


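You need to call the click() method of the element, not of the driver. The WebDriver object has no click() method, which is why you get the AttributeError; the element returned by find_element_by_xpath does.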

submitButton=driver.find_element_by_xpath("//input[@type='image'][@src='/images/buttons/loginnow.gif']")
submitButton.click()
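
Note that the find_element_by_* helpers were removed in later Selenium 4 releases. A minimal sketch of the same fix with the newer find_element(by, value) call is shown below; the helper name click_login_button is made up for illustration, and the XPath is the one from the question.

```python
# XPath copied from the question; it matches the image-type submit button
LOGIN_BUTTON_XPATH = "//input[@type='image'][@src='/images/buttons/loginnow.gif']"


def click_login_button(driver):
    """Find the login button and click the element itself, not the driver.

    `driver` is expected to be a Selenium WebDriver.
    """
    # "xpath" is the value of selenium.webdriver.common.by.By.XPATH;
    # with Selenium imported you would normally write By.XPATH here
    button = driver.find_element("xpath", LOGIN_BUTTON_XPATH)
    button.click()
    return button
```

The important part is unchanged: click() is called on the element returned by the lookup.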

Acorn answered Sep 20 '22 at 14:09


