 

I'm trying to get page source code using Selenium, but I get an empty page

Tags:

java

selenium

I'm trying to get a page's source code using Selenium, and the code is fairly standard. It works for Baidu.com and example.com, but for the URL I actually need I get an empty page: the source contains nothing but empty tags. Is there anything I'm missing?

I tried adding more parameters to the options, but that didn't seem to help.

WebDriver driver;

    System.setProperty("webdriver.chrome.driver", "E:\\applications\\ChromeDriver\\chromedriver_win32 (2)//chromedriver.exe");

    // Instantiate a WebDriver object; this launches the Chrome browser
    driver = new ChromeDriver();

    driver.manage().timeouts().implicitlyWait(2, TimeUnit.SECONDS);

    driver.get("http://rd.huangpuqu.sh.cn/website/html/shprd/shprd_tpxw/List/list_0.htm");
    String pageSource = driver.getPageSource();
    String title = driver.getTitle();
    System.out.println("==========="+title+"==============");
    System.out.println(Jsoup.parse(pageSource)); 

I expect to get the parsed page source of the URL so that I can extract the information I need, but I'm stuck here.

HneryInSH asked Oct 26 '25 10:10



1 Answer

I could reproduce the issue with this website when using ChromeDriver. What I found is that a script on the site detects that you are using ChromeDriver and blocks the request to the web page with an HTTP 400 error code.

Firefox, on the other hand, works as expected with the following code:

    FirefoxDriver driver = new FirefoxDriver();

    driver.get("http://rd.huangpuqu.sh.cn/website/html/shprd/shprd_tpxw/List/list_0.htm");
    Thread.sleep(5000);
    String pageSource = driver.getPageSource();
    String title = driver.getTitle();
    System.out.println("==========="+title+"==============");
    System.out.println(Jsoup.parse(pageSource));

    driver.quit();

I used a plain 5-second sleep, which worked. The best practice, though, is to wait for a specific element on your page. For reference, see: How to wait until an element is present in Selenium?
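To illustrate the idea of waiting for a condition rather than sleeping for a fixed time, here is a minimal, stdlib-only sketch of a polling wait. The `PollingWait` class and its `waitUntil` method are hypothetical names made up for this example and are not part of Selenium; the "page" in `main` is simulated by a counter so the snippet runs without a browser.

```java
import java.time.Duration;
import java.util.function.Supplier;

// Hypothetical helper (not a Selenium class): polls a probe until it returns
// a non-null value, or fails once the timeout has elapsed.
final class PollingWait {

    static <T> T waitUntil(Supplier<T> probe, Duration timeout, Duration interval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            T value = probe.get();
            if (value != null) {
                return value; // condition met, stop waiting immediately
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException("Condition not met within " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a page element that only "appears" on the third check.
        int[] checks = {0};
        String result = waitUntil(
                () -> ++checks[0] >= 3 ? "ready" : null,
                Duration.ofSeconds(2),
                Duration.ofMillis(50));
        System.out.println(result + " after " + checks[0] + " checks");
    }
}
```

With a real driver, the probe could be something like `() -> driver.findElements(By.tagName("li")).isEmpty() ? null : driver.getPageSource()`, where the `li` locator is only a placeholder for whatever element marks your page as loaded. Selenium's own `WebDriverWait` combined with `ExpectedConditions.presenceOfElementLocated` covers the same need without a custom helper.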

Firefox browser version: 67.0.1, geckodriver version: 0.24.0, Selenium version: 3.141.59

Adi Ohana answered Oct 29 '25 01:10



