
Android - Webview HTML code extraction doesn't work (Javascript)

I'm coding an app which:

  • loads a URL in a WebView;
  • extracts the HTML through JavaScript code;
  • shows the extracted HTML code in the log.

As I need to load the page without JavaScript enabled (to avoid some behaviors of the page), I tried the code below, where:

  • I load the page in the WebView with JavaScript disabled;
  • when the page is loaded, I enable JavaScript;
  • then the app executes the JavaScript required to extract the HTML code.

Unfortunately, when the code is executed in debug mode on Android 4.0.4, it gives an error:

01-22 22:37:56.575: E/Web Console(7605): Uncaught TypeError: Cannot call method 'processHTML' of undefined at null:1

If I remove the myBrowserSettings.setJavaScriptEnabled(false); call that follows the loadUrl call, everything works correctly.

What can I do to make the code below work?

package com.stefano.formfiller;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.os.Handler;
import android.util.Log;
import android.view.View;
import android.webkit.CookieManager;
import android.webkit.CookieSyncManager;
import android.webkit.WebChromeClient;
import android.webkit.WebSettings;
import android.webkit.WebView;
import android.webkit.WebViewClient;
import android.webkit.WebSettings.PluginState;

public class MainActivity extends Activity {

    WebView myBrowser;
    String urlToBrowse = "http://www.mywebsite.com";
    String htmlCode = null;
    StringBuffer buffer = new StringBuffer();

    @Override
    protected void onCreate(Bundle savedInstanceState) 
    {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);

          myBrowser = (WebView)findViewById(R.id.webView1);

          //Browser settings
          WebSettings myBrowserSettings = myBrowser.getSettings();

          //Prevent cache to be used
          myBrowserSettings.setCacheMode(WebSettings.LOAD_NO_CACHE);
          myBrowserSettings.setAppCacheEnabled(false);

          //General settings
          myBrowserSettings.setJavaScriptEnabled(true);
          Log.d("Stefano", "JS enabled");

          //FIREFOX user agent
          myBrowserSettings.setUserAgentString("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0");


          myBrowser.setWebChromeClient(new WebChromeClient());
          myBrowser.setWebViewClient(new WebViewClient() {
              public void onPageFinished(WebView view, String url) 
              { 

                    WebSettings myBrowserSettings = myBrowser.getSettings();
                    myBrowserSettings.setJavaScriptEnabled(true);
                    Log.d("Stefano", "JS enabled");

                    Log.d("Stefano", "OnPageFinished running"); 

              } });


          //Start the delayed HTML code extraction
          delayedStartHtmlExtractor(16000);
          Log.d("Stefano", "DelayedStart HTML Extractor launched");

          //Prepare Javascript to extract the HTML code from the webview
          myBrowser.addJavascriptInterface(new LoadListener(), "HTMLOUT");

          myBrowser.loadUrl(urlToBrowse);
          Log.d("Stefano", "Main URL requested");

          myBrowserSettings.setJavaScriptEnabled(false);
          Log.d("Stefano", "JS disabled");
    }   


    //Delayed HTML extraction
    public void delayedStartHtmlExtractor(final int delay){
        Handler handler = new Handler();

        handler.postDelayed(new Runnable() 
        {
            @Override
            public void run() 
            {
                myBrowser.loadUrl("javascript:window.HTMLOUT.processHTML('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>');");
                Log.d("Stefano", "HTML extraction launched");
            }
        }, delay);
    }

    //Insert the HTML code in the log information

    class LoadListener{
        public void processHTML(String html)
        {
            Log.d("Stefano", "HTML Extraction in progress...");


            Log.e("HTML CODE",html);
        }
    }
}

Update: I have a doubt: the code instantiates the JavaScript interface while JavaScript is enabled (through myBrowser.addJavascriptInterface(new LoadListener(), "HTMLOUT");); then I disable JavaScript after the URL call and re-enable it when the page is fully loaded.

Could it be that by disabling JavaScript after the interface has been instantiated, I "cut off the communication channel" between the JavaScript and the Java code?

Stefano asked Mar 23 '26 00:03



2 Answers

Add myBrowser.loadData(...) after setting the interface, like this:

myBrowser.addJavascriptInterface(new LoadListener(), "HTMLOUT");
myBrowser.loadData("", "text/html", null);
myBrowser.loadUrl(urlToBrowse);

Also, since you disable JS at the end of the onCreate method, there is no need to enable it at the start :)

Hope this helps.

medhdj answered Mar 25 '26 14:03



First of all, you should attach the proper annotation @JavascriptInterface to the methods that will be called through the Javascript interface; in your case:

    // Requires: import android.webkit.JavascriptInterface; (annotation available from API level 17)
    @JavascriptInterface
    public void processHTML(String html) {
        Log.d("Stefano", "HTML Extraction in progress...");
        Log.e("HTML CODE",html);
    }
    //..

"Note that injected objects will not appear in JavaScript until the page is loaded"

I suppose that loading a page with setJavaScriptEnabled(false) will not inject any JavaScript object at all, and this is why you are experiencing this problem.
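If that guess is right, window.HTMLOUT is simply undefined when the extraction script runs, which matches the "Cannot call method 'processHTML' of undefined" error. One way to confirm this without a TypeError is to have the script check for the bridge before calling it. The following is only a sketch: the class ExtractionScript and its build method are hypothetical names, not part of the question's code or of the Android API; it just assembles the string passed to loadUrl.

```java
/** Hypothetical helper: builds a guarded "javascript:" URL string for WebView.loadUrl(). */
final class ExtractionScript {

    private ExtractionScript() {}

    /**
     * @param bridgeName name given to addJavascriptInterface, e.g. "HTMLOUT"
     * @param methodName Java method exposed on the bridge, e.g. "processHTML"
     */
    static String build(String bridgeName, String methodName) {
        String bridge = "window." + bridgeName;
        return "javascript:(function(){"
                // Call into Java only when the bridge object was actually injected.
                + "if(" + bridge + "&&" + bridge + "." + methodName + "){"
                + bridge + "." + methodName
                + "('<html>'+document.getElementsByTagName('html')[0].innerHTML+'</html>');"
                + "}else{"
                // Otherwise report the missing bridge instead of throwing a TypeError.
                + "console.log('" + bridgeName + " bridge is not available on this page');"
                + "}"
                + "})();";
    }
}
```

In the question's code this would be used as myBrowser.loadUrl(ExtractionScript.build("HTMLOUT", "processHTML")); if the console message shows up, the object was not injected for the currently loaded page.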

A possible workaround (untested) could be this:

  • always load the page using setJavaScriptEnabled(true)
  • load the webpage passing through http://www.google.com/gwt/n (will load the page without JS or Flash)
  • do your processing
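As a rough sketch of the second step, the address handed to loadUrl could be built like this. Big assumption: the answer does not say how the target page is passed to http://www.google.com/gwt/n; the u query parameter used here is a guess and should be verified, as should whether the service is still available. TranscoderUrl and wrap are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper: wraps a page URL so it is requested through the transcoder. */
final class TranscoderUrl {

    // Base address taken from the answer above.
    static final String TRANSCODER_BASE = "http://www.google.com/gwt/n";

    private TranscoderUrl() {}

    static String wrap(String targetUrl) {
        try {
            // ASSUMPTION: the target page goes in a "u" query parameter (not stated in the answer).
            return TRANSCODER_BASE + "?u=" + URLEncoder.encode(targetUrl, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform, so this should not happen.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }
}
```

With the question's code, the call would become myBrowser.loadUrl(TranscoderUrl.wrap(urlToBrowse)); with JavaScript left enabled the whole time.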
bonnyz answered Mar 25 '26 12:03



