Parse HTML in Android

Tags: java, html, android

I am trying to parse HTML for specific data, but I am having trouble with return characters (at least, I think that is the problem). Since I know beforehand what I am looking for, I am using a simple substring approach to take the HTML apart.

Here is my parse method:

public static void parse(String response, String[] hashItem, String[][] startEnd) throws Exception
{
    for (int i = 0; i < hashItem.length; i++)
    {
        // Skip past the start marker, then cut at the first end marker after it.
        // Note: indexOf returns -1 when a marker is not found in the string.
        String part = response.substring(response.indexOf(startEnd[i][0]) + startEnd[i][0].length());
        String value = part.substring(0, part.indexOf(startEnd[i][1]));
        DATABASE.setHash(hashItem[i], value);
    }
}

Here is a sample of the HTML that is giving me issues:

<table cellspacing=0 cellpadding=2 class=smallfont>
<tr onclick="lu();" onmouseover="style.cursor='hand'">
<td class=bodybox nowrap>&nbsp;     21,773,177,147 $&nbsp;</td><td></td>
<td class=bodybox nowrap>&nbsp;        629,991,926 F&nbsp;</td><td></td>
<td class=bodybox nowrap>&nbsp;             24,537 P&nbsp;</td><td></td>
<td class=bodybox nowrap>&nbsp;                  0 T&nbsp;</td>
<td></td><td class=bodybox nowrap>&nbsp;RT&nbsp;</td>

There are hidden return characters in the HTML, but when I add them to the strings I am searching for, the match works poorly, if at all. Is there a method, or perhaps a better way, to strip hidden characters from the HTML so that it is easier to parse? Any help is greatly appreciated, as always.

Alejandro Huerta asked Dec 07 '25 09:12


1 Answer

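If you want to make parsing very easy, try Jsoup.

This example downloads a page, parses it, and reads the text of each matching cell. If you already have the HTML in a String, as in your parse method, Jsoup.parse(response) gives you the same kind of Document without the download.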

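import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

Document doc = Jsoup.connect("http://jsoup.org").get();

// Select every <td> element that has the class "bodybox"
Elements tds = doc.select("td.bodybox");

for (Element td : tds) {
  // text() returns the cell's text with the tags removed
  String tdText = td.text();
}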
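
If adding a library is not an option, something similar can be done with plain java.util.regex. This is only a rough sketch, and it assumes the cells always look like the sample in the question (a td whose first attribute is class=bodybox). The class name BodyboxCells and the method extract are made up for illustration. It sidesteps the hidden return characters by collapsing every run of whitespace, line breaks included, into a single space instead of trying to match them exactly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class BodyboxCells {
    // Matches <td class=bodybox ...>content</td>, with or without quotes
    // around the class value. DOTALL lets the content span line breaks.
    private static final Pattern CELL = Pattern.compile(
            "<td\\s+class=\"?bodybox\"?[^>]*>(.*?)</td>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns the cleaned text of every matching cell, in document order.
    static List<String> extract(String html) {
        List<String> values = new ArrayList<>();
        Matcher m = CELL.matcher(html);
        while (m.find()) {
            String text = m.group(1)
                    .replace("&nbsp;", " ")    // HTML entity for a non-breaking space
                    .replace('\u00A0', ' ')    // literal non-breaking space character
                    .replaceAll("\\s+", " ")   // collapse \r, \n, tabs and space runs
                    .trim();
            values.add(text);
        }
        return values;
    }
}
```

Calling BodyboxCells.extract(response) on the HTML from the question should give one string per bodybox cell, which you can then pass to your own DATABASE.setHash. A regex like this is brittle against markup changes, so a real parser is still the safer choice if the page layout may change.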
droidgren answered Dec 08 '25 22:12


