
Why did Jsoup move an HTML element when it was parsed?

Tags:

java

html

jsoup

Here is my sample input HTML:

<html>
<head>
<object></object>
</head>
<body>
</body>
</html>

Below is the output when parsed using Jsoup:

<html>
 <head> 
 </head>
 <body>
  <object></object>    
 </body>
</html>

Question: Why did Jsoup move the <object> tag from the <head> to the <body>?

Rolando Olivo asked Dec 11 '25 00:12


1 Answer

This is correct behaviour since <object> must appear inside the body.

HTML <object> Tag

[...]

Tips and Notes

Note: An <object> element must appear inside the <body> element. The text between the <object> and </object> is an alternate text, for browsers that do not support this tag.

http://www.w3schools.com/tags/tag_object.asp


If you want the object within the head, you can use the XmlParser instead:

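    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.parser.Parser;

    final String html = "<html>\n"
            + "<head>\n"
            + "<object></object>\n"
            + "</head>\n"
            + "<body>\n"
            + "</body>\n"
            + "</html>";

    // Passing Parser.xmlParser() as the third argument keeps the tree as written
    // instead of applying HTML element placement rules.
    Document doc = Jsoup.parse(html, "", Parser.xmlParser());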
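
To see the difference between the two parsers side by side, here is a minimal sketch that parses the same markup with both and reports the parent of the <object> element. The class name ObjectPlacement and the helper parentOfObject are made up for illustration; only Jsoup.parse, Parser.xmlParser, select, first, parent and tagName come from jsoup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

public class ObjectPlacement {
    // Same structure as the question's input, written on one line.
    static final String HTML =
            "<html><head><object></object></head><body></body></html>";

    // Hypothetical helper: tag name of the parent of the first <object> element.
    static String parentOfObject(Document doc) {
        return doc.select("object").first().parent().tagName();
    }

    public static void main(String[] args) {
        // Default HTML parser: applies HTML placement rules.
        Document asHtml = Jsoup.parse(HTML);
        // XML parser: keeps elements where they were written.
        Document asXml = Jsoup.parse(HTML, "", Parser.xmlParser());

        System.out.println("HTML parser parent: " + parentOfObject(asHtml));
        System.out.println("XML parser parent:  " + parentOfObject(asXml));
    }
}
```

Given the output shown in the question, the HTML parser should report body as the parent, while the XML parser should report head. Keep in mind that the XML parser does no HTML-specific normalisation at all, so it is only a good fit when you need the tree exactly as written.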
ollo answered Dec 12 '25 14:12


