
Can the body tag exist inside of a jQuery object?

Tags:

jquery

I'm trying to create a script that copies user-submitted HTML into a jQuery object, manipulates it, and then gives it back to the user as plain text. I have a <textarea> that the user pastes their HTML into and then submits. At that point I grab the value of the <textarea> and create a jQuery object from it so that I can use jQuery to modify it.

However, I've only just recently noticed that things like the <!doctype html>, <html> tag and <body> tag don't seem to be in the object. Can these things not exist in a jQuery object? I tested this by putting a <body> tag into a jQuery object and then using .find(). I didn't get any results.

Additionally, when I use this code from "How do you convert a jQuery object into a string?":

$('<div>').append($('#item-of-interest').clone()).html();

The <body> tag is missing, although I'm not sure whether that's just because of the method I'm using to output a string from a jQuery object.

jkupczak asked Dec 21 '25 02:12



2 Answers

If you follow the jQuery source, the jQuery object constructor calls jQuery.parseHTML() once it determines that you've passed in an HTML string. If the HTML is not just a single tag, parseHTML() then calls buildFragment() on the same string, and if you follow into that you will find that it discards the <body> tag. I don't know why it does that, but that's the way it is coded to behave.

So, there's this type of code flow:

jQuery object constructor
    determine if argument is an HTML string
    call jQuery.parseHTML() on the HTML string
       if string is not a single tag by itself, 
           then call jQuery.buildFragment() on the string
           jQuery.buildFragment() seems to ignore the outer tag container

I have not been able to figure out why buildFragment() ignores the outer <body>other content here</body>, but it does.

On further study of buildFragment(), it correctly parses the outer tag as <body>, but as long as that tag isn't a tag type that needs some special treatment (such as the kinds of things that can only exist inside of tables), it completely ignores what type that outer tag was and forces it to be a <div>. That outer container is then ignored later, when the content is retrieved from the jQuery object. Again, I'm not sure why it does that, but that is what it does.


As for your particular problem, I think the conclusion is that you can't use jQuery's constructor to handle an entire HTML document. It just isn't built to do that.

You could search the HTML document that was given to you, extract just the part between <body> and </body>, and give that to the jQuery object constructor. After doing your manipulations on it, put the manipulated HTML back into the original document between the original <body> and </body> tags. That preserves everything you didn't want to manipulate while still letting you use jQuery on the content inside the <body> tag.
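A rough sketch of that extract-and-reinsert idea, using only string operations. The function name transformBody and its callback parameter are made up for illustration, and the regular expression is a simplification: it assumes a single <body> element and no ">" character inside the <body> tag's attribute values.

```javascript
// Hypothetical helper (not part of jQuery): applies `transform` only to the
// text between <body ...> and the last </body>, leaving the rest untouched.
function transformBody(docText, transform) {
  // Group 1: everything up to and including the opening <body ...> tag.
  // Group 2: the body's inner HTML.
  // Group 3: the last </body> and everything after it.
  var match = /^([\s\S]*?<body\b[^>]*>)([\s\S]*)(<\/body\s*>[\s\S]*)$/i.exec(docText);
  if (!match) {
    // No <body> wrapper found: treat the whole input as body content.
    return transform(docText);
  }
  return match[1] + transform(match[2]) + match[3];
}

// Usage with a plain string transform standing in for the jQuery step.
var result = transformBody(
  "<!doctype html><html><head></head><body class='x'><p>test</p></body></html>",
  function (inner) { return inner.replace("test", "changed"); }
);
console.log(result);
```

In a browser, the callback is where the jQuery work would go: build a jQuery object from the inner HTML, modify it, and serialize it again with the $('<div>').append(...).html() idiom from the question.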

You should probably also be wary of <script> elements in the <body> tag as they probably aren't preserved perfectly either.

jfriend00 answered Dec 22 '25 21:12



Since this is going to be used in an internal application, the function below might be of interest, even though it doesn't use jQuery (you can always call jQuery on the element that is returned).

It takes a string, puts it inside an html element, and lets the browser handle the tag soup. It returns an html element that always has a head and a body.

This isn't perfect, but it does a lot of the work. In the little testing I have done, it gives the same result in Chrome 36, Firefox 31, Opera 21 and Internet Explorer 11.

It strips the doctype and the html tag. If you have attributes on the html tag, they will be lost. But you get an html element that always has a head and a body, even if the input doesn't. When I tested, the script tags were not executed. I haven't tried audio/video tags, svg, etc.

With a little bit of extra code you should be able to get the attributes on the html-element, and put the doctype in a string.

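function mkDom(text) {
  // Create a detached <html> element; it is never attached to the page.
  var html = document.createElement('html');
  // Let the browser's parser sort out the tag soup and add head/body as needed.
  html.innerHTML = text;
  return html;
}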

Test with complete document:

console.log(mkDom("<!doctype html><html lang='en'><head><title>Test</title><script src='test.js'></script></head><body><p>test</p><script>alert(1);</script></body></html>").outerHTML);

<html><head><title>Test</title><script src="test.js"></script></head><body><p>test</p><script>alert(1);</script></body></html>

Test with head and body:

console.log(mkDom("<head><title>Test</title><script src='test.js'></script></head><body><p>test</p><script>alert(1);</script></body>").outerHTML);

<html><head><title>Test</title><script src="test.js"></script></head><body><p>test</p><script>alert(1);</script></body></html>

Test with body:

console.log(mkDom("<body><p>test</p><script>alert(1);</script></body>").outerHTML);

<html><head></head><body><p>test</p><script>alert(1);</script></body></html>

Test with partial body:

console.log(mkDom("<p>test</p><script>alert(1);</script>").outerHTML);

<html><head></head><body><p>test</p><script>alert(1);</script></body></html>
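The "little bit of extra code" for keeping the doctype and the attributes on the html tag might look something like the sketch below. The helper names readDocumentShell and wrapDocument are invented for this example, and the regular expressions are deliberately simple: they assume the doctype comes first in the input and that no attribute value contains a ">" character.

```javascript
// Hypothetical helpers; the names are illustrative only.

// Pull the doctype and the raw attribute text of the <html> tag out of the
// input string, before the browser's parser discards them.
function readDocumentShell(text) {
  var doctypeMatch = /^\s*(<!doctype\b[^>]*>)/i.exec(text);
  var htmlMatch = /<html\b([^>]*)>/i.exec(text);
  return {
    doctype: doctypeMatch ? doctypeMatch[1] : "",
    htmlAttributes: htmlMatch ? htmlMatch[1].trim() : ""
  };
}

// Put the saved doctype and attributes back around some inner HTML.
function wrapDocument(shell, innerHtml) {
  var openTag = shell.htmlAttributes
    ? "<html " + shell.htmlAttributes + ">"
    : "<html>";
  return shell.doctype + openTag + innerHtml + "</html>";
}

var shell = readDocumentShell("<!doctype html><html lang='en'><head></head><body></body></html>");
console.log(shell.doctype);        // <!doctype html>
console.log(shell.htmlAttributes); // lang='en'
```

In a browser, this could be combined with the function above as wrapDocument(readDocumentShell(text), mkDom(text).innerHTML), so the output keeps the original doctype and html attributes.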
some answered Dec 22 '25 21:12



