I have the following test.php file, and when I run it, the closing </h1> tag gets removed.
<?php
$doc = new DOMDocument();
$doc->loadHTML('<html>
<head>
<script>
console.log("<h1>hello</h1>");
</script>
</head>
<body>
</body>
</html>');
echo $doc->saveHTML();
Here is the result when I execute the file:
PHP Warning: DOMDocument::loadHTML(): Unexpected end tag : h1 in Entity, line: 4 in /home/ryan/NetBeansProjects/blog/test.php on line 14
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<script>
console.log("<h1>hello");
</script>
</head>
<body>
</body>
</html>
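So why is it removing the tag? It's inside a JavaScript string, so shouldn't the parser ignore it?
DOMDocument::loadHTML() relies on libxml2's HTML 4 parser, which does not treat the contents of a script element as opaque text the way HTML5 browsers do, so the </h1> inside the string is reported as an unexpected end tag and dropped. The only solution that comes to mind is to match the script tags with preg_match_all(), replace each one with a temporary placeholder such as <script id="myuniqueid"></script>, and, once the DOM manipulation is finished, swap the original scripts back in, like this: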
// The dom doc
$doc = new DOMDocument();
// The html
$html = '<html>
<head>
<script>
console.log("<h1>hello</h1>");
</script>
</head>
<body>
</body>
</html>';
// Pattern matching each <script> element with its contents (s: dot matches newlines, i: case-insensitive)
$pattern = '/<script\b[^>]*>.*?<\/script>/is';
// Get all scripts
preg_match_all($pattern, $html, $matches);
// Only unique scripts
$matches = array_unique( $matches[0] );
// Construct the arrays for replacement (initialized so str_replace() also works when no script is found)
$simple = [];
$complete = [];
foreach ( $matches as $match ) {
// The simple script
$id = uniqid('script_');
$uniqueScript = "<script id=\"$id\"></script>";
$simple[] = $uniqueScript;
// The complete script
$complete[] = $match;
}
// Replace the scripts with the simple scripts
$html = str_replace($complete, $simple, $html);
// load the html into the dom
$doc->loadHTML( $html);
// Do the dom management here
// TODO: Whatever you do with the dom
// When finished
// Get the html back
$html = $doc->saveHTML();
// Replace the scripts back
$html = str_replace($simple, $complete, $html);
//Print the result
echo $html;
This solution prints the HTML with the scripts intact and without any DOM warnings.
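The placeholder approach above can also be wrapped in two small helpers so it is reusable around any DOM work. This is only a sketch: the function names protectScripts() and restoreScripts() and the "script_placeholder_" id prefix are made up for illustration and are not part of DOMDocument. It assumes that saveHTML() writes the empty placeholder element back out exactly as it was loaded, which is the same assumption the code above makes.

```php
<?php
// Hypothetical helper: swaps every <script> element for an empty placeholder
// and records the original markup in $store (placeholder => original).
function protectScripts(string $html, array &$store): string
{
    $pattern = '/<script\b[^>]*>.*?<\/script>/is';
    return preg_replace_callback(
        $pattern,
        function (array $m) use (&$store): string {
            // A counter-based id keeps every placeholder distinct.
            $id = 'script_placeholder_' . count($store);
            $placeholder = '<script id="' . $id . '"></script>';
            $store[$placeholder] = $m[0];
            return $placeholder;
        },
        $html
    );
}

// Hypothetical helper: puts the original scripts back in place of the placeholders.
function restoreScripts(string $html, array $store): string
{
    return strtr($html, $store);
}

$html = '<html>
<head>
<script>
console.log("<h1>hello</h1>");
</script>
</head>
<body>
</body>
</html>';

$store = [];
$doc = new DOMDocument();
$doc->loadHTML(protectScripts($html, $store));

// Do the DOM manipulation here, leaving the placeholder elements untouched.

$result = restoreScripts($doc->saveHTML(), $store);
echo $result;
```

Using strtr() with an array replaces all placeholders in one pass, so a restored script is never rescanned for further placeholders. If the DOM work removes or rewrites a placeholder element, the corresponding script will simply not be restored.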