I have some HTML content whose text is entirely Persian. I want to pass it to DOMDocument with DOMDocument::loadHTML($html), do some processing, and then get it back with DOMDocument::saveHTML(). But the characters are displayed incorrectly: for example, "سلام" becomes "Ø³ÙØ§Ù". I even changed my script file's encoding to UTF-8, but that did not help.
<?php
$html = "<html><meta charset='utf-8' /> سلام</html>";
$doc = new DOMDocument('1.0', 'utf-8');
$doc->loadHTML($html);
print $html; // output : سلام
print $doc->saveHTML(); // output : سلام
print $doc->saveHTML($doc->documentElement); // output : Ø³ÙØ§Ù
?>
UPDATE: Following a suggestion from others, I used $doc->loadHTML(mb_convert_encoding($html, 'HTML-ENTITIES', 'UTF-8')); and it worked!
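The workaround in the update turns every non-ASCII character into an entity before parsing, so the parser never has to guess the input encoding. The 'HTML-ENTITIES' target of mb_convert_encoding() is deprecated as of PHP 8.2, so here is a minimal sketch of the same idea using mb_encode_numericentity() instead. The helper name loadUtf8Html is hypothetical, and the sketch assumes the mbstring and dom extensions are enabled and that the input really is UTF-8.

```php
<?php
// Hypothetical helper: parse a UTF-8 HTML string without mangling non-ASCII text.
function loadUtf8Html(string $html): DOMDocument
{
    // Conversion map [start, end, offset, mask]: rewrite every code point
    // from U+0080 to U+10FFFF as a numeric entity such as &#1587;.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    $ascii = mb_encode_numericentity($html, $map, 'UTF-8');

    $doc = new DOMDocument();
    $doc->loadHTML($ascii); // the parser now only sees ASCII bytes
    return $doc;
}

$doc = loadUtf8Html('<html><body><p>سلام</p></body></html>');
$p = $doc->getElementsByTagName('p')->item(0);

// DOM strings are always exposed as UTF-8, so this is the original text.
print $p->textContent;
```

Because the entities are decoded during parsing, the DOM holds the real characters; only the bytes handed to loadHTML() are ASCII.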
Tell the XML parser that the data being read is UTF-8 encoded:
<?php
// original input (unknown encoding)
$html = '<html>سلام</html>';
$doc = new DOMDocument();
// specify the input encoding
$doc->loadHTML('<?xml encoding="utf-8"?>' . $html);
// specify the output encoding
$doc->encoding = 'utf-8';
// output: <html><body><p>سلام</p></body></html>
print $doc->saveHTML($doc->documentElement);
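One side effect of the prefix trick is that the injected `<?xml encoding="utf-8"?>` declaration can remain in the document as a processing-instruction node and reappear when the whole document is serialized with saveHTML(). The following is a rough sketch of removing it after loading. It assumes the node ends up as a direct child of the document, which is how commonly deployed libxml versions behave but may differ with other versions.

```php
<?php
$html = '<html><body><p>سلام</p></body></html>';

$doc = new DOMDocument();
// Same encoding hint as above, prepended to the markup.
$doc->loadHTML('<?xml encoding="utf-8"?>' . $html);

// childNodes is a live list, so copy it before removing anything.
foreach (iterator_to_array($doc->childNodes) as $node) {
    if ($node->nodeType === XML_PI_NODE) {
        $doc->removeChild($node); // drop the injected declaration
    }
}

// Ask for UTF-8 output when serializing.
$doc->encoding = 'utf-8';

$out = $doc->saveHTML();
print $out;
```

If only a fragment is needed, serializing a single node with saveHTML($doc->documentElement), as in the answer, avoids the declaration without any cleanup.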