When displaying data from a MySQL database, I pass it through the htmlspecialchars() wrapper below, which should convert single and double quotes to their HTML entities. The problem is that when I view the page source, only <, > and & are converted; the single and double quotes are not.
//sanitize data from db before displaying on webpage
function htmlsan($htmlsanitize){
    return $htmlsanitize = htmlspecialchars($htmlsanitize, ENT_QUOTES, 'UTF-8');
}
Then, when I want to use it, I do for example:
htmlsan($row['comment']);
Can someone tell me why it's not converting single and double quotes?
UPDATE
What's strange is that htmlsan() is also used on the comment in an email, and when I view the source of the email the quotes are converted. It only seems to fail to convert the single/double quotes from the database when displaying them on the webpage. My database collation is set to utf8_general_ci, and I declare utf8 on the database connection etc.
The htmlspecialchars() function converts special characters to HTML entities: & (ampersand) becomes &amp;, " (double quote) becomes &quot;, ' (single quote) becomes &#039; when the ENT_QUOTES flag is set, < (less than) becomes &lt;, and > (greater than) becomes &gt;.
You use htmlspecialchars EVERY time you output content within HTML, so it is interpreted as content and not HTML. If you allow content to be treated as HTML, you have just opened the door to bugs at a minimum, and total XSS hacks at worst.
Use the PHP htmlspecialchars() function to convert special characters to HTML entities. Always escape a string before displaying it on a webpage using the htmlspecialchars() function to prevent XSS attacks.
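As a rough sketch of what escaping on output can look like: the helper name escape_html() and the sample $row array below are made up for illustration (the array stands in for a record fetched from the database), and adding ENT_SUBSTITUTE is my own assumption rather than something the question uses.

```php
<?php
// Hypothetical helper, modelled on the htmlsan() function from the question.
// ENT_SUBSTITUTE (an assumption, not in the question) replaces invalid UTF-8
// sequences instead of making the function return an empty string.
function escape_html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Hypothetical row standing in for a record fetched from the database.
$row = ['comment' => 'Tom\'s "best" <b>tip</b> & more'];

$escaped = escape_html($row['comment']);
// $escaped is: Tom&#039;s &quot;best&quot; &lt;b&gt;tip&lt;/b&gt; &amp; more

// The return value has to be echoed; calling the helper alone prints nothing.
// With ENT_QUOTES the result is safe both as element content and inside a
// quoted attribute value.
echo '<p title="' . $escaped . '">' . $escaped . '</p>', PHP_EOL;
```

Note that the escaping happens at the point of output, not when the data is stored.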
Difference between htmlentities() and htmlspecialchars(): htmlspecialchars() converts only the special characters listed above (&, ", ', < and >) to HTML entities, whereas htmlentities() converts every character that has a named HTML entity.
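To make the difference concrete, here is a small comparison on the same input. The sample string is invented for illustration; the non-ASCII characters are written as \u{...} escapes (PHP 7+) so the example does not depend on the file's encoding.

```php
<?php
// Sample text (made up): "café", curly double quotes, markup characters,
// and straight single quotes.
$text = "caf\u{00E9} \u{201C}quoted\u{201D} <b> & 'single'";

// Converts only &, ", ', < and >; other characters pass through unchanged.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

// Additionally converts every character that has a named entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');

echo $special, PHP_EOL;
echo $entities, PHP_EOL;
// Second line printed: caf&eacute; &ldquo;quoted&rdquo; &lt;b&gt; &amp; &#039;single&#039;
```

On a page that is served as UTF-8, htmlspecialchars() is normally sufficient, since the remaining characters display fine without being turned into entities.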
How exactly are you testing it?
<?php
//sanitize data from db before displaying on webpage
function htmlsan($htmlsanitize){
    return $htmlsanitize = htmlspecialchars($htmlsanitize, ENT_QUOTES, 'UTF-8');
}
var_dump(htmlsan('<>\'"'));
... prints:
string(20) "&lt;&gt;&#039;&quot;"
My guess is that your input string comes from Microsoft Word and contains typographical quotes:
var_dump(htmlsan('“foo”')); // string(9) "“foo”" 
If you do need to convert them for whatever reason, you need htmlentities() rather than htmlspecialchars():
var_dump(htmlentities('“foo”', ENT_QUOTES, 'UTF-8')); // string(17) "&ldquo;foo&rdquo;"
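If you want to check whether a stored comment really contains typographical quotes, something along these lines might help. The function name has_typographic_quotes() is hypothetical, and the sketch assumes the data is valid UTF-8:

```php
<?php
// Hypothetical helper: reports whether a string contains curly quotes
// (U+2018, U+2019, U+201C or U+201D). Assumes valid UTF-8 input.
function has_typographic_quotes(string $text): bool
{
    return preg_match('/[\x{2018}\x{2019}\x{201C}\x{201D}]/u', $text) === 1;
}

$straight = "It's \"fine\"";
$curly = "It\u{2019}s \u{201C}fine\u{201D}";

var_dump(has_typographic_quotes($straight)); // bool(false)
var_dump(has_typographic_quotes($curly));    // bool(true)

// A left curly double quote is three bytes in UTF-8, not the single byte 0x22.
var_dump(bin2hex("\u{201C}"));               // string(6) "e2809c"
```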
Alright, it's time for some proper testing. Type a single quote (') in your comment database field and run the following code when you retrieve it:
var_dump(bin2hex("'"));
var_dump(htmlspecialchars("'", ENT_QUOTES, 'UTF-8'));
var_dump(bin2hex($row['comment']));
var_dump(htmlspecialchars($row['comment'], ENT_QUOTES, 'UTF-8'));
It should print this:
string(2) "27"
string(6) "&#039;"
string(2) "27"
string(6) "&#039;"
Please update your question and confirm whether you ran this test and got the same or a different output.
Please look carefully at the output you claim to be obtaining:
string(6) "'"
That's not a string with 6 characters. You are not looking at the real output: you are looking at the output as rendered by a browser. I'm pretty sure you are getting the expected result, i.e. string(6) "&#039;". If you render &#039; with a web browser it becomes '. Use the View Source menu in your browser to see the real output.
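If switching to View Source every time is inconvenient, one option is to escape the debug output itself so the browser shows the entities literally. This is only a sketch; dump_for_browser() is a made-up helper name, and it assumes var_dump() has not been replaced by an extension such as Xdebug.

```php
<?php
// Hypothetical debugging helper: captures var_dump() output and escapes it,
// so a browser displays the entities as text instead of rendering them.
function dump_for_browser($value): string
{
    ob_start();
    var_dump($value);
    $dump = ob_get_clean();

    return '<pre>' . htmlspecialchars($dump, ENT_QUOTES, 'UTF-8') . '</pre>';
}

$escaped = htmlspecialchars("'", ENT_QUOTES, 'UTF-8');

// In the browser this is displayed as: string(6) "&#039;"
echo dump_for_browser($escaped), PHP_EOL;
```

Sending header('Content-Type: text/plain; charset=utf-8') before any output is another way to stop the browser from interpreting the entities while debugging.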