I'm interested in microtypography issues on the web.
I want a tool to fix:
All those fixes depend on the content language. In French, for example, we must add a non-breaking (insécable) space before every two-part punctuation mark (:, ;, …, ?, !, ...), and our quotes are « like this ».
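To make the French rules concrete, here is a minimal sketch in Python that works on plain text only (no HTML handling). The function name `french_punctuation` is hypothetical, and pairing straight double quotes into guillemets is an assumption about the desired quote rule, not something an existing library is known to do this way.

```python
import re

NBSP = "\u00a0"  # non-breaking space

def french_punctuation(text: str) -> str:
    """Apply two French spacing rules to plain text (illustrative only)."""
    # Rule 1: one NBSP before ; : ? ! (replacing any ordinary spaces).
    # The lookbehind skips marks that already follow whitespace (including
    # an existing NBSP) or another mark, so "?!" stays together.
    text = re.sub(r"(?<=[^\s;:?!]) *([;:?!])", NBSP + r"\1", text)
    # Rule 2 (assumption): a pair of straight double quotes becomes
    # guillemets with NBSPs on the inside.
    text = re.sub(r'"([^"]+)"', "«" + NBSP + r"\1" + NBSP + "»", text)
    return text

result = french_punctuation('Il a dit: "Bonjour!"')
```

This sketch is naive on purpose: it would also put a space inside `http://` or a time like `12:30`, which is one reason the rules need to be applied per language and with more context than a single regex.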
There are many constraints for such a tool:
pre, code...)
There already are some tools on the market:
They are all more or less based on SmartyPants, a 2005 library that is untested, undocumented, parses HTML by hand, and handles no rules other than the English ones. Hell no.
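As an alternative to parsing HTML by hand, here is a rough sketch that uses Python's standard `html.parser` to apply a fixer only to text nodes. It assumes the `pre, code` fragment in the constraints means those elements must be left untouched; the names `TextNodeFixer`, `fix_html` and the `SKIP` set (including `script` and `style`) are hypothetical choices for illustration.

```python
from html.parser import HTMLParser

# Assumption: elements whose content must never be modified.
SKIP = {"pre", "code", "script", "style"}

class TextNodeFixer(HTMLParser):
    """Re-emit the HTML, passing text outside SKIP elements through `fix`."""

    def __init__(self, fix):
        # Keep entities such as &amp; as-is instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.fix = fix
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
        self.out.append(self.get_starttag_text())  # original tag source

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in SKIP and self.skip_depth:
            self.skip_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data if self.skip_depth else self.fix(data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

def fix_html(html: str, fix) -> str:
    parser = TextNodeFixer(fix)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

# Usage: replace three dots with an ellipsis, but not inside <code>.
fixed = fix_html("<p>Wait...</p><code>a...b</code>",
                 lambda s: s.replace("...", "\u2026"))
```

One limitation to keep in mind: text is handed to `fix` in chunks split at tags and entities, so a rule that needs to see across an inline tag or an `&amp;` would need extra bookkeeping.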
So my questions are:
Edit July 2013: I have developed JoliTypo, based on the tests and expertise I gained while working on this issue. No existing lib was doing what I wanted to do.
My somewhat-friend Sean built something that I use for this purpose quite often. You can view the demo here: http://files.seancoates.com/lexentity/. He blogged about it here: http://seancoates.com/blogs/lexentity, and you can grab the source here: https://github.com/scoates/lexentity
It might not meet your full language needs, but it's a start with English.
You might be interested in tidy. It is bundled with PHP 5+ (all you need to use it is libtidy). It not only parses HTML but repairs it too.
But with the localization, you are on your own - intl does not have any data about quotes, for example; at least I could not find any.