An XML file I ran across is failing a well-formedness check, even though it looks well-formed to me (I might be wrong).
I have reduced it to a trivial example:
<?xml version="1.0" encoding="Cp1252"?>
<jnlp/>
The method being used to do the check works like this:
public static boolean isWellFormedXml(InputStream inputStream) {
    try {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        inputFactory.setProperty(XMLInputFactory.IS_COALESCING, false);
        inputFactory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        XMLStreamReader reader = inputFactory.createXMLStreamReader(inputStream);
        try {
            // Scan through all the reader tokens to ensure everything is well formed
            while (reader.hasNext()) {
                reader.next();
            }
        } finally {
            reader.close();
        }
    } catch (XMLStreamException e) {
        // Ignore the exception
        return false;
    }
    return true;
}
The error I'm seeing is:
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,40]
Message: Invalid encoding name "Cp1252".
The only problem is that I can set a breakpoint at the catch and confirm that this encoding name does resolve in Java. So what's the deal here? Does XML also restrict which encoding names you're allowed to use in the prolog?
Check the IANA character set registry:
http://www.iana.org/assignments/character-sets/character-sets.xml
I guess the encoding you're looking for is probably windows-1252. Cp1252 might be a valid charset name in Java, but it is not the name registered with IANA, so in XML you're not supposed to use it (by that name).
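To illustrate the point above, here is a minimal sketch that asks Java for the canonical name of Cp1252 and then runs the same trivial document through a StAX reader with each name in the declaration. The class name EncodingNameCheck and its helper methods are made up for this example, and it assumes the default StAX implementation bundled with the JDK; other parsers may be more lenient about Java-style names.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, not part of the original question.
public class EncodingNameCheck {

    // Returns the canonical name Java reports for a charset name or alias.
    static String canonicalName(String javaName) {
        return Charset.forName(javaName).name();
    }

    // Returns true if a trivial document declaring the given encoding
    // can be read to the end without an XMLStreamException.
    static boolean parses(String declaredEncoding) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + declaredEncoding + "\"?>\n<jnlp/>";
        // The document is pure ASCII, so its bytes are identical in windows-1252.
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.US_ASCII));
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            try {
                while (reader.hasNext()) {
                    reader.next();
                }
            } finally {
                reader.close();
            }
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Java accepts Cp1252 as an alias and reports the canonical name.
        System.out.println("Cp1252 resolves to: " + canonicalName("Cp1252"));
        // Compare how the parser treats the two spellings in the declaration.
        System.out.println("encoding=\"Cp1252\" parses: " + parses("Cp1252"));
        System.out.println("encoding=\"windows-1252\" parses: " + parses("windows-1252"));
    }
}
```

If the file has to stay readable by strict parsers, changing the declaration to the name Java reports as canonical is likely the simplest fix, since the bytes of the document do not need to change.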