
Stop Jsoup from encoding

Tags:

java

jsoup

I'm trying to parse a URL with jsoup. The page contains the following text: Ætterni. After parsing the document, the same string looks like this: &AElig;tterni.

How do I prevent this from happening? I want the document written out 1:1, exactly as it was.

Code:

// Fetch and parse the page
Document doc = Jsoup.connect(url).get();
// Write the document to a local file using its own output charset
String docEncoding = doc.outputSettings().charset().name();
try (OutputStreamWriter writer = new OutputStreamWriter(new FileOutputStream(localLink), docEncoding)) {
    writer.write(doc.html());
}
Markus asked Jan 27 '26 12:01



1 Answer

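Use doc.outputSettings().escapeMode(EscapeMode.xhtml); to stop jsoup from converting characters such as Æ into named entities. EscapeMode here is org.jsoup.nodes.Entities.EscapeMode. The xhtml mode limits escaping to the characters that must be escaped in markup, such as <, > and &. Set it before calling doc.html().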
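A minimal sketch of that setting in context, assuming jsoup is on the classpath. The class name EscapeModeSketch and the helper renderBody are made up for illustration, and the input is parsed from a string instead of fetched from a URL so the example runs offline. The output charset is set to UTF-8 explicitly, on the assumption that the file will be written in an encoding that can represent Æ.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Entities;

class EscapeModeSketch {

    // Hypothetical helper: parse an HTML string and return the body markup
    // with named-entity conversion restricted by EscapeMode.xhtml.
    static String renderBody(String html) {
        Document doc = Jsoup.parse(html);
        doc.outputSettings()
           .escapeMode(Entities.EscapeMode.xhtml) // only the minimal markup escapes
           .charset("UTF-8");                     // an encoding that can represent the text
        return doc.body().html();
    }

    public static void main(String[] args) {
        // \u00C6 is the letter Æ, written as an escape so the source file
        // encoding does not matter.
        System.out.println(renderBody("<p>\u00C6tterni</p>"));
    }
}
```

In the code from the question, the same two calls on doc.outputSettings() would go between Jsoup.connect(url).get() and writer.write(doc.html()).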

tonig answered Jan 30 '26 03:01
