
lxml Changing Unicode Characters

Tags: python, xml, lxml

I am using lxml to read through an XML file and change a few details. However, even when I just read the file with lxml and write it straight back out, as below:

from lxml import etree

fil = 'iTunes Music Library.XML'
tre = etree.parse(fil)
tre.write('temp.xml')  # no encoding argument is passed here

I find Queensrÿche converted to Queensr&#255;che. Anyone know how to fix this?

Nikwin asked Mar 14 '26 19:03


1 Answer

Change your last line to:

tre.write('temp.xml', encoding='utf-8')

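Otherwise lxml writes the XML in ASCII encoding, so it has to escape every non-ASCII character as a numeric character reference (ÿ becomes &#255;). The escaped file is still valid XML and parses back to the same text; it is only the bytes on disk that differ.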
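
To see the difference without touching a real library file, here is a minimal sketch that writes the same tree twice into in-memory buffers, once with the default encoding and once with encoding='utf-8'. The one-element document and the serialize helper are made up for illustration. The fallback import is only there so the snippet still runs where lxml is not installed, on the assumption that the standard library's ElementTree escapes non-ASCII text the same way in this case.

```python
import io

try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Fallback so the sketch runs without lxml. Assumption: the standard
    # library serializer escapes non-ASCII text the same way in this case.
    import xml.etree.ElementTree as etree


def serialize(tree, encoding=None):
    """Write the tree to an in-memory buffer and return the raw bytes."""
    buf = io.BytesIO()
    if encoding is None:
        tree.write(buf)  # no encoding given: output falls back to ASCII
    else:
        tree.write(buf, encoding=encoding)
    return buf.getvalue()


# Hypothetical one-element document standing in for the iTunes library file.
source = "<artist>Queensr\u00ffche</artist>".encode("utf-8")
tree = etree.parse(io.BytesIO(source))

ascii_bytes = serialize(tree)                   # non-ASCII text is escaped
utf8_bytes = serialize(tree, encoding="utf-8")  # non-ASCII text is kept as UTF-8

print(ascii_bytes)
print(utf8_bytes)
```

The first result contains only ASCII bytes, with ÿ written as a character reference; the second keeps ÿ as its two-byte UTF-8 sequence, which is what most editors and diff tools will display as the original character.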

Denis Otkidach answered Mar 17 '26 09:03


