
Python UTF-8 conversion

How can I do the following conversion (source -> target) in a Python program?

>>> source = '\\x{4e8b}\\x{696d}'
>>> print source
\x{4e8b}\x{696d}
>>> print type(source)
<type 'str'>
>>> target = u'\u4e8b\u696d'
>>> print target.encode('utf-8')
事業

Thank you.

jack asked Apr 10 '26 09:04


1 Answer

Taking advantage of Blender's idea, you could use re.sub with a callable replacement argument:

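import re

def touni(match):
    # group(1) holds the hex digits between the braces, e.g. '4e8b'
    return unichr(int(match.group(1), 16))

source = '\\x{4e8b}\\x{696d}'
# Replace each \x{HEX} escape with its code point; accept upper- or lowercase hex
print(re.sub(r'\\x\{([\da-fA-F]+)\}', touni, source))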

yields

事業
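
The code above targets Python 2, where unichr exists. On Python 3 the same idea could be sketched as follows, assuming chr takes the place of unichr (str is already Unicode there). The names BRACED_HEX and decode_braced_hex are made up for this illustration and are not part of the original answer.

```python
import re

# Matches \x{HEX} and captures the hex digits (either case).
BRACED_HEX = re.compile(r'\\x\{([0-9a-fA-F]+)\}')

def decode_braced_hex(text):
    """Replace every \\x{HEX} escape in text with the character it names."""
    return BRACED_HEX.sub(lambda m: chr(int(m.group(1), 16)), text)

source = '\\x{4e8b}\\x{696d}'
target = decode_braced_hex(source)
print(target)                  # 事業
print(target.encode('utf-8'))  # b'\xe4\xba\x8b\xe6\xa5\xad'
```

Text that does not match the pattern is left untouched, so the function can be applied to strings that mix escapes with ordinary characters.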
unutbu answered Apr 12 '26 21:04


