I read everything there is to read about Unicode, UTF-8, encoding/decoding and everything, but I still strugle.
I made a short example snippet to illustrate my problem.
I want to print the string 'Geïrriteerd' just like it is written here. I need to use the following code to let it print properly to a file if I run it with a redirect to a file, like 'Test.py > output'
# coding=utf-8
import codecs
import sys
sys.stdout = codecs.getwriter('UTF-8')(sys.stdout)
print u'Geïrriteerd'
But if I do NOT redirect, the code above prints 'Geïrriteerd' to the terminal. If I remove the 'codecs.getwriter' line, it prints fine again to the terminal but will print 'Geïrriteerd' to the file.
How can I get this to print properly in both cases?
I am using Python 2.7 on Windows 10. I know Python 3.x handles unicode better in general, but I can't use that in my project (yet) due to other dependencies.
Since redirection is a shell operation, it makes sense to control the encoding using the shell as well. Fortunately, Python provides an environment variable to control the encoding. Given test.py:
#!python2
# coding=utf-8
print u'Geïrriteerd'
To redirect to a file with a particular encoding, use:
C:\>set PYTHONIOENCODING=utf8
C:\>test >out.txt
Running the script normally with PYTHONIOENCODING undefined will use the encoding of the terminal (in my case cp437):
C:\>set PYTHONIOENCODING=
C:\>test
Geïrriteerd
Your terminal is set up for cp850 instead of UTF-8.
Run chcp 65001.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With