I hit a wall here. I need to redirect all output to a file, and the file must be encoded in UTF-8. The problem is that when using
    import codecs
    import os
    import sys

    # errLog = io.open(os.path.join(os.getcwdu(), u'BashBugDump.log'), 'w',
    #                  encoding='utf-8')
    errLog = codecs.open(os.path.join(os.getcwdu(), u'BashBugDump.log'), 'w',
                         encoding='utf-8')
    sys.stdout = errLog
    sys.stderr = errLog
codecs.open opens the file in binary mode, so line terminators are written as bare \n instead of being translated to the platform's line terminator. I tried using io.open, but this does not play well with the print statement used all over the codebase (see "Python 2.7: print doesn't speak unicode to the io module?" or "python: TypeError: can't write str to text stream").
See also this "won't fix" bug in Python: https://bugs.python.org/issue2131
So what is the one right way to do this in Python 2?
Redirection is a shell operation. You don't have to change the Python code at all, but you do have to tell Python what encoding to use if redirected. That is done with an environment variable. The following code redirects both stdout and stderr to a UTF-8-encoded file:
    set PYTHONIOENCODING=utf8
    python test.py >out.txt 2>&1

    #coding:utf8
    import sys
    print u"我不喜欢你女朋友！"
    print >>sys.stderr, u"你需要一个新的。"
out.txt (encoded in UTF-8)
Hex dump of out.txt
    0000: E6 88 91 E4 B8 8D E5 96 9C E6 AC A2 E4 BD A0 E5
    0010: A5 B3 E6 9C 8B E5 8F 8B EF BC 81 0D 0A E4 BD A0
    0020: E9 9C 80 E8 A6 81 E4 B8 80 E4 B8 AA E6 96 B0 E7
    0030: 9A 84 E3 80 82 0D 0A
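As a quick sanity check on the dump above, the bytes can be decoded back to text. This is only a sketch: the hex values are copied from the dump, and the variable names are made up for illustration. It confirms that the data is valid UTF-8 and that each line ends in CR LF (0D 0A), i.e. the text-mode line terminator on Windows.

```python
import binascii

# Bytes copied from the hex dump of out.txt (offsets removed).
HEX_DUMP = """
E6 88 91 E4 B8 8D E5 96 9C E6 AC A2 E4 BD A0 E5
A5 B3 E6 9C 8B E5 8F 8B EF BC 81 0D 0A E4 BD A0
E9 9C 80 E8 A6 81 E4 B8 80 E4 B8 AA E6 96 B0 E7
9A 84 E3 80 82 0D 0A
"""

raw = binascii.unhexlify(''.join(HEX_DUMP.split()))

# decode() raises UnicodeDecodeError if the bytes are not valid UTF-8.
text = raw.decode('utf-8')

# Keep the line endings so they can be inspected.
lines = text.splitlines(True)

print(len(raw))                                      # 55
print(len(lines))                                    # 2
print(all(line.endswith(u'\r\n') for line in lines)) # True
```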
Note: You do need to print Unicode strings for this to work. Print byte strings and you get the bytes you print.
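If setting the variable in the shell is inconvenient, roughly the same redirection can be done from a small launcher script. The following is a minimal sketch, not part of the original answer: `run_redirected` is a hypothetical helper name, and it assumes the target script should run under the same interpreter as the launcher.

```python
import os
import subprocess
import sys


def run_redirected(script, out_path, encoding='utf8'):
    """Hypothetical helper: run `script` with stdout and stderr sent to out_path.

    PYTHONIOENCODING is set only for the child process, which mirrors
    `set PYTHONIOENCODING=utf8` followed by `python script >out 2>&1`.
    """
    env = dict(os.environ)
    env['PYTHONIOENCODING'] = encoding
    # The child writes the bytes itself, so the file is opened in binary mode.
    with open(out_path, 'wb') as out:
        return subprocess.call([sys.executable, script],
                               stdout=out, stderr=subprocess.STDOUT, env=env)
```

Calling `run_redirected('test.py', 'out.txt')` should then produce a file comparable to the shell version, with the child process still deciding the line terminators.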
codecs.open may force binary mode, but codecs.getwriter doesn't. Give it a file opened in text mode:
    #coding:utf8
    import sys
    import codecs
    sys.stdout = sys.stderr = codecs.getwriter('utf8')(open('out.txt','w'))
    print u"我不喜欢你女朋友！"
    print >>sys.stderr, u"你需要一个新的。"
(same output and hexdump as above)
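If the codebase also prints byte strings, one possible variation is a thin wrapper around io.open that decodes them before writing, so the io text stream never sees a `str`. This is a sketch under assumptions rather than a tested drop-in: `Utf8LogWriter` and its `fallback_encoding` parameter are hypothetical names, and it assumes any byte strings being printed are themselves UTF-8, which may not hold for every codebase.

```python
import io


class Utf8LogWriter(object):
    """Hypothetical file-like object that always writes UTF-8 text."""

    def __init__(self, path, fallback_encoding='utf-8'):
        # io.open uses text mode here, so '\n' is translated to the
        # platform line terminator when writing.
        self._stream = io.open(path, 'w', encoding='utf-8')
        # Assumption: byte strings passed to write() use this encoding.
        self._fallback = fallback_encoding

    def write(self, data):
        # Byte strings (Python 2 `str`) are decoded first, because an io
        # text stream only accepts Unicode text.
        if isinstance(data, bytes):
            data = data.decode(self._fallback, 'replace')
        self._stream.write(data)

    def flush(self):
        self._stream.flush()

    def close(self):
        self._stream.close()
```

It would be installed the same way as the writer above, for example `sys.stdout = sys.stderr = Utf8LogWriter('out.txt')`. Undecodable bytes are replaced rather than raising, which seemed the safer choice for a log file.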