Redirect stdout to a file with unicode encoding while retaining Windows eol in Python 2


I hit a wall here. I need to redirect all output to a file, but I need that file to be encoded in UTF-8. The problem is that when using

# errLog =, u'BashBugDump.log'), 'w',
#                  encoding='utf-8')
errLog =, u'BashBugDump.log'),
                     'w', encoding='utf-8')
sys.stdout = errLog
sys.stderr = errLog

 opens the file in binary mode, resulting in \n line terminators. I tried using instead, but that does not play well with the print statement used all over the codebase (see Python 2.7: print doesn't speak unicode to the io module? or python: TypeError: can't write str to text stream).
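The binary-mode behaviour is easy to demonstrate (a minimal sketch; the file name demo.txt is made up, and the snippet behaves the same under Python 2 and 3):

import codecs

# returns a file opened in binary mode under the hood,
# so '\n' written to it is never translated to the platform line
# separator ('\r\n' on Windows).
with'demo.txt', 'w', encoding='utf-8') as f:
    f.write(u'line one\nline two\n')

with open('demo.txt', 'rb') as f:
    raw =

# Even on Windows the file ends up with bare LF terminators.
assert b'\r\n' not in raw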

I am not the only one having this issue (for instance, see here), but the solution adopted there is specific to the logging module, which we do not use.

See also this won't-fix bug in Python.

So what's the one right way to do this in Python 2?

Option 1

Redirection is a shell operation. You don't have to change the Python code at all, but you do have to tell Python what encoding to use when its output is redirected. That is done with the PYTHONIOENCODING environment variable. The following redirects both stdout and stderr to a UTF-8-encoded file (the script name is assumed here):

set PYTHONIOENCODING=utf8
python >out.txt 2>&1

import sys
print u"我不喜欢你女朋友!"
print >>sys.stderr, u"你需要一个新的。"

out.txt (encoded in UTF-8)

我不喜欢你女朋友!
你需要一个新的。

Hex dump of out.txt

0000: E6 88 91 E4 B8 8D E5 96 9C E6 AC A2 E4 BD A0 E5
0010: A5 B3 E6 9C 8B E5 8F 8B EF BC 81 0D 0A E4 BD A0
0020: E9 9C 80 E8 A6 81 E4 B8 80 E4 B8 AA E6 96 B0 E7
0030: 9A 84 E3 80 82 0D 0A

Note: You do need to print Unicode strings for this to work. If you print byte strings, you get exactly the bytes you printed.
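The effect of the variable can be checked from Python itself; this sketch spawns the current interpreter with stdout piped (which counts as redirected) and inspects the raw bytes. It is an illustration, not part of the original answer:

import os
import subprocess
import sys

# Force UTF-8 on the child's redirected stdout.
env = dict(os.environ, PYTHONIOENCODING='utf8')

# The child prints a single non-ASCII character (U+00E9).
out = subprocess.check_output(
    [sys.executable, '-c', 'print(u"\\u00e9")'], env=env)

# With stdout piped, the character arrives as its two UTF-8 bytes.
assert out.strip() == b'\xc3\xa9'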

Option 2 may force binary mode, but codecs.getwriter doesn't. Give it a file opened in text mode:

import sys
import codecs
sys.stdout = sys.stderr = codecs.getwriter('utf8')(open('out.txt','w'))
print u"我不喜欢你女朋友!"
print >>sys.stderr, u"你需要一个新的。"

(same output and hexdump as above)