Python - Change the string to utf8

I am trying to write Portuguese text to an HTML file, but I am getting some funny characters. How do I fix this?

first = """<p style="color: red; font-family: 'Liberation Sans',sans-serif">{}</p>""".format(sentences1[i])
f.write(first)

Expected Output: Hoje, nós nos unimos ao povo...

Actual Output in browser (Firefox on Ubuntu): Hoje, nós nos unimos ao povo...

I tried doing this:

first = """<p style="color: red; font-family: 'Liberation Sans',sans-serif">{}</p>""".format(sentences1[i])
f.write(first.encode('utf8'))

Output in terminal: UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 65: ordinal not in range(128)

Why am I getting this error, and how can I write other languages to an HTML document without the funny characters?
Or is there a different file type I can write to that supports the font formatting above?


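In Python 2, calling encode() on a byte string first decodes it implicitly with the ASCII codec, which is why f.write(first.encode('utf8')) raises UnicodeDecodeError on the first non-ASCII byte.

Your format string should be a Unicode string too, and so should the value you pass to format(). If sentences1[i] is still a byte string, decode it first, for example with sentences1[i].decode('utf-8'). Writing the result also needs a file opened with an explicit encoding, for example via io.open(..., 'w', encoding='utf-8'):

# Assumes sentences1[i] is already a unicode object and f accepts unicode
first = u"""<p style="color: red; font-family: 'Liberation Sans',sans-serif">{}</p>""".format(sentences1[i])
f.write(first)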
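
For a complete picture, here is a minimal sketch of the whole round trip. It runs on both Python 2 and Python 3. The sample bytes, the output path and the variable names are made up for illustration; they are not from the question. The leading byte order mark in the sample is a guess: byte 0xef at position 65 is the first byte of the inserted sentence, which would be consistent with a UTF-8 BOM (EF BB BF), but the question does not confirm it. The meta charset line is also an addition, on the assumption that the browser was guessing the wrong encoding.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# Hypothetical stand-in for sentences1[i]: raw UTF-8 bytes with a leading BOM.
raw = b"\xef\xbb\xbfHoje, n\xc3\xb3s nos unimos ao povo..."

# Decode bytes to text. "utf-8-sig" drops a leading BOM if one is present
# and otherwise behaves like plain "utf-8".
sentence = raw.decode("utf-8-sig")

# Both the template and the value are Unicode, so no implicit ASCII step occurs.
template = u"""<p style="color: red; font-family: 'Liberation Sans',sans-serif">{}</p>"""

# Declaring the charset tells the browser how to interpret the bytes.
html = u'<meta charset="utf-8">\n' + template.format(sentence)

# Hypothetical output path in the temp directory.
path = os.path.join(tempfile.gettempdir(), "example_utf8.html")

# io.open encodes on write, so no manual .encode() call is needed.
with io.open(path, "w", encoding="utf-8") as f:
    f.write(html)

# Read the file back to check that the text survived unchanged.
with io.open(path, "r", encoding="utf-8") as f:
    round_trip = f.read()
```

The idea is to decode once when the text comes in, work with Unicode everywhere in between, and encode once when writing out.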