python - UnicodeEncodeError: 'charmap' codec can't encode - character maps to <undefined>, print function

1 Answer

深蓝 · Answer 1 · 2021-10-16T21:19:01+0000

I see three solutions to this:

First, change the output encoding so that it always outputs UTF-8. See, e.g., "Setting the correct encoding when piping stdout in Python", though I could not get those examples to work.
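
On Python 3.7 and later, one possible way to approach this first option is the `reconfigure()` method of `io.TextIOWrapper`. The sketch below is not taken from the linked question: `force_utf8` is a hypothetical helper name, and an in-memory cp850 stream stands in for the console so the effect can be observed.

```python
import io

def force_utf8(stream, errors="replace"):
    """Switch a text stream to UTF-8 in place; return False if unsupported."""
    reconfigure = getattr(stream, "reconfigure", None)  # Python 3.7+ text streams
    if reconfigure is None:
        return False
    reconfigure(encoding="utf-8", errors=errors)
    return True

# Stand-in for a console that was opened with the cp850 code page.
buffer = io.BytesIO()
stream = io.TextIOWrapper(buffer, encoding="cp850", errors="strict", newline="\n")

force_utf8(stream)
print("Стоескер", file=stream)  # would raise UnicodeEncodeError under cp850
stream.flush()
```

In a real script the call would presumably be `force_utf8(sys.stdout)` near startup. Whether the terminal then renders UTF-8 correctly depends on the terminal, not on Python.
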
Second, make the output aware of your target charset, as the following example code does (Python 2):
```
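# -*- coding: utf-8 -*-
import sys

# Console encoding; it is None when output is piped, so fall back to ASCII
encoding = sys.stdout.encoding or 'ascii'
print encoding
# Characters the console encoding cannot represent come out as '?'
print u"Stöcker".encode(encoding, errors='replace')
print u"Стоескер".encode(encoding, errors='replace')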
```
This example properly replaces any character in my name that the console encoding cannot represent with a question mark.

If you create a custom print function, e.g. called myprint, that uses this mechanism to encode output properly, you can simply replace print with myprint wherever necessary without making the whole code look ugly.
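
A minimal Python 3 sketch of such a `myprint` follows. The signature and the round-trip through the target encoding are assumptions made for illustration, not the author's implementation, and an in-memory cp850 stream stands in for the console.

```python
import io
import sys

def myprint(*args, sep=" ", end="\n", file=None):
    """print() stand-in that replaces unencodable characters with '?'."""
    stream = sys.stdout if file is None else file
    # Some stream objects report no encoding; assume ASCII in that case.
    encoding = getattr(stream, "encoding", None) or "ascii"
    text = sep.join(str(arg) for arg in args) + end
    # Round-trip through the target encoding so nothing unencodable remains.
    safe = text.encode(encoding, errors="replace").decode(encoding)
    stream.write(safe)

# Stand-in for a cp850 console: 'ö' exists in cp850, Cyrillic letters do not.
buffer = io.BytesIO()
stream = io.TextIOWrapper(buffer, encoding="cp850", errors="strict", newline="\n")
myprint("Stöcker", "Стоескер", file=stream)
stream.flush()
result = buffer.getvalue().decode("cp850")
```

Here `result` keeps the `ö` and has one question mark per Cyrillic letter, where a plain `print` to the same strict stream would have raised `UnicodeEncodeError`.
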
Third, reset the output encoding globally at the beginning of the program:

The page http://www.macfreek.nl/memory/Encoding_of_Python_stdout has a good summary of what to do to change the output encoding. The section "StreamWriter Wrapper around Stdout" is especially interesting. Essentially, it says to change the I/O encoding like this:

In Python 2:
```
import codecs
import sys

# Wrap the standard streams so unicode output is encoded as cp850
if sys.stdout.encoding != 'cp850':
  sys.stdout = codecs.getwriter('cp850')(sys.stdout, 'strict')
if sys.stderr.encoding != 'cp850':
  sys.stderr = codecs.getwriter('cp850')(sys.stderr, 'strict')
```
In Python 3:
```
import codecs
import sys

# In Python 3 the writer must wrap the underlying byte stream (.buffer)
if sys.stdout.encoding != 'cp850':
  sys.stdout = codecs.getwriter('cp850')(sys.stdout.buffer, 'strict')
if sys.stderr.encoding != 'cp850':
  sys.stderr = codecs.getwriter('cp850')(sys.stderr.buffer, 'strict')
```
If the output is HTML, for instance in a CGI script, you can replace 'strict' with 'xmlcharrefreplace' so that characters the encoding cannot represent are written as numeric character references.
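
To illustrate what the 'xmlcharrefreplace' error handler produces, here is a small Python 3 sketch. `to_html_safe` is a hypothetical helper name introduced only for this example.

```python
def to_html_safe(text, encoding):
    """Encode text, turning unencodable characters into &#NNN; references."""
    return text.encode(encoding, errors="xmlcharrefreplace")

# 'ö' (U+00F6, decimal 246) does not exist in ASCII, so it becomes a reference.
print(to_html_safe("Stöcker", "ascii"))  # b'St&#246;cker'

# In cp850 the 'ö' is encodable and is written as a single byte instead.
print(len(to_html_safe("Stöcker", "cp850")))  # 7
```

A browser renders `&#246;` as `ö`, so the text survives even when the byte encoding of the page cannot carry it directly.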

Feel free to modify these approaches, for example by setting different encodings. Note that they still cannot output data whose encoding is unknown: any data, input, or text must be correctly convertible to unicode first (Python 2):
```
# -*- coding: utf-8 -*-
import sys
import codecs
sys.stdout = codecs.getwriter("iso-8859-1")(sys.stdout, 'xmlcharrefreplace')
print u"Stöcker"                # works
print "Stöcker".decode("utf-8") # works
print "Stöcker"                 # fails
```
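
The same point carries over to Python 3, where the distinction is between `bytes` and `str`. The sketch below assumes the raw bytes are UTF-8, as they might be when read from a file or socket, and shows that decoding with the wrong codec raises no error but yields garbled text.

```python
text = "Stöcker"
raw = text.encode("utf-8")          # bytes as they might arrive from outside

right = raw.decode("utf-8")         # correct codec: original text is recovered
wrong = raw.decode("iso-8859-1")    # wrong codec: succeeds, but gives mojibake

print(right == text)   # True
print(wrong == text)   # False
print(ascii(wrong))    # 'St\xc3\xb6cker'
```

The two bytes that UTF-8 uses for `ö` are read as two separate Latin-1 characters, which is why the source encoding has to be known before any output encoding can help.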
