How to output Python3 (unicode) strings while ignoring un-encodable characters

halloleo

Consider the following terminal command line:

python3 -c 'print("hören")'

In most terminals this prints "hören" (German for "to hear"), but in some terminals you get the error:

UnicodeEncodeError: 'ascii' codec can't encode character '\xf6' in position 1: ordinal not in range(128)

In my Python 3 program I don't want the mere act of printing something to raise an exception like this. Instead, I'd rather output only the characters that can be encoded.

So my question is: How can I output (unicode) strings in Python 3 while ignoring un-encodable characters?


Some notes

What I've tried so far

  1. I tried using sys.stdout.write instead of print, but the encoding problem can still occur.

  2. I tried encoding the string to bytes via

    bytes=line.encode('utf-8')
    

    This never raises an exception on print, but even in capable terminals the output is the bytes literal, with non-ASCII characters shown as escaped byte values (e.g. b'h\xc3\xb6ren').

  3. I tried using the decode method with the 'ignore' parameter:

    bytes=line.encode('utf-8')
    decoded=bytes.decode('utf-8', 'ignore')
    print(decoded)
    

    But the problem is not the decoding of the string but the encoding in the print function.
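The third attempt round-trips through UTF-8, which can represent every character, so nothing is ever dropped. A minimal sketch of filtering against the output encoding instead is shown here; the helper name `strip_unencodable` is hypothetical, and it assumes the target encoding (here ASCII) is known in advance:

```python
def strip_unencodable(text, encoding="ascii"):
    """Return `text` without the characters that `encoding` cannot represent."""
    # errors="ignore" on *encode* silently skips unencodable characters.
    raw = text.encode(encoding, errors="ignore")
    # Decoding with the same codec cannot fail, since `raw` was produced by it.
    return raw.decode(encoding)


if __name__ == "__main__":
    # With an ASCII target the "ö" is dropped, so this prints only ASCII text.
    print(strip_unencodable("hören"))
    # For comparison: printing bytes (attempt 2) shows the bytes literal.
    print("hören".encode("utf-8"))
```

This drops characters before they reach print, so the result is safe on any ASCII-compatible terminal, at the cost of also dropping them on terminals that could have shown them.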

Here are some terminals that appear unable to handle all characters:

  • bash shell inside Emacs on macOS.

  • Receiving a "printed" string in AppleScript via do shell script, e.g.:

    set txt to do shell script "/usr/local/bin/python3 -c \"print('hören')\" "
    

Update: In all of these terminals, locale.getpreferredencoding() returns US-ASCII.
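To check what a given terminal reports, something like the following sketch can be run in it. The function name `describe_output_encoding` is hypothetical; the attributes it reads (`encoding` and `errors` on the stream) are standard on Python's text streams:

```python
import locale
import sys


def describe_output_encoding(stream=None):
    """Collect the encoding settings that decide whether print() can fail."""
    stream = sys.stdout if stream is None else stream
    return {
        # Encoding derived from the locale; US-ASCII in the terminals above.
        "preferred": locale.getpreferredencoding(),
        # Encoding and error handler actually used by the stream.
        # getattr() guards against replaced streams that lack these attributes.
        "stream_encoding": getattr(stream, "encoding", None),
        "stream_errors": getattr(stream, "errors", None),
    }


if __name__ == "__main__":
    for key, value in describe_output_encoding().items():
        print(f"{key}: {value}")
```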

python-3.x unicode character-encoding

Answers

answered 8 months ago Ernesto #1

My preferred way is to set the PYTHONIOENCODING environment variable depending on the terminal you are using.

For UTF-8-enabled terminals you can do:

export PYTHONIOENCODING='utf-8'

To print '?' in place of un-encodable characters in ASCII terminals, you can do:

export PYTHONIOENCODING='ascii:replace'

Or even better, if you don't care about the encoding, you should be able to do:

export PYTHONIOENCODING=':replace'
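If changing the environment is not an option, roughly the same effect as ':replace' can be had from inside the program. The following is only a sketch, assuming Python 3.7 or later (where text streams gained `reconfigure()`); the helper name `make_tolerant` is hypothetical, and the ASCII-only terminal is simulated with an in-memory stream:

```python
import io
import sys


def make_tolerant(stream, errors="replace"):
    """Change a text stream's error handler so writes do not raise UnicodeEncodeError."""
    # reconfigure() exists on io.TextIOWrapper since Python 3.7; streams
    # without it (e.g. some replaced stdout objects) are left untouched.
    if hasattr(stream, "reconfigure"):
        stream.reconfigure(errors=errors)
    return stream


if __name__ == "__main__":
    # Simulated ASCII-only terminal; newline="\n" disables newline translation.
    fake = io.TextIOWrapper(io.BytesIO(), encoding="ascii", newline="\n")
    make_tolerant(fake)
    print("hören", file=fake)
    fake.flush()
    print(fake.buffer.getvalue())  # b'h?ren\n'

    # In a real program, apply it once to stdout at startup.
    make_tolerant(sys.stdout)
```

Passing errors="ignore" instead drops the un-encodable characters rather than substituting '?'.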
