
Specifying the encoding of unicode strings for the print statement in Python 2.x

Python

Last updated at 2015-10-28, posted at 2015-10-28

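The print statement in Python 2.x implicitly converts a unicode object to str before writing it to standard output.
The encoding used for this unicode --> str conversion is sys.stdout.encoding. In an interactive terminal this usually matches locale.getpreferredencoding(), which is what the sessions below inspect.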
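
To make the implicit conversion concrete, here is a minimal sketch of roughly what print does with a unicode object. The helper name encode_like_print is hypothetical, and the fallback to sys.getdefaultencoding() when the stream reports no encoding is an assumption that mirrors Python 2.x behaviour. The snippet only encodes to bytes, so it also runs on Python 3.

```python
# -*- coding: utf-8 -*-
import sys


def encode_like_print(text, encoding=None):
    """Hypothetical helper: encode text roughly the way Python 2's print would.

    If no encoding is given, fall back to the stream's encoding, then to the
    interpreter default (an assumption that mirrors Python 2.x behaviour).
    """
    if encoding is None:
        encoding = getattr(sys.stdout, 'encoding', None) or sys.getdefaultencoding()
    return text.encode(encoding)


text = u'あああ'
utf8_bytes = encode_like_print(text, 'utf-8')    # 3 bytes per character
cp932_bytes = encode_like_print(text, 'cp932')   # 2 bytes per character
```

The same three characters become different byte sequences depending on the encoding, which is why the terminal and the encoding have to agree for the text to display correctly.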

Trying it out

↓ Running Python from Terminal on a Mac.

>>> import locale
>>> locale.getpreferredencoding()
'UTF-8'

>>> print u'あああ'
あああ

↓ Running Python from the Command Prompt on Windows 7.

>>> import locale
>>> locale.getpreferredencoding()
'cp932'

>>> print u'あああ'
あああ

↓ Running Python from the PyDev Console in Eclipse.

>>> import locale
>>> locale.getpreferredencoding()
'US-ASCII'

>>> print u'あああ'
Traceback (most recent call last):
  File "<input>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

In this unusual environment the encoding is ascii, so printing u'あああ' raises UnicodeEncodeError.
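
The same failure can be reproduced without PyDev by encoding to ascii by hand. This is a small sketch, not the exact code path print takes; try_encode is a hypothetical helper name introduced only for illustration.

```python
# -*- coding: utf-8 -*-


def try_encode(text, encoding):
    """Hypothetical helper: return (bytes, None) on success or (None, error)."""
    try:
        return text.encode(encoding), None
    except UnicodeEncodeError as exc:
        return None, exc


encoded, error = try_encode(u'あああ', 'ascii')
# error.start and error.end delimit the characters that could not be encoded
# (end is exclusive), which corresponds to "position 0-2" in the traceback.
```

Inspecting error.encoding, error.start and error.end shows which codec failed and on which characters, the same information the traceback message carries.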

Converting with an encoding you specify

If you want to set the encoding explicitly, you can do it as follows.

>>> import sys, codecs
>>> sys.stdout = codecs.getwriter('utf-8')(sys.stdout)

>>> print u'あああ'
あああ
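
To see what the wrapper does, the sketch below applies the same codecs.getwriter call to an in-memory io.BytesIO buffer instead of sys.stdout. The buffer is my substitution so that the written bytes can be inspected, and the variable names are illustrative.

```python
# -*- coding: utf-8 -*-
import codecs
import io

raw = io.BytesIO()                         # stand-in for the byte-oriented sys.stdout
writer = codecs.getwriter('utf-8')(raw)    # same wrapping as in the session above

writer.write(u'あああ\n')                   # unicode in, UTF-8 bytes out
written = raw.getvalue()
```

The writer encodes every unicode string itself before passing it to the underlying stream, so the result no longer depends on the encoding the environment reports.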

Suppressing the error

This is very likely to hide bugs, though...

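import sys, codecs, locale
# errors='replace' substitutes '?' for unencodable characters instead of raising (Python 2.x).
_ENC_LOCALE = locale.getpreferredencoding()
sys.stdout = codecs.getwriter(_ENC_LOCALE)(sys.stdout, errors='replace')
sys.stderr = codecs.getwriter(_ENC_LOCALE)(sys.stderr, errors='replace')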
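
The effect of errors='replace' can be checked on an in-memory buffer. In this sketch io.BytesIO stands in for sys.stdout and the encoding is fixed to ascii to imitate the PyDev case; both are assumptions made for illustration.

```python
# -*- coding: utf-8 -*-
import codecs
import io

raw = io.BytesIO()                                          # stand-in for sys.stdout
writer = codecs.getwriter('ascii')(raw, errors='replace')   # ascii imitates the PyDev console

writer.write(u'abc あああ\n')    # no exception, even though ascii cannot encode the kana
replaced = raw.getvalue()
```

Each unencodable character is written as '?', so the original text cannot be recovered from the output. That is how this approach can hide bugs.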
