How to input Japanese with Python curses

Posted at 2020-04-12, last updated at 2020-04-12

Environment

Mint 19.10
Python 3.7.5

Introduction

When reading input with curses getch() on a UTF-8 terminal, Japanese (multibyte) characters were not received correctly, so I wrote the code to handle them.

Symptom

For example, typing 'あ' arrives as three separate inputs, the bytes 0xe3 0x81 0x82, one per getch() call. That is inconvenient: what we actually want is the single code point 0x3042.
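The relationship between the three bytes and the code point can be checked without curses. The snippet below is only an illustration using Python's built-in str.encode and ord; the variable names are made up for this example.

```python
ch = "あ"

# The UTF-8 encoding of the character: these are the values getch() hands over
# one at a time.
encoded = ch.encode("utf-8")
print([hex(b) for b in encoded])  # ['0xe3', '0x81', '0x82']

# The Unicode code point, which is the value we actually want in `key`.
print(hex(ord(ch)))  # 0x3042
```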

Cause

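getch() looks at input one byte at a time, so it cannot return a character that takes three bytes in UTF-8, such as a Japanese one, in a single call. The fix here is to write a small UTF-8 decoder by hand.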
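To see what "one byte at a time" means without a terminal, the following sketch feeds the three bytes of 'あ' one by one to the standard library's incremental UTF-8 decoder (codecs.getincrementaldecoder). This is a different technique from the hand-written decoder in this article and is shown only to illustrate that the character is complete once the last byte arrives.

```python
import codecs

# An incremental decoder buffers incomplete sequences between calls.
decoder = codecs.getincrementaldecoder("utf-8")()

# Feed the bytes one at a time, the way getch() would deliver them.
outputs = [decoder.decode(bytes([b])) for b in (0xE3, 0x81, 0x82)]
print(outputs)  # ['', '', 'あ']
```

The first two calls return an empty string because the sequence is still incomplete; only the third call yields the character.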

Solution

The lead byte determines how many bytes the character occupies, so branch on it as follows.
Note: this is pseudocode. Setup such as stdscr is omitted; create the window object as appropriate.

import curses

window = curses.initscr()
key = window.getch()

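# Handle multibyte characters.
# A Japanese character is 3 bytes, so the bytes have to be pooled. The lead
# byte determines how many bytes follow, and each branch reads that many.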
text_pool = [key]
if 0x00 <= key <= 0x7f:
    # 1 byte: nothing to do
    # ASCII-compatible range
    pass
elif 0x80 <= key <= 0xbf:
    # Continuation byte (second byte or later); it should never arrive first
    print(key)
    exit(1)
elif 0xc0 <= key <= 0xdf:
    # 2 bytes: e.g. letters with umlauts
    text_pool.append(window.getch())
    # text_pool => [lead byte, continuation byte] as integers
    # bit layout:  110aaabb 10bbbbbb
    # code point:  00000aaa bbbbbbbb
    # Mask off the marker bits, join the payload bits, and assign to key
    a, b = text_pool
    key = ((a & 0b00011111) << 6) | (b & 0b00111111)
elif 0xe0 <= key <= 0xef:
    # 3 bytes: Japanese lands here
    for _ in range(2):
        text_pool.append(window.getch())
    # Unpack only after both continuation bytes have been read
    a, b, c = text_pool
    # bit layout:  1110xxxx 10xxxxyy 10yyyyyy
    # bytes:       a        b        c
    key = ((a & 0b00001111) << 12) | ((b & 0b00111111) << 6) | (c & 0b00111111)
elif 0xf0 <= key <= 0xf7:
    # 4 bytes: never seen one in practice, but handled to avoid bugs
    for _ in range(3):
        text_pool.append(window.getch())
    a, b, c, d = text_pool
    # bit layout:  11110xxx 10xxyyyy 10yyyyzz 10zzzzzz
    key = (((a & 0b00000111) << 18) | ((b & 0b00111111) << 12)
           | ((c & 0b00111111) << 6) | (d & 0b00111111))
else:
    # Special keys (curses.KEY_* values above 0xff); the bytes 0xf8-0xff,
    # which are not valid UTF-8 lead bytes, also fall through to here
    pass


print(chr(key))
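The same branching can be packaged as a function that is testable without a terminal. The following is a minimal sketch, not the article's original code: the names utf8_sequence_length and read_code_point are hypothetical, and getch is assumed to be any zero-argument callable that returns one byte value per call (for example the bound method window.getch). Continuation bytes are not validated here.

```python
def utf8_sequence_length(lead):
    """Return how many bytes the UTF-8 sequence starting with `lead` occupies."""
    if 0x00 <= lead <= 0x7F:
        return 1
    if 0xC0 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF7:
        return 4
    # 0x80-0xbf are continuation bytes; 0xf8-0xff never appear in UTF-8.
    raise ValueError(f"not a UTF-8 lead byte: {lead:#x}")


def read_code_point(getch):
    """Read one character through `getch` and return its Unicode code point."""
    lead = getch()
    if lead < 0 or lead > 0xFF:
        # -1 (no input in no-delay mode) or a special key such as KEY_UP:
        # pass the value through unchanged.
        return lead
    length = utf8_sequence_length(lead)
    if length == 1:
        return lead
    # Keep only the payload bits of the lead byte (5, 4 or 3 bits).
    code_point = lead & (0xFF >> (length + 1))
    # Each continuation byte contributes its low 6 bits.
    for _ in range(length - 1):
        code_point = (code_point << 6) | (getch() & 0x3F)
    return code_point


# Simulate getch() with an iterator over the bytes of 'あ'.
fake_input = iter("あ".encode("utf-8"))
print(hex(read_code_point(lambda: next(fake_input))))  # 0x3042
```

With a real curses window, the call would look like key = read_code_point(window.getch), after which chr(key) gives the typed character.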
