
[Python] Handling Japanese characters in the attachment filenames of received mail

Posted at 2021-02-05

Environment

Python 3.7.5

Out of scope

The steps for obtaining a Gmail app password are not covered here.

Logging in to the mail server

mail.py

import imaplib
import email
import email.header

# Your own mail address
gmail_address = 'hoge@gmail.com'
# App password issued for your Google account
gmail_app_password = 'hogehogehogehoge'

gmail = imaplib.IMAP4_SSL("imap.gmail.com", 993)
gmail.login(gmail_address, gmail_app_password)
# select() chooses the mailbox to work on; called without arguments it opens INBOX
gmail.select()

Fetching the list of mails

mail.py

# Change 'ALL' to another IMAP search key (for example 'UNSEEN') to narrow the search
head, data = gmail.search(None, 'ALL')
for num in data[0].split():
    result, email_data = gmail.fetch(num, '(RFC822)')
    raw_email = email_data[0][1]
    # Parse the raw bytes directly; decoding the whole message as UTF-8 first
    # can fail on mails written in other charsets
    email_message = email.message_from_bytes(raw_email)

    body = ''
    attachments = []

    for part in email_message.walk():
        if part.get_content_type() == "text/plain":
            charset = part.get_content_charset()
            # Decode the body with the charset declared by the part
            payload = part.get_payload(decode=True)
            if charset is not None:
                # The original code also passed this text through an
                # emoji-stripping helper that is not shown in the post
                body = payload.decode(charset, "ignore")
        else:
            # get_filename() is usually enough, but a name containing Japanese
            # comes back still encoded and has to be decoded by hand
            file_name = part.get_filename()
            # Only parts that carry a filename are treated as attachments
            if file_name:
                # Split the raw name into (fragment, charset) pairs
                file_name_raw = email.header.decode_header(file_name)
                # The name as it is before charset decoding
                encoded_file_name = file_name_raw[0][0]
                # Charset of the name (None for a plain ASCII name)
                file_charset = file_name_raw[0][1]
                # Decode only when a charset is given
                if file_charset:
                    file_name = encoded_file_name.decode(file_charset)

                attachment = {
                    'name': file_name,
                    'data': part.get_payload(decode=True)
                }
                attachments.append(attachment)
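
The 'ALL' argument passed to gmail.search() in the listing above is an IMAP search key, and it can be swapped for a narrower one. As a rough sketch, the helper below builds such a string for "unread mails since a given day". The names imap_date and build_search_criteria are made up for this example and are not part of imaplib.

```python
import datetime

# IMAP expects English month abbreviations whatever the local locale is,
# so they are spelled out here instead of relying on strftime('%b')
_MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def imap_date(day):
    """Format a datetime.date as dd-Mon-yyyy, the form used by IMAP SEARCH."""
    return f"{day.day:02d}-{_MONTHS[day.month - 1]}-{day.year}"


def build_search_criteria(since=None, unseen_only=False):
    """Hypothetical helper: combine a few IMAP search keys into one string."""
    keys = []
    if unseen_only:
        keys.append("UNSEEN")
    if since is not None:
        keys.append(f"SINCE {imap_date(since)}")
    if not keys:
        # Nothing to filter on, so fall back to every message
        return "ALL"
    return "(" + " ".join(keys) + ")"


criteria = build_search_criteria(since=datetime.date(2021, 2, 5), unseen_only=True)
print(criteria)
# With a logged-in connection this would be used as:
#   head, data = gmail.search(None, criteria)
```

Keeping the criteria in a small function like this means the fetch loop itself does not need to change when the filter does.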

part.get_filename() did not decode filenames containing Japanese correctly, so decoding the name myself solved the problem.
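
To see the behaviour without connecting to a mail server, the sketch below builds a small sample part whose filename is a MIME encoded-word and decodes it. It is only one possible variation: instead of reading the first fragment returned by decode_header(), it joins all fragments with email.header.make_header(). The function name decode_mime_filename and the filename テスト.txt are assumptions made for this example.

```python
import email
from email.header import Header, decode_header, make_header


def decode_mime_filename(raw_name):
    """Return raw_name with any MIME encoded-words decoded to text."""
    # decode_header() splits the value into (fragment, charset) pairs and
    # make_header() joins them back into a single Header object
    return str(make_header(decode_header(raw_name)))


# Build a sample part whose filename is an encoded-word
# (the filename is made up for this demonstration)
encoded_name = Header('テスト.txt', 'utf-8').encode()
sample_part = email.message_from_string(
    'Content-Type: application/octet-stream\n'
    f'Content-Disposition: attachment; filename="{encoded_name}"\n'
    '\n'
    'dummy'
)

raw_name = sample_part.get_filename()
print(raw_name)                        # still in =?utf-8?b?...?= form
print(decode_mime_filename(raw_name))  # readable name
```

A plain ASCII name passes through this helper unchanged, so it can be applied to every attachment without first checking whether a charset is present.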
