
Fixing garbled Japanese text in pandas-profiling

Posted at 2021-06-19 (last updated at 2021-06-19)

0. Intro

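Data from ProbSpace and Nishika is mostly in Japanese, so plots that cannot display Japanese are a fairly serious problem, and replacing everything with alphabetic names is a chore.
Plenty has been written about fixing garbled Japanese in pandas-profiling, but some of it is outdated, and because the problem is almost the same thing as Japanese support in matplotlib and in seaborn, the information ends up being confusing. I thought a consolidated article would be useful, so I wrote this one.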

1. My environment

I work from VSCode on Windows, connecting to an Ubuntu Server through Docker.

Ubuntu Server 20.04

Docker: kaggle/python-gpu-build
python: 3.7.10
matplotlib: 3.4.2
seaborn: 0.11.1
pandas-profiling: 3.0.0

Windows 10

VSCode + Jupyter

2. Procedure

The steps are below. Basically, nothing needs to be done on the VSCode side.

2.1. Check whether a Japanese font is installed in the first place

Reference article: "Python matplotlib: creating a list of usable fonts"
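As a minimal sketch of this check, the snippet below lists the fonts matplotlib knows about and filters them by name. The helper names `find_japanese_fonts` and `installed_font_names` are hypothetical, and the default keyword "IPA" is an assumption chosen to match the IPA fonts used later in this article; adjust it for other fonts.

```python
def find_japanese_fonts(font_names, keywords=("IPA",)):
    """Return the sorted, de-duplicated names that contain one of `keywords`.

    The match is a case-insensitive substring test, so it is only a rough
    filter (an assumption of this sketch), not a real language check.
    """
    lowered = [keyword.lower() for keyword in keywords]
    return sorted(
        {
            name
            for name in font_names
            if any(keyword in name.lower() for keyword in lowered)
        }
    )


def installed_font_names():
    """Return the names of all fonts matplotlib currently has registered."""
    # Imported lazily so that the filter above can be used without matplotlib.
    from matplotlib import font_manager

    return sorted({font.name for font in font_manager.fontManager.ttflist})
```

Running `print(find_japanese_fonts(installed_font_names()))` in a notebook cell prints an empty list when no matching font is registered, in which case step 2.2 is needed.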

2.2. If not, install a font

apt install -y fonts-ipafont

2.3. Make the font usable in pandas_profiling

Add "IPAMincho" to the "font.sans-serif" list in /opt/conda/lib/python3.7/site-packages/pandas_profiling/visualisation/context.py (L40-L46):

        "font.sans-serif": [
            "Arial",
            "Liberation Sans",
            "Bitstream Vera Sans",
            "sans-serif",
            "IPAMincho",
        ],

2.4. Make it usable in seaborn as well, which pandas_profiling (apparently) calls

Edit /opt/conda/lib/python3.7/site-packages/seaborn/rcmod.py (L205-L207) so that "font.family" is ["IPAMincho"] and "IPAMincho" is added to "font.sans-serif":

        "font.family": ["IPAMincho"],
        "font.sans-serif": ["Arial", "DejaVu Sans", "Liberation Sans",
                            "Bitstream Vera Sans", "sans-serif", "IPAMincho"],
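Because steps 2.3 and 2.4 edit files inside site-packages by hand, it can help to confirm that the edits landed in the copies Python actually imports. The following is only a sketch: the helper names are hypothetical, the relative paths are taken from the two steps above, and the check is a plain text search for the quoted font name rather than a parse of the file.

```python
import importlib.util
from pathlib import Path

# Relative paths taken from steps 2.3 and 2.4 of this article.
TARGETS = {
    "pandas_profiling": "visualisation/context.py",
    "seaborn": "rcmod.py",
}


def package_dir(package_name):
    """Return the directory of an installed package, or None if not found."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def file_mentions_font(path, font_name="IPAMincho"):
    """Return True if the file contains the font name in double quotes."""
    text = Path(path).read_text(encoding="utf-8")
    return f'"{font_name}"' in text


def check_edits(font_name="IPAMincho"):
    """Map each package to True/False, or None if the file was not found."""
    results = {}
    for package, relative in TARGETS.items():
        base = package_dir(package)
        target = base / relative if base is not None else None
        if target is None or not target.is_file():
            results[package] = None
        else:
            results[package] = file_mentions_font(target, font_name)
    return results
```

Calling `check_edits()` returns a dictionary keyed by package name; a `False` value suggests the file was found but does not yet mention the font.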

2.5. Delete matplotlib's font cache

rm -f /root/.cache/matplotlib/fontlist-v330.json
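The cache file name above includes a version tag and the path assumes the root user, so both may differ in another environment. As a hedged alternative, the sketch below removes every `fontlist-*.json` file in a given directory; the helper name `remove_font_cache` and the glob pattern are assumptions of this sketch, not something matplotlib provides.

```python
from pathlib import Path


def remove_font_cache(cache_dir):
    """Delete fontlist-*.json files in `cache_dir` and return their names."""
    removed = []
    for cache_file in sorted(Path(cache_dir).glob("fontlist-*.json")):
        cache_file.unlink()
        removed.append(cache_file.name)
    return removed
```

In a notebook, `import matplotlib` followed by `remove_font_cache(matplotlib.get_cachedir())` targets the cache directory that matplotlib itself reports, after which the restart in the next step lets the font list be rebuilt.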

2.6. Restart Jupyter

In my case I am working in VSCode, so reopening the notebook was enough.

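3. Notes

Since this is a font setting, you would expect to need

import seaborn as sns
sns.set(font='IPAMincho')

but because the library code itself was edited, the default is presumably replaced; Japanese was displayed even without the setting above.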
