
Let's draw a Radiomics heat map

Posted at 2022-05-17

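Overview

In radiomics, a large number of image features are commonly used as explanatory variables for machine learning to build a predictive model.
For example, in a classification problem, if the explanatory variables differ between classes, a predictive model with high classification accuracy can be built.
If, before building the model, you can roughly check such differences in the explanatory variables or group the variables together, you can proceed with machine learning more efficiently.
Heat maps are sometimes used to present such group differences clearly.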

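Dataset

Radiomics features extracted with RadiomicsJ from the RSNA 1p19q codel dataset
LABEL: 0 = no co-deletion, 1 = co-deletion

Load the data and standardize it.
(seaborn can also standardize the data, but here we standardize beforehand as an example.)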

# Location of the data (CSV) (public link)
# https://drive.google.com/file/d/1FNsZcDGoiPdGE5K4Iyb2kA51tREhWZB5/view?usp=sharing

import seaborn as sns; sns.set_theme(color_codes=True)
import pandas as pd
import numpy as np

# Upload the CSV from the link above to Colab first
features = pd.read_csv("RadiomicsJ_result_weka.csv", dtype=np.float64)
# NaN values (the result of division by zero) are stored as strings in the CSV;
# read_csv parses them as missing values, so replace them with 0.
features = features.fillna(0)
# Sort the rows by class label
sorted_features = features.sort_values(by=['LABEL'], ignore_index=True) # 0: no codel, 1: codel
# Separate the label from the dataset
codel = sorted_features.pop("LABEL")

# standardize
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
scaler.fit(sorted_features)
std_ds = scaler.transform(sorted_features)
std_ds = pd.DataFrame(std_ds, columns=sorted_features.columns)
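
As mentioned above, seaborn can also standardize the data itself. The following is a minimal sketch of that alternative, assuming the `z_score=1` option of `sns.clustermap` (column-wise z-scores). The `demo` table and its `feature_*` column names are synthetic stand-ins, not the RadiomicsJ data. Note that the values can differ very slightly from `StandardScaler`, which divides by the population standard deviation.

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature table (hypothetical names):
# 20 subjects x 5 features, each feature on a different scale.
demo = pd.DataFrame(
    rng.normal(loc=[0, 10, -5, 100, 3], scale=[1, 5, 2, 50, 0.1], size=(20, 5)),
    columns=[f"feature_{i}" for i in range(5)],
)

# z_score=1 asks seaborn to standardize each column before plotting.
# Clustering is turned off so that only the scaling is demonstrated.
g = sns.clustermap(
    demo,
    z_score=1,
    row_cluster=False, col_cluster=False,
    vmin=-1.5, vmax=1.5,
)

# The scaled values that were actually plotted
scaled = g.data2d
print(scaled.mean().round(6))
print(scaled.std().round(6))
```

Standardizing beforehand, as in this article, has the advantage that the same `std_ds` can be reused for every plot.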

Heat map

First, a simple heat map.

# Simple heat map
ax = sns.heatmap(std_ds, vmin=-1.5, vmax=1.5)
ax.set_xlabel("features")
ax.set_ylabel("subjects")
ax.set_title("Radiomic features heatmap")

Subjects with co-deletion appear to have lower values for the features related to T2W image signal intensity.
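
A visual impression like this can be cross-checked numerically. The sketch below compares the per-class mean of each standardized feature. It runs on synthetic data in which two features were deliberately shifted for class 1; the column names (`t2_mean` and so on) are hypothetical and are not RadiomicsJ feature names.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 40  # subjects per class (assumption for this demo)

# Synthetic labels and features; names are hypothetical
labels = pd.Series([0] * n + [1] * n, name="LABEL")
demo = pd.DataFrame(
    rng.normal(size=(2 * n, 4)),
    columns=["t2_mean", "t2_median", "shape_a", "texture_b"],
)
# Plant a difference: class 1 gets lower values for the two "T2" features
demo.loc[labels == 1, ["t2_mean", "t2_median"]] -= 2.0

# Standardize each column, as done for std_ds in the article
demo = (demo - demo.mean()) / demo.std(ddof=0)

# Mean of each feature per class, then class 1 minus class 0
class_means = demo.groupby(labels).mean()
diff = (class_means.loc[1] - class_means.loc[0]).sort_values()
print(diff)
```

With the real data, the same calculation would be `std_ds.groupby(codel).mean()`; the most negative entries of `diff` are the features that are lower with co-deletion.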

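Heat map with class labels

This heat map makes the differences between classes easier to see.
Build a lookup table (LUT) of colors per class and pass it in.

# Prepare the LUT
# red: 0, blue: 1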
lut = dict(zip(codel.unique(), "rb"))
# print(lut)
row_colors = codel.map(lut)

# simple heat map with class label
g = sns.clustermap(std_ds,
                   cmap="viridis",
                   # Turn off the clustering
                   row_cluster=False, col_cluster=False,
                   # Add colored class labels
                   row_colors=row_colors, col_colors=None,
                   # Make the plot look better when many rows/cols
                   linewidths=0,
                   vmin=-1.5, vmax=1.5,
                   xticklabels=False, # features list
                   yticklabels=False # subjects
                   )
g.ax_heatmap.set_title('Cluster map')
g.ax_heatmap.set_xlabel('features')
g.ax_heatmap.set_ylabel('subjects')
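
The colored strip produced by `row_colors` has no legend by default, so a reader cannot tell which color is which class. One possible way to add one is sketched below with matplotlib `Patch` handles. The data is synthetic, and the display names in `names` are assumptions chosen for this example.

```python
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib.patches import Patch

rng = np.random.default_rng(0)

# Synthetic stand-in data: 10 subjects x 4 features (hypothetical names)
demo = pd.DataFrame(rng.normal(size=(10, 4)), columns=["f1", "f2", "f3", "f4"])
labels = pd.Series([0] * 5 + [1] * 5, name="LABEL")

# Explicit LUT, so colors do not depend on the order of unique()
lut = {0: "r", 1: "b"}
names = {0: "no codel", 1: "codel"}  # legend text (assumption)
row_colors = labels.map(lut)

g = sns.clustermap(
    demo,
    row_cluster=False, col_cluster=False,
    row_colors=row_colors,
    xticklabels=False, yticklabels=False,
)

# One colored patch per class, placed to the right of the heat map
handles = [Patch(facecolor=lut[k], label=names[k]) for k in lut]
legend = g.ax_heatmap.legend(
    handles=handles, title="LABEL",
    loc="upper left", bbox_to_anchor=(1.02, 1.0),
)
```

The position given by `bbox_to_anchor` may need adjusting depending on the figure size.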

Cluster map

This map combines the heat map with clustering.

g = sns.clustermap(std_ds, row_colors=row_colors, vmin=-1.5, vmax=1.5)
# g.fig.suptitle('Cluster map') # commented out for now because it overlaps the plot
g.ax_heatmap.set_xlabel('features')
g.ax_heatmap.set_ylabel('subjects')

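If the classes grouped together cleanly here, this would be an easy dataset, but the scatter of the class labels along the rows shows that this one is somewhat difficult.
The map gives a rough overview of whether clusters form well, how the features are likely to correlate, and how the groups might be split.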
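
How well the clusters match the class labels can also be put into numbers. The sketch below is one way to do it, not part of the original workflow: it repeats the row clustering with SciPy using average linkage and Euclidean distance (the defaults of `sns.clustermap`), cuts the tree into two clusters, and cross-tabulates them against the labels. The data is synthetic and deliberately well separated.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
n = 15  # subjects per class (assumption for this demo)

# Synthetic labels and features; class 1 is shifted so the classes separate
labels = pd.Series([0] * n + [1] * n, name="LABEL")
demo = rng.normal(size=(2 * n, 6))
demo[n:] += 5.0

# Hierarchical clustering of the rows (subjects)
Z = linkage(demo, method="average", metric="euclidean")

# Cut the dendrogram into two clusters
clusters = pd.Series(fcluster(Z, t=2, criterion="maxclust"), name="cluster")

# Rows: true class, columns: cluster. A clean split has one zero per column.
table = pd.crosstab(labels, clusters)
print(table)
```

With the real data, `demo` would be `std_ds.values` and `labels` would be `codel`. A table with large counts in every cell would confirm the impression that the classes do not separate well.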

References

seaborn.heatmap: https://seaborn.pydata.org/generated/seaborn.heatmap.html
seaborn.clustermap: https://seaborn.pydata.org/generated/seaborn.clustermap.html
Adding class labels: https://stackoverflow.com/questions/27988846/how-to-express-classes-on-the-axis-of-a-heatmap-in-seaborn
Understanding seaborn's clustermap properly: https://nykergoto.hatenablog.jp/entry/2018/11/19/seaborn_%E3%81%AE_clustermap_%E3%82%92%E3%81%A1%E3%82%83%E3%82%93%E3%81%A8%E7%90%86%E8%A7%A3%E3%81%99%E3%82%8B
RadiomicsJ: https://sites.google.com/vis-ionary.com/public/ij-plugin_radiomicsj
