
I tried installing fancyimpute

More than 1 year has passed since last update.

Introduction

fancyimpute is a library of matrix completion and imputation algorithms.

Environment setup

Create a virtual environment with Anaconda
$ conda create -n data_interpolation python=3.5 pandas

Packages to install before installing fancyimpute

setup.py
install_requires=[
    'six',
    'knnimpute',
    # need at least 1.10 for np.multi_dot
    'numpy>=1.10',
    'scipy',
    # used by NuclearNormMinimization
    'cvxpy>=1.0.6',
    'scikit-learn>=0.19.1',
    # used by MatrixFactorization
    'keras>=2.0.0',
    'np_utils',
    'tensorflow',
]

Install these one by one, paying attention to the version constraints.
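When checking an installed version against a constraint such as 'numpy>=1.10', note that comparing the version strings directly gives the wrong answer ("1.9" sorts after "1.10" as text). The following is a minimal sketch of a numeric comparison. The helper names parse_version and meets_minimum are made up for this illustration, and the parser only handles plain dotted numbers, not every version format pip accepts.

```python
def parse_version(text):
    """Turn '1.10.4' into (1, 10, 4). Parsing stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a = parse_version(installed)
    b = parse_version(minimum)
    # Pad the shorter tuple with zeros so that '1.10' equals '1.10.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    # Hypothetical installed versions, checked against the numpy>=1.10 constraint.
    for installed in ("1.9.3", "1.10", "1.15.4"):
        print(installed, meets_minimum(installed, "1.10"))
```

In practice the installed version string would come from something like numpy.__version__ after importing the package.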

cvxpy gave me trouble during installation, so here are the commands I used (on macOS).
$ conda install -c conda-forge lapack
$ conda install -c cvxgrp cvxpy
Note: for other platforms, see here

Installing fancyimpute

$ pip install fancyimpute
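To confirm that nothing was missed, one option is to check whether each dependency can be found by the interpreter in the active environment. The sketch below uses only the standard library. The helper name missing_modules is invented for this example, and the module list assumes the usual import names (for instance, scikit-learn is imported as sklearn).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names assumed for the packages listed in install_requires.
    required = [
        "six", "knnimpute", "numpy", "scipy", "cvxpy",
        "sklearn", "keras", "np_utils", "tensorflow", "fancyimpute",
    ]
    missing = missing_modules(required)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All dependencies were found.")
```

find_spec only locates a module without importing it, so the check stays fast even for heavy packages such as tensorflow.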

Trying out fancyimpute

Example function for KNN imputation

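fancyimpute_interpolation.py
import pandas as pd
from fancyimpute import KNN

def fancyimpute_interpolation(input_df):
    # fancyimpute returns a bare NumPy array, so save the column names first.
    input_df_cols = list(input_df)
    # fancyimpute 0.4+ uses the scikit-learn style fit_transform();
    # older releases used complete() instead.
    input_df = pd.DataFrame(KNN(k=4).fit_transform(input_df))
    input_df.columns = input_df_cols
    return input_df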
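To give a feel for what KNN imputation does, here is a simplified NumPy sketch of the idea. It is not fancyimpute's implementation: the function name knn_impute_sketch is made up, and it takes a plain unweighted average of the k nearest donor rows, whereas the library weights its neighbors. Distances between rows are computed as the mean squared difference over the features both rows have observed.

```python
import numpy as np


def knn_impute_sketch(X, k=2):
    """Fill NaNs with the average of the k nearest rows that observed the value."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    n_rows = X.shape[0]
    for i in range(n_rows):
        if not missing[i].any():
            continue
        # Distance from row i to every other row, using only shared observed features.
        dists = np.full(n_rows, np.inf)
        for j in range(n_rows):
            if j == i:
                continue
            shared = ~missing[i] & ~missing[j]
            if shared.any():
                dists[j] = np.mean((X[i, shared] - X[j, shared]) ** 2)
        for col in np.flatnonzero(missing[i]):
            # A donor must be comparable to row i and must have observed this column.
            candidates = [j for j in np.argsort(dists)
                          if np.isfinite(dists[j]) and not missing[j, col]]
            donors = candidates[:k]
            if donors:
                filled[i, col] = X[donors, col].mean()
    return filled


if __name__ == "__main__":
    data = [
        [1.0, 2.0, np.nan],
        [1.0, 2.0, 3.0],
        [1.1, 2.1, 5.0],
        [9.0, 9.0, 100.0],
    ]
    # Rows 1 and 2 are closest to row 0, so the gap becomes (3.0 + 5.0) / 2 = 4.0.
    print(knn_impute_sketch(data, k=2))
```

The distant fourth row does not influence the result, which is the main reason to prefer neighbor-based filling over a plain column mean when rows form clusters.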

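Imputation methods in fancyimpute

  • SimpleFill
    • Replaces missing entries with the mean or median of each column
  • KNN
    • Nearest-neighbor imputation, using the rows most similar on the features both rows have observed
  • SoftImpute
    • Matrix completion by iterative soft thresholding of SVD decompositions
  • IterativeSVD
    • Matrix completion by iterative low-rank SVD decomposition
  • IterativeImputer
    • Imputes missing values by modeling each feature as a function of the other features, in round-robin fashion
  • MatrixFactorization
    • Factorizes the incomplete matrix into low-rank U and V by gradient descent (L1 penalty on U, L2 penalty on V)
  • NuclearNormMinimization
    • An implementation, using cvxpy, of the convex-optimization approach to matrix completion by Emmanuel Candes and Benjamin Recht
    • Note: slow for large matrices
  • BiScaler
    • Iterative estimation of row/column means and standard deviations to obtain a doubly normalized matrix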
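As a concrete reference point for the simplest entry in the list, the column-wise fill that SimpleFill is described as performing can be sketched in a few lines of NumPy. This is an illustration of the technique, not the library's code: the function name simple_fill_sketch and its strategy parameter are assumptions made for this example.

```python
import numpy as np


def simple_fill_sketch(X, strategy="mean"):
    """Replace each NaN with the mean or median of the observed values in its column."""
    X = np.asarray(X, dtype=float)
    if strategy == "mean":
        col_values = np.nanmean(X, axis=0)
    elif strategy == "median":
        col_values = np.nanmedian(X, axis=0)
    else:
        raise ValueError("strategy must be 'mean' or 'median'")
    # col_values has one entry per column and broadcasts across the rows.
    return np.where(np.isnan(X), col_values, X)


if __name__ == "__main__":
    data = [
        [1.0, np.nan],
        [3.0, 4.0],
        [np.nan, 10.0],
        [11.0, 4.0],
    ]
    print(simple_fill_sketch(data, strategy="mean"))
    print(simple_fill_sketch(data, strategy="median"))
```

A fill like this is a common baseline to compare the more elaborate methods against, and several iterative methods start from such an initial guess.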