
[Personal notes] About Python's scikit-learn

Last updated at 2020-08-24. Posted at 2018-05-20.

Entries will be added and removed as needed.

For beginners

Qiita
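How to use scikit-learn (sklearn)
https://qiita.com/kenta1984/items/c2f3b2609071717dcf71
   Quotes the official scikit-learn documentation while explaining enough to get a quick analysis running
   Covers the minmax_scale function for scaling and splitting data with the train_test_split function
   Also useful when you want accuracy, precision, and recall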
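
As a reminder of how these pieces fit together, here is a minimal sketch. The dataset (load_breast_cancer), the LogisticRegression model, and the 70/30 split are my own choices for illustration and are not taken from the linked article. The scaler is fitted on the training split only, so the test data does not leak into the scaling.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler, minmax_scale

# Example data bundled with scikit-learn (an assumption for this sketch).
X, y = load_breast_cancer(return_X_y=True)

# minmax_scale rescales every column to the range [0, 1] in a single call.
X_demo = minmax_scale(X)

# Split first, then fit the scaler on the training part only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
scaler = MinMaxScaler().fit(X_train)

clf = LogisticRegression(max_iter=1000)
clf.fit(scaler.transform(X_train), y_train)
y_pred = clf.predict(scaler.transform(X_test))

accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```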

Linear regression with scikit-learn (simple and multiple regression)
https://pythondatascience.plavox.info/scikit-learn/線形回帰
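
A small sketch of simple versus multiple regression with LinearRegression. The synthetic data and its true coefficients (3, -2, intercept 5) are made up for illustration and are not from the linked page.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption): y = 3*x0 - 2*x1 + 5 + small noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(scale=0.1, size=200)

# Simple regression: one explanatory variable.
simple = LinearRegression().fit(X[:, [0]], y)

# Multiple regression: both explanatory variables.
multi = LinearRegression().fit(X, y)

print("simple   R^2:", simple.score(X[:, [0]], y))
print("multiple R^2:", multi.score(X, y))
print("coefficients:", multi.coef_, "intercept:", multi.intercept_)
```

The multiple regression should recover coefficients close to the ones used to generate the data, while the simple regression leaves the effect of the second variable unexplained.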

Qiita
A summary of the definitions of the ROC curve and AUC and how they relate
https://qiita.com/koyamauchi/items/a2ed9f638b51f3b22cd6
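
To make the relationship concrete: the ROC curve is the set of (false positive rate, true positive rate) points obtained by sweeping the decision threshold, and AUC is the area under that curve. The four labels and scores below are a toy input chosen for this sketch, not data from the linked article.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Toy labels and predicted scores (an assumption for illustration).
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# One (fpr, tpr) point per threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# AUC is the area under the piecewise-linear ROC curve.
area = auc(fpr, tpr)
print(area)  # 0.75 for this toy input

# roc_auc_score computes the same quantity directly from labels and scores.
direct = roc_auc_score(y_true, y_score)
```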

The book 前処理大全 is apparently a good one
https://www.amazon.co.jp/dp/B07C3JFK3V/ref=dp-kindle-redirect?_encoding=UTF8&btkr=1

Imbalanced data

データサイエンティスト(仮) (blog)
Data analysis in Python: sampling imbalanced data with imbalanced-learn
http://tekenuko.hatenablog.com/entry/2017/12/11/214522
 This one is great! It shows how to create dummy data with the sklearn.datasets.make_classification function.
   It also uses SMOTE from the imbalanced-learn package
　　　　　　imbalanced-learn API — imbalanced-learn 0.3.0 documentation

Qiita
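[Python] A standard approach to imbalanced-data classification: I implemented under sampling + bagging
https://qiita.com/nekoumei/items/6448a86a8d255619c4f4
 This one is great too. It implements under-sampling + bagging
  imblearn's BalancedBaggingClassifier is apparently the appropriate tool. Its generalization performance is high
    TJO has also done this in R
        https://tjo.hatenablog.com/entry/2017/08/11/162057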
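
For my own understanding, a hand-rolled sketch of under-sampling + bagging using only scikit-learn and NumPy. This is not the code from the linked articles: the dummy data, the decision tree base learner, and the choice of 10 bags are assumptions. imblearn's BalancedBaggingClassifier packages the same idea.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Imbalanced dummy data, roughly 95:5 (parameters are assumptions).
X, y = make_classification(
    n_samples=3000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

rng = np.random.default_rng(0)
minority_idx = np.flatnonzero(y_train == 1)
majority_idx = np.flatnonzero(y_train == 0)

models = []
for _ in range(10):
    # Under-sample: draw as many majority rows as there are minority rows.
    sampled_majority = rng.choice(
        majority_idx, size=len(minority_idx), replace=False
    )
    bag_idx = np.concatenate([minority_idx, sampled_majority])
    tree = DecisionTreeClassifier(random_state=0)
    models.append(tree.fit(X_train[bag_idx], y_train[bag_idx]))

# Bagging: average the predicted minority-class probability over all bags.
proba = np.mean([m.predict_proba(X_test)[:, 1] for m in models], axis=0)
y_pred = (proba >= 0.5).astype(int)
minority_recall = recall_score(y_test, y_pred)
print("minority recall:", minority_recall)
```

Each bag sees a different balanced subset, so the ensemble uses most of the majority data overall even though every single model is trained on a small balanced sample.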

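Sampling for imbalanced data
https://qiita.com/shima_x/items/370587304ef17e7a61b8
    The algorithm-level approach reportedly gave more robust results than the data-level approach
    My guess is that the algorithm-level approach is an SVM with adjusted class weights...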
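
To illustrate what that guess would look like, here is a sketch of an SVM with class weights in scikit-learn. Whether the linked article actually uses this is unconfirmed; the dummy data and the class_weight="balanced" setting are assumptions for this note.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Imbalanced dummy data, roughly 95:5 (parameters are assumptions).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Unweighted SVM: every misclassification costs the same.
plain = SVC().fit(X_train, y_train)

# Weighted SVM: errors on the rare class are penalized more heavily,
# with weights inversely proportional to class frequency.
weighted = SVC(class_weight="balanced").fit(X_train, y_train)

recall_plain = recall_score(y_test, plain.predict(X_test))
recall_weighted = recall_score(y_test, weighted.predict(X_test))
print("recall (plain):   ", recall_plain)
print("recall (weighted):", recall_weighted)
print("per-class C multipliers:", weighted.class_weight_)
```

No rows are added or dropped here; the imbalance is handled inside the loss function, which is what separates an algorithm-level approach from a data-level one.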
