[Survey] Kaggle - Quora 7th Place Solution Summary

Posted at 2017-09-13

This article surveys the 7th-place solution² for Kaggle - Quora Question Pairs¹.

Title: [7th place] 7-th solution overview
Author: aphex34
Discussion URL: https://www.kaggle.com/c/quora-question-pairs/discussion/34697

Model structure

The predictions (sigmoid outputs) of methods 1 to 5 below are used as features for an XGBoost model
Out-of-fold predictions are produced with 10-fold CV
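
As a rough illustration of the out-of-fold step, the sketch below splits the training rows into folds, fits a model on the remaining folds, and predicts only the held-out rows, so every row receives a prediction from a model that never saw it. The helper names (`out_of_fold_predictions`, `fit_mean`, `predict_mean`) and the toy data are assumptions made for illustration. The mean predictor merely stands in for methods 1 to 5, and the second-stage XGBoost model is not shown.

```python
import random
from typing import Callable, List, Sequence


def out_of_fold_predictions(
    X: Sequence,
    y: Sequence[float],
    fit: Callable,
    predict: Callable,
    n_folds: int = 10,
    seed: int = 0,
) -> List[float]:
    """Return one prediction per row, each made by a model that did not train on that row."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into n_folds disjoint folds.
    folds = [indices[k::n_folds] for k in range(n_folds)]
    oof = [0.0] * len(X)
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in held_out])
        for i, p in zip(held_out, preds):
            oof[i] = p
    return oof


# Hypothetical stand-in for a base model: it predicts the training duplicate rate.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, X_rows):
    return [model for _ in X_rows]


X = [[float(i)] for i in range(20)]      # toy features
y = [float(i % 2) for i in range(20)]    # toy duplicate labels
oof_method1 = out_of_fold_predictions(X, y, fit_mean, predict_mean, n_folds=10)

# One column per base method; these rows would feed the second-stage model.
stacked_features = [[p] for p in oof_method1]
```

With five base methods, each would contribute one such column, and the second-stage model trains on the resulting five-column matrix.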

Method 1.

Recurrent Highway Network³
Bidirectional LSTM
Siamese-Net⁴
GloVe 840B word embeddings
89.1% accuracy

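Method 2.

Basically the same as method 1, but applied at the character level
Even without word-level input, performance was only about 1% worse than method 1

Method 3.

Basically the same as method 1, but uses 3-grams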
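
To make the input granularities of methods 1 to 3 concrete, here is a small sketch that turns a question into words, characters, and 3-grams. The source does not say whether the 3-grams are built from characters or from words, so the generic `ngrams` helper (a hypothetical name) is shown on both; the sample question is also made up.

```python
from typing import List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int = 3) -> List[Tuple[str, ...]]:
    """Slide a window of length n over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


question = "how do i learn python"
words = question.split()   # word-level input (as in method 1)
chars = list(question)     # character-level input (as in method 2)

char_trigrams = ["".join(g) for g in ngrams(chars, 3)]
word_trigrams = ngrams(words, 3)

print(chars[:5])          # ['h', 'o', 'w', ' ', 'd']
print(char_trigrams[:3])  # ['how', 'ow ', 'w d']
print(word_trigrams[0])   # ('how', 'do', 'i')
```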

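Method 4.

Uses Decomposable Attention⁵.
Converges much faster (method 1 takes about 2 hours to converge, this one about 20 minutes)
Experimented with different kernel sizes (the version from the paper³, a [2,3] version, and a [2,3,5] version)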
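
For orientation, the core "attend" step of Decomposable Attention⁵ soft-aligns each token of one question with the tokens of the other. The sketch below is a simplification under stated assumptions: raw dot products replace the feed-forward transform used in the paper, the embeddings are toy values, and the kernel-size variants mentioned above are not covered. The function names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u: Vector, v: Vector) -> float:
    return sum(x * y for x, y in zip(u, v))


def attend(a: List[Vector], b: List[Vector]) -> List[Vector]:
    """For each token vector a_i, return the attention-weighted average of the rows of b."""
    dim = len(b[0])
    aligned = []
    for a_i in a:
        # Assumption: plain dot-product scores instead of a learned transform.
        weights = softmax([dot(a_i, b_j) for b_j in b])
        aligned.append(
            [sum(w * b_j[d] for w, b_j in zip(weights, b)) for d in range(dim)]
        )
    return aligned


a = [[1.0, 0.0], [0.0, 1.0]]               # toy embeddings, question 1 (2 tokens)
b = [[5.0, 0.0], [0.0, 5.0], [0.0, 5.0]]   # toy embeddings, question 2 (3 tokens)

beta = attend(a, b)    # part of b aligned to each token of a
alpha = attend(b, a)   # part of a aligned to each token of b
```

Each token is processed independently of its neighbours here; no recurrence is involved in this step.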

Method 5.

Same as methods 1 to 4, but without the attention mechanism.

Natural language processing (unsupervised NLP features)

Kernels by Abhishek⁶, mephistopheies⁷, and the1owl⁸
WordNet similarity⁹
Word Mover's Distance¹⁰ computed with GloVe 840B
NER-based and POS-based features (tags obtained with Stanford CoreNLP)
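
Full Word Mover's Distance requires solving an optimal transport problem. As a lighter illustration, the sketch below computes the relaxed lower bound described in the same paper¹⁰, where every word sends all of its weight to its nearest word in the other document. This is a swapped-in simplification, not necessarily what the solution used; the 2-d embeddings and function names are hypothetical stand-ins for GloVe 840B vectors.

```python
import math
from collections import Counter
from typing import Dict, List

Vector = List[float]


def euclidean(u: Vector, v: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def one_sided_cost(src: List[str], dst: List[str], emb: Dict[str, Vector]) -> float:
    """Each source word moves its normalized count to the nearest destination word."""
    counts = Counter(src)
    total = sum(counts.values())
    targets = set(dst)
    return sum(
        (count / total) * min(euclidean(emb[word], emb[t]) for t in targets)
        for word, count in counts.items()
    )


def relaxed_wmd(doc1: List[str], doc2: List[str], emb: Dict[str, Vector]) -> float:
    """Relaxed WMD: the larger of the two one-sided costs (a lower bound on WMD)."""
    return max(one_sided_cost(doc1, doc2, emb), one_sided_cost(doc2, doc1, emb))


# Hypothetical 2-d embeddings, for illustration only.
toy_emb = {
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "car": [0.0, 1.0],
    "buy": [0.5, 0.5],
}

d_close = relaxed_wmd(["buy", "cat"], ["buy", "kitten"], toy_emb)
d_far = relaxed_wmd(["buy", "cat"], ["buy", "car"], toy_emb)
```

In this toy setup the "cat"/"kitten" pair ends up closer than the "cat"/"car" pair, which is the behaviour the feature is meant to capture.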

Graph structure

Same as the top-ranked teams

Optimization

Used contrastive loss instead of binary crossentropy.
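
A minimal sketch of a contrastive loss on pair distances, assuming the common formulation with label 1 for duplicate pairs and a margin of 1.0; the source does not state the exact variant or margin, so both are assumptions, as is the function name.

```python
from typing import Sequence


def contrastive_loss(
    distances: Sequence[float],
    labels: Sequence[int],
    margin: float = 1.0,
) -> float:
    """Mean contrastive loss; label 1 = duplicate pair, label 0 = non-duplicate pair."""
    total = 0.0
    for d, y in zip(distances, labels):
        pull = y * d ** 2                             # duplicates: penalize any distance
        push = (1 - y) * max(margin - d, 0.0) ** 2    # non-duplicates: penalize only inside the margin
        total += pull + push
    return total / len(distances)


# Toy distances between the two encoded questions of each pair.
loss_good = contrastive_loss([0.1, 1.5], [1, 0])   # duplicate is close, non-duplicate is far
loss_bad = contrastive_loss([1.5, 0.1], [1, 0])    # the opposite arrangement
```

Unlike binary crossentropy on a sigmoid output, this loss acts directly on the distance between the two encodings of a Siamese pair.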

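Other tricks

Pseudo Labeling
- Predict the test set with the best model
- Build a sparse square matrix $A$ of size (train+test) x (train+test)
- For every pair whose duplicate probability is at or above a threshold, set the corresponding element of $A$ to 1
- Compute the cosine similarity for each pair in train and test
Clipped predictions at 1e-5 and 1-1e-5, respectively
The version without stop words performed worse, so it was excluded.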
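
The pseudo-labeling steps listed above can be sketched roughly as follows. Several details are assumptions, since the source does not spell them out: rows of $A$ are indexed by question, the matrix is filled symmetrically, training pairs enter with their label as the probability, and the 0.95 threshold is an arbitrary example. The sparse matrix is stored as a dictionary of neighbour sets, and all names are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set, Tuple

ScoredPair = Tuple[Tuple[Hashable, Hashable], float]


def build_adjacency(
    scored_pairs: Iterable[ScoredPair], threshold: float
) -> Dict[Hashable, Set[Hashable]]:
    """Sparse 0/1 matrix A stored as {question: set of questions linked to it}."""
    rows: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    for (q1, q2), prob in scored_pairs:
        if prob >= threshold:
            rows[q1].add(q2)   # A[q1, q2] = 1
            rows[q2].add(q1)   # assumption: A is symmetric
    return rows


def row_cosine(rows: Dict[Hashable, Set[Hashable]], q1: Hashable, q2: Hashable) -> float:
    """Cosine similarity between rows q1 and q2 of the binary matrix A."""
    n1, n2 = rows.get(q1, set()), rows.get(q2, set())
    if not n1 or not n2:
        return 0.0
    # For 0/1 rows the dot product is the number of shared neighbours.
    return len(n1 & n2) / math.sqrt(len(n1) * len(n2))


# Toy input: train pairs carry their label, test pairs carry model predictions.
scored_pairs = [
    (("a", "b"), 1.0),
    (("b", "c"), 0.97),
    (("a", "c"), 0.10),
    (("c", "d"), 0.99),
]
rows = build_adjacency(scored_pairs, threshold=0.95)
feature_ac = row_cosine(rows, "a", "c")   # "a" and "c" share the neighbour "b"
```

The resulting similarity can then be attached to each train and test pair as an additional feature.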

References

1. Kaggle, Quora Question Pairs, 2017.
2. aphex34, 7-th solution overview, 2017.
3. Zilly et al., Recurrent Highway Networks, 2017.
4. Dandekar, Semantic Question Matching with Deep Learning, 2017.
5. Parikh et al., A Decomposable Attention Model for Natural Language Inference, 2016.
6. Abhishek, Abhishek's features.
7. mephistopheies, 0.29936 solution.
8. the1owl, Matching ¿Que? for Quora - End to End 0.33719 PB.
9. sujitpal, nltk-examples.
10. Kusner et al., From Word Embeddings To Document Distances, 2015.
