A Roundup of Repositories Related to Adversarial Attacks

Last updated at 2022-06-07. Posted at 2022-05-29.

Partly as a memo for myself, I have put together a list of repositories related to adversarial attacks, roughly grouped by purpose.

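Trusted-AI/adversarial-robustness-toolbox
- Of the repositories I looked at, this one seems to be the largest and to have the most thorough documentation.
- It can evaluate models against evasion, poisoning, extraction, and inference attacks.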
Harry24k/adversarial-attacks-pytorch
- Seems to specialize in white-box attacks.
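To make "white-box attack" concrete, here is a minimal sketch of a single FGSM (Fast Gradient Sign Method) step against a toy logistic-regression model, written in plain Python. It does not use the API of any repository listed here. The weights, input, and `eps` value are hypothetical and chosen only for illustration. The point is that the attacker knows the model parameters and can therefore compute the loss gradient with respect to the input.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    """One FGSM step: shift each feature by eps in the sign of the loss gradient.

    For logistic regression with cross-entropy loss, dL/dx = (p - y) * w.
    A white-box attacker can compute this because w and b are known.
    """
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + eps * s for xi, s in zip(x, sign)]

# Hypothetical toy model and input (illustrative values only).
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.5)
p_clean = predict_proba(w, b, x)      # above 0.5: classified as class 1
p_adv = predict_proba(w, b, x_adv)    # below 0.5: the prediction flips

print(f"clean p(y=1) = {p_clean:.3f}, adversarial p(y=1) = {p_adv:.3f}")
```

In practice the libraries in this list apply the same idea to deep networks, obtaining the gradient by automatic differentiation instead of a closed-form expression.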
bethgelab/foolbox
- Supports many frameworks, including PyTorch, TensorFlow, and JAX.
google-research/robustness_metrics
- Evaluates models from the following three perspectives:
  1. out-of-distribution generalization (e.g. a non-expert human would be able to classify similar objects, but possibly changed viewpoint, scene setting or clutter).
  2. stability (of the prediction and predicted probabilities) under natural perturbation of the input.
  3. uncertainty (e.g. assessing to which extent the probabilities predicted by a model reflect the true probabilities)
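As one illustration of the third perspective (uncertainty), the sketch below computes the expected calibration error (ECE), a common way to check whether predicted confidences match observed accuracy. It is written from scratch in plain Python and is not the API of robustness_metrics. The function name, the bin count, and the sample predictions are all assumptions made for this example.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average gap between mean confidence and accuracy per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece

# Hypothetical predictions: top-class confidence and whether it was correct.
confidences = [0.95, 0.9, 0.9, 0.55, 0.5]
correct = [True, True, False, True, False]

ece = expected_calibration_error(confidences, correct)
print(f"ECE = {ece:.3f}")
```

A value near 0 means the confidences track accuracy closely. Larger values indicate over- or under-confidence.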
BorealisAI/advertorch
- Seems to specialize in white-box attacks. L-BFGS is implemented.
robustness-gym/robustness-gym
ppwwyyxx/Adversarial-Face-Attack
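- Attacks on face recognition systems.
thunlp/OpenAttack
- Implements multiple attack methods against language models.
QData/TextAttack
- Attacks on language models, data augmentation methods, and more.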

Robustbench/robustbench
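- A large-scale benchmark in which models compete on robustness.
MadryLab/robustness
- A benchmark in which attacks compete on success rate against adversarially trained models (pre-trained models are available).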
hendrycks/robustness
- Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
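The idea behind a common-corruption benchmark can be sketched as follows: apply a corruption at increasing severity and record how accuracy changes. The example uses only the Python standard library. The toy classifier, the dataset, and the noise levels are hypothetical, and none of this reproduces the actual benchmark code or its corruption definitions.

```python
import random

def gaussian_noise(x, sigma, rng):
    """Corrupt each feature with zero-mean Gaussian noise of std sigma."""
    return [xi + rng.gauss(0.0, sigma) for xi in x]

def accuracy(classify, inputs, labels):
    hits = sum(1 for x, y in zip(inputs, labels) if classify(x) == y)
    return hits / len(inputs)

def classify(x):
    # Hypothetical toy classifier: thresholds the mean feature value.
    return 1 if sum(x) / len(x) > 0.5 else 0

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical dataset: bright inputs are class 1, dark inputs are class 0.
inputs = [[0.8] * 4 for _ in range(50)] + [[0.2] * 4 for _ in range(50)]
labels = [1] * 50 + [0] * 50

results = {}
for sigma in [0.0, 0.3, 1.0]:  # increasing corruption severity
    corrupted = [gaussian_noise(x, sigma, rng) for x in inputs]
    results[sigma] = accuracy(classify, corrupted, labels)

for sigma, acc in results.items():
    print(f"sigma = {sigma}: accuracy = {acc:.2f}")
```

Unlike an adversarial attack, the perturbation here does not depend on the model, so this measures robustness to natural noise, not to a worst-case adversary.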