
Selective Classification for Deep Neural Networks [1 Introduction] [Paper, DeepL Translation]

Paper reading

Posted at 2020-11-09 (last updated at 2020-11-09)

This article is more or less a memo for my own use.
It is almost entirely a DeepL translation.
I would appreciate it if you pointed out any mistakes.

Source
Selective Classification for Deep Neural Networks
Author: Yonatan Geifman, Ran El-Yaniv

Previous: [Abstract]
Next: [2 Problem Setting]

1 Introduction

Translation

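Self-awareness remains an elusive, hard-to-define concept. Something much easier to grasp, however, is a rudimentary kind of self-awareness: the ability to know what you do not know, which can make you smarter. The subfield of machine learning that deals with this ability is called selective prediction (also known as prediction with a reject option), and it has been around for 60 years [1, 5]. The main motivation for selective prediction is to reduce the error rate by abstaining from prediction when in doubt, while keeping coverage as high as possible. The ultimate form of selective prediction is a classifier equipped with a "dial" that allows precise control of the desired true error rate (which should be guaranteed with high probability) while keeping the classifier's coverage as high as possible.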
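Many present and future tasks performed by (deep) predictive models can be dramatically improved by high-quality selective prediction. Consider autonomous driving, for example. Since we cannot count on the arrival of a "singularity" in which AI is superhuman, we have to make do with standard machine learning, which sometimes errs. But what if our deep autonomous driving network knew that it does not know how to respond in a certain situation, and could disengage itself in advance and alert the human driver (hopefully not asleep at the time) to take over? There are plenty of other mission-critical applications that would likewise benefit greatly from effective selective prediction.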
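The literature on the reject option is quite extensive, and mainly discusses rejection mechanisms for various hypothesis classes and learning algorithms, such as SVM, boosting, and nearest neighbors [8, 13, 3]. The reject option has rarely been discussed in the context of neural networks (NNs), and so far it has not been considered for deep NNs (DNNs). Existing NN work considers a cost-based rejection model [2, 4], in which the costs of misclassification and of abstaining must be specified, and the rejection mechanism is optimized for those costs. The mechanism proposed there is based on applying a carefully chosen threshold to the maximal neuronal response of the softmax layer. We call this mechanism softmax response (SR). The cost model can be very useful when the costs involved can be quantified, but in many applications of interest it is hard to reason about meaningful costs. (Imagine trying to set appropriate rejection and misclassification costs for disengaging an autopilot driving system.) Here we consider the alternative risk-coverage view of selective classification discussed in [5].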
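Reading note: the SR mechanism described in the paragraph above (thresholding the largest softmax output) can be sketched roughly as follows. This is a minimal illustration and not code from the paper; the function names `softmax` and `softmax_response` and the parameter name `theta` are made up here, and how `theta` should be chosen is left open at this point.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_response(logits, theta):
    """Predict the arg-max class if its softmax probability is >= theta.

    Returns (label, confidence); label is None when the classifier abstains.
    `theta` is a hypothetical threshold parameter, assumed to lie in [0, 1].
    """
    probs = softmax(logits)
    confidence = max(probs)
    label = probs.index(confidence)
    if confidence >= theta:
        return label, confidence
    return None, confidence
```

A clearly peaked input such as `softmax_response([4.0, 0.0, 0.0], theta=0.9)` is classified, while a nearly flat one such as `softmax_response([1.0, 0.9, 0.8], theta=0.9)` is rejected. Raising `theta` lowers coverage and, one hopes, the error rate on the samples that are still classified.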
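Ensemble techniques have been considered for selective (and confidence-rated) prediction, where the rejection mechanism is typically based on ensemble statistics [18, 7]. However, such techniques are currently hard to realize in the context of DNNs, for which training sufficiently many ensemble members can be very costly. Recently, Gal and Ghahramani [9] proposed an ensemble-like method for measuring uncertainty in DNNs that does not require training multiple ensemble members. Their method works by sampling multiple dropout applications of the forward pass to randomly perturb the network's prediction. Although this Monte-Carlo dropout (MC-dropout) technique was not mentioned in the context of selective prediction, it can be applied directly as a viable selective prediction method by using a threshold, as we discuss here.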
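Reading note: the MC-dropout idea in the paragraph above (several forward passes with random dropout, then a threshold on how much the predictions fluctuate) can be sketched on a toy model. Everything below is an assumption made for illustration: the one-layer linear "network", dropout applied to the inputs, and the use of the negative variance of the predicted class probability as the confidence score. The Introduction itself only says that multiple dropout passes are sampled and a threshold is applied.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dropout_forward(x, weights, p_drop, rng):
    """One stochastic forward pass of a toy linear classifier (0 <= p_drop < 1)."""
    keep = 1.0 - p_drop
    # Inverted dropout: zero each input with probability p_drop, rescale the rest.
    masked = [xi / keep if rng.random() < keep else 0.0 for xi in x]
    logits = [sum(w * xi for w, xi in zip(row, masked)) for row in weights]
    return softmax(logits)


def mc_dropout_confidence(x, weights, p_drop=0.5, passes=100, seed=0):
    """Return (label, confidence) from `passes` stochastic forward passes.

    The label is the arg-max of the mean prediction. The confidence is the
    negative variance of that class's probability (an assumed score), so
    values closer to zero mean the passes agree more.
    """
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, p_drop, rng) for _ in range(passes)]
    n_classes = len(weights)
    mean = [sum(s[k] for s in samples) / passes for k in range(n_classes)]
    label = mean.index(max(mean))
    var = sum((s[label] - mean[label]) ** 2 for s in samples) / passes
    return label, -var
```

A selective classifier would then predict `label` only when the returned confidence is at least some threshold, exactly as with SR. In a real DNN the dropout layers of the trained network would be kept active at test time instead of this hand-written mask.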
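In this paper we consider classification tasks, and our goal is to learn a selective classifier $(f, g)$, where $f$ is a standard classifier and $g$ is a rejection function. The selective classifier has to allow fully guaranteed control over the true risk. Ideally, the method should be able to classify samples in production at any desired level of risk with the optimal coverage rate. It is reasonable to assume that this optimal performance can only be obtained if the pair $(f, g)$ is trained together. As a first step, however, we consider a simpler setting in which a (deep) neural classifier $f$ is already given, and our goal is to learn a rejection function $g$ that guarantees a desired error rate with high probability. To this end, we consider the two known rejection techniques mentioned above (SR and MC-dropout) and devise a learning method that chooses an appropriate threshold ensuring the desired risk. For a given classifier $f$, confidence level $ \delta $, and desired risk $ r^* $, our method outputs a selective classifier $(f, g)$ whose test error will be no larger than $ r^* $ with probability at least $ 1 - \delta $.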
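Reading note: to get a feel for what "choose a threshold so that the risk is at most $ r^* $ with probability at least $ 1 - \delta $" means, here is a rough sketch. It is not the algorithm from the paper. As a stand-in it uses a plain Hoeffding bound with a union bound over a fixed grid of candidate thresholds, which is simpler and looser than what one would use in practice. The function name `choose_threshold`, its parameters, and the grid are all made up for this illustration, and the inputs are assumed to come from a labeled held-out set.

```python
import math


def choose_threshold(confidences, correct, target_risk, delta, grid):
    """Pick the lowest threshold in `grid` whose risk bound is <= target_risk.

    confidences: confidence score per held-out sample (e.g. SR values).
    correct:     1 if f classified that sample correctly, else 0.
    grid:        candidate thresholds, fixed before looking at the data.
    Returns (theta, coverage, risk_bound), or None if no threshold qualifies.
    """
    n = len(confidences)
    for theta in sorted(grid):
        covered = [c for conf, c in zip(confidences, correct) if conf >= theta]
        m = len(covered)
        if m == 0:
            continue
        empirical_risk = 1.0 - sum(covered) / m
        # Hoeffding deviation term, with delta split over all grid thresholds.
        slack = math.sqrt(math.log(len(grid) / delta) / (2 * m))
        bound = empirical_risk + slack
        if bound <= target_risk:
            # Lowest qualifying threshold, so coverage is as high as the grid allows.
            return theta, m / n, bound
    return None
```

Because the thresholds are scanned from low to high, the first one that passes is also the one with the highest coverage among the qualifying grid points. A tighter bound (for example one based on the exact binomial tail) would allow lower thresholds for the same $ r^* $ and $ \delta $.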
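Using the well-known VGG-16 architecture, we applied our method to CIFAR-10, CIFAR-100, and ImageNet (on ImageNet we also applied the RESNET-50 architecture). We show that both SR and dropout lead to extremely effective selective classification. On both CIFAR datasets, the two mechanisms achieve nearly identical results. On ImageNet, however, the simpler SR mechanism is significantly superior. More importantly, we show that almost any desired risk level can be guaranteed with surprisingly high coverage. For example, an unprecedented 2% error in top-5 ImageNet classification can be guaranteed with probability 99.9% at almost 60% test coverage.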

Original text

While self-awareness remains an elusive, hard to define concept, a rudimentary kind of self-awareness, which is much easier to grasp, is the ability to know what you don’t know, which can make you smarter. The subfield dealing with such capabilities in machine learning is called selective prediction (also known as prediction with a reject option), which has been around for 60 years [1, 5]. The main motivation for selective prediction is to reduce the error rate by abstaining from prediction when in doubt, while keeping coverage as high as possible. An ultimate manifestation of selective prediction is a classifier equipped with a “dial” that allows for precise control of the desired true error rate (which should be guaranteed with high probability), while keeping the coverage of the classifier as high as possible.
Many present and future tasks performed by (deep) predictive models can be dramatically enhanced by high quality selective prediction. Consider, for example, autonomous driving. Since we cannot rely on the advent of “singularity”, where AI is superhuman, we must manage with standard machine learning, which sometimes errs. But what if our deep autonomous driving network were capable of knowing that it doesn’t know how to respond in a certain situation, disengaging itself in advance and alerting the human driver (hopefully not sleeping at that time) to take over? There are plenty of other mission-critical applications that would likewise greatly benefit from effective selective prediction.
The literature on the reject option is quite extensive and mainly discusses rejection mechanisms for various hypothesis classes and learning algorithms, such as SVM, boosting, and nearest neighbors [8, 13, 3]. The reject option has rarely been discussed in the context of neural networks (NNs), and so far has not been considered for deep NNs (DNNs). Existing NN works consider a cost-based rejection model [2, 4], whereby the costs of misclassification and abstaining must be specified, and a rejection mechanism is optimized for these costs. The proposed mechanism for classification is based on applying a carefully selected threshold on the maximal neuronal response of the softmax layer. We call this mechanism softmax response (SR). The cost model can be very useful when we can quantify the involved costs, but in many applications of interest meaningful costs are hard to reason about. (Imagine trying to set up appropriate rejection/misclassification costs for disengaging an autopilot driving system.) Here we consider the alternative risk-coverage view for selective classification discussed in [5].
Ensemble techniques have been considered for selective (and confidence-rated) prediction, where rejection mechanisms are typically based on the ensemble statistics [18, 7]. However, such techniques are presently hard to realize in the context of DNNs, for which it could be very costly to train sufficiently many ensemble members. Recently, Gal and Ghahramani [9] proposed an ensemble-like method for measuring uncertainty in DNNs, which bypasses the need to train several ensemble members. Their method works via sampling multiple dropout applications of the forward pass to perturb the network prediction randomly. While this Monte-Carlo dropout (MC-dropout) technique was not mentioned in the context of selective prediction, it can be directly applied as a viable selective prediction method using a threshold, as we discuss here.
In this paper we consider classification tasks, and our goal is to learn a selective classifier $(f, g)$, where $f$ is a standard classifier and $g$ is a rejection function. The selective classifier has to allow full guaranteed control over the true risk. The ideal method should be able to classify samples in production with any desired level of risk with the optimal coverage rate. It is reasonable to assume that this optimal performance can only be obtained if the pair $(f, g)$ is trained together. As a first step, however, we consider a simpler setting where a (deep) neural classifier $f$ is already given, and our goal is to learn a rejection function $g$ that will guarantee with high probability a desired error rate. To this end, we consider the above two known techniques for rejection (SR and MC-dropout), and devise a learning method that chooses an appropriate threshold that ensures the desired risk. For a given classifier $f$, confidence level $ \delta $, and desired risk $ r^* $ , our method outputs a selective classifier $(f, g)$ whose test error will be no larger than $ r^* $ with probability of at least $1-\delta$.
Using the well-known VGG-16 architecture, we apply our method on CIFAR-10, CIFAR-100 and ImageNet (on ImageNet we also apply the RESNET-50 architecture). We show that both SR and dropout lead to extremely effective selective classification. On both the CIFAR datasets, these two mechanisms achieve nearly identical results. However, on ImageNet, the simpler SR mechanism is significantly superior. More importantly, we show that almost any desirable risk level can be guaranteed with a surprisingly high coverage. For example, an unprecedented 2% error in top-5 ImageNet classification can be guaranteed with probability 99.9%, and almost 60% test coverage.
