
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks [ABSTRACT] [Paper, mostly Google Translate, personal notes]

Last updated at 2020-03-07, posted at 2020-03-06

This article is more or less a memo for myself.
Most of it comes straight from Google Translate.
I would appreciate it if you pointed out any mistakes.

Source
[The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks](https://arxiv.org/abs/1803.03635)

ABSTRACT

Translation

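Neural network pruning techniques can cut the parameter count of a trained network by more than 90%, reducing storage requirements and improving the computational performance of inference without loss of accuracy. In current practice, however, the sparse architectures that pruning produces are hard to train from scratch, even though doing so would improve training performance in the same way.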

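We find that a standard pruning technique naturally uncovers subnetworks whose initialization lets them train effectively. Based on these results, we state the lottery ticket hypothesis: a dense, randomly initialized feed-forward network contains subnetworks (winning tickets) that, when trained in isolation, reach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective.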

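We present an algorithm for identifying winning tickets, along with a series of experiments that support the lottery ticket hypothesis and the importance of these fortuitous initializations. For several fully-connected and convolutional feed-forward architectures on MNIST and CIFAR10, we consistently find winning tickets smaller than 10-20% of the original network's size. Above this size, the winning tickets we find learn faster than the original network and reach higher test accuracy.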

Original text

Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising accuracy. However, contemporary experience is that the sparse architectures produced by pruning are difficult to train from the start, which would similarly improve training performance.

We find that a standard pruning technique naturally uncovers subnetworks whose initializations made them capable of training effectively. Based on these results, we articulate the lottery ticket hypothesis: dense, randomly-initialized, feed-forward networks contain subnetworks (winning tickets) that—when trained in isolation—reach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their connections have initial weights that make training particularly effective.

We present an algorithm to identify winning tickets and a series of experiments that support the lottery ticket hypothesis and the importance of these fortuitous initializations. We consistently find winning tickets that are less than 10-20% of the size of several fully-connected and convolutional feed-forward architectures for MNIST and CIFAR10. Above this size, the winning tickets that we find learn faster than the original network and reach higher test accuracy.
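
As a note to myself: the abstract only mentions "a standard pruning technique" and "an algorithm to identify winning tickets" without giving details. The sketch below is one possible reading, assuming magnitude pruning (dropping the smallest-magnitude trained weights) followed by resetting the surviving weights to their original initial values. `toy_train`, `target`, and the 80% pruning rate are hypothetical stand-ins for illustration, not anything taken from the abstract.

```python
import random

def magnitude_mask(trained, prune_fraction):
    """Return a 0/1 mask that drops the smallest-magnitude trained weights."""
    n_prune = round(len(trained) * prune_fraction)
    # Indices sorted by ascending |weight|; the first n_prune get pruned.
    order = sorted(range(len(trained)), key=lambda i: abs(trained[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(trained))]

def toy_train(weights, mask, target, steps=200, lr=0.1):
    """Hypothetical stand-in for real training: masked gradient descent
    on the loss sum((w - target)^2). Masked-out weights stay at zero."""
    w = [wi * mi for wi, mi in zip(weights, mask)]
    for _ in range(steps):
        w = [(wi - lr * 2.0 * (wi - ti)) * mi
             for wi, ti, mi in zip(w, target, mask)]
    return w

random.seed(0)
# Assumed toy problem: only two of the ten weights really matter.
target = [0.0, 2.0, -0.01, -3.0, 0.05, 1.0, 0.0, 0.5, -0.02, 0.1]
init = [random.gauss(0.0, 0.1) for _ in target]

# 1. Train the dense network from its random initialization.
trained = toy_train(init, [1] * len(init), target)
# 2. Prune 80% of the weights by trained magnitude.
mask = magnitude_mask(trained, prune_fraction=0.8)
# 3. Reset the survivors to their ORIGINAL initial values (the "ticket").
ticket = [w * m for w, m in zip(init, mask)]
# 4. Retrain the sparse subnetwork in isolation.
retrained = toy_train(ticket, mask, target)

print("mask:     ", mask)
print("retrained:", retrained)
```

The step that matters here is step 3: the surviving connections keep the initial values they were given before training, rather than being re-sampled. That matches the abstract's point that the winning tickets owe their trainability to their initial weights.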
