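1. About this article
I had not noticed how large one of my Jupyter Notebook files had grown, and git push was rejected because the file exceeded 100 MB. I worked around this with Git Large File Storage (Git LFS), and this article walks through the steps.
The overall flow is as follows.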
- Go back to the commit from before the file over 100 MB was added
  - This time I handled this step by running git clone again
- Configure Git LFS
- Commit and push the file over 100 MB
1-1. Environment
- WSL2
2. Background
The error I ran into on git push is shown below.
Three in-progress revisions of notebooks/016_Sample-Compare-Features-VGG16-PyTorch.ipynb had been committed, and each of them exceeded the limit.
$ git push
Enumerating objects: 39, done.
Counting objects: 100% (39/39), done.
Delta compression using up to 16 threads
Compressing objects: 100% (28/28), done.
Writing objects: 100% (31/31), 237.08 MiB | 9.45 MiB/s, done.
Total 31 (delta 16), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (16/16), completed with 5 local objects.
remote: error: Trace: 918c861b20b3921941b018e3179b321c748a6d281d65dc2b0c7d558f685784a6
remote: error: See https://gh.io/lfs for more information.
remote: error: File notebooks/016_Sample-Compare-Features-VGG16-PyTorch.ipynb is 156.98 MB; this exceeds GitHub's file size limit of 100.00 MB
remote: error: File notebooks/016_Sample-Compare-Features-VGG16-PyTorch.ipynb is 154.33 MB; this exceeds GitHub's file size limit of 100.00 MB
remote: error: File notebooks/016_Sample-Compare-Features-VGG16-PyTorch.ipynb is 156.55 MB; this exceeds GitHub's file size limit of 100.00 MB
remote: error: GH001: Large files detected. You may want to try Git Large File Storage - https://git-lfs.github.com.
To https://github.com/ryoma-jp/machine_learning.git
! [remote rejected] main -> main (pre-receive hook declined)
error: failed to push some refs to 'https://github.com/ryoma-jp/machine_learning.git'
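The rejection above arrives only after the whole pack has been uploaded, so it can be worth checking file sizes locally first. The following is a minimal sketch, not part of the original procedure: the function name find_large_files and its arguments are hypothetical, and the default threshold of 104857600 bytes (100 MiB) is an assumption meant to approximate GitHub's limit. It inspects only the working tree, not earlier commits.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list files under a directory whose size exceeds a limit.
# Usage: find_large_files [dir] [limit_bytes]
find_large_files() {
    local dir="${1:-.}"
    local limit="${2:-104857600}"   # 100 MiB in bytes (assumed threshold)
    # -size +Nc matches files strictly larger than N bytes.
    # The .git directory is pruned because it only holds Git's own objects.
    find "$dir" -name .git -prune -o -type f -size +"${limit}c" -print
}
```

Running find_large_files from the repository root before git push prints nothing when no file in the working tree is over the threshold.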
3. Working around the limit with Git Large File Storage
3-1. Installing Git Large File Storage
Install Git Large File Storage by following "Installing on Linux using packagecloud".
$ curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
Detected operating system as Ubuntu/jammy.
Checking for curl...
Detected curl...
Checking for gpg...
Detected gpg...
Detected apt version as 2.4.12
Running apt-get update... done.
Installing apt-transport-https... done.
Installing /etc/apt/sources.list.d/github_git-lfs.list...done.
Importing packagecloud gpg key... Packagecloud gpg key imported to /etc/apt/keyrings/github_git-lfs-archive-keyring.gpg
done.
Running apt-get update... done.
The repository is setup! You can now install packages.
$ sudo apt install git-lfs
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following packages were automatically installed and are no longer required:
linux-tools-5.15.0-60 linux-tools-5.15.0-60-generic
Use 'sudo apt autoremove' to remove them.
The following NEW packages will be installed:
git-lfs
0 upgraded, 1 newly installed, 0 to remove and 15 not upgraded.
Need to get 7420 kB of archives.
After this operation, 16.5 MB of additional disk space will be used.
Get:1 https://packagecloud.io/github/git-lfs/ubuntu jammy/main amd64 git-lfs amd64 3.5.1 [7420 kB]
Fetched 7420 kB in 1s (7406 kB/s)
Selecting previously unselected package git-lfs.
(Reading database ... 58152 files and directories currently installed.)
Preparing to unpack .../git-lfs_3.5.1_amd64.deb ...
Unpacking git-lfs (3.5.1) ...
Setting up git-lfs (3.5.1) ...
Git LFS initialized.
Processing triggers for man-db (2.10.2-1) ...
$ git lfs --version
git-lfs/3.5.1 (GitHub; linux amd64; go 1.21.8)
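3-2. Configuring Git Large File Storage
Clone the repository again, register *.ipynb as a Git LFS tracking pattern, and commit the generated .gitattributes.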
$ git clone https://github.com/ryoma-jp/machine_learning.git
$ cd machine_learning
$ git lfs track "*.ipynb"
Tracking "*.ipynb"
$ cat .gitattributes
*.ipynb filter=lfs diff=lfs merge=lfs -text
$ git add .gitattributes
$ git commit -m "add gitattributes"
[main 6373371] add gitattributes
1 file changed, 1 insertion(+)
create mode 100644 .gitattributes
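The .gitattributes line shown above is what routes matching files through Git LFS. As a quick sanity check before committing notebooks, a sketch like the following lists the patterns that carry filter=lfs. The function name lfs_tracked_patterns is hypothetical, and the sketch assumes simple patterns that contain no spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the path patterns in a .gitattributes file
# that are routed through the Git LFS filter.
# Usage: lfs_tracked_patterns [path/to/.gitattributes]
lfs_tracked_patterns() {
    local file="${1:-.gitattributes}"
    # Skip comment lines, then print the first field (the pattern)
    # of every line that has filter=lfs among its attributes.
    awk '!/^#/ {
        for (i = 2; i <= NF; i++) {
            if ($i == "filter=lfs") { print $1; break }
        }
    }' "$file"
}
```

For the .gitattributes above, this should print *.ipynb. Running git lfs track with no arguments reports the tracked patterns as well.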
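3-3. Committing and pushing the large file
First commit the existing notebooks so that they are stored through Git LFS, then commit the large notebook separately.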
$ cd notebooks/
$ git add *.ipynb
$ git restore --staged 016_Sample-Compare-Features-VGG16-PyTorch.ipynb
$ git commit -m "add ipynb for git lfs"
$ cd ..
$ git add utils/utils.py notebooks/016_Sample-Compare-Features-VGG16-PyTorch.ipynb
$ git commit -m "add sample code to compare features each preprocessing for input image"
(Apply the changes for all three revisions in the same way.)
$ git push
Uploading LFS objects: 100% (25/25), 536 MB | 29 MB/s, done.
Enumerating objects: 77, done.
Counting objects: 100% (77/77), done.
Delta compression using up to 16 threads
Compressing objects: 100% (55/55), done.
Writing objects: 100% (59/59), 5.91 MiB | 4.10 MiB/s, done.
Total 59 (delta 18), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (18/18), completed with 5 local objects.
To https://github.com/ryoma-jp/machine_learning.git
3f31c39..87abb05 main -> main
3-4. Checking the Git LFS status
$ git lfs ls-files -s
8900fc87c1 * notebooks/001_ImageClassification-CIFAR10-SimpleCNN-PyTorch.ipynb (32 KB)
04834a705a * notebooks/002_ImageClassification-Food101-SimpleCNN-PyTorch.ipynb (75 KB)
266a054efe * notebooks/003_ImageClassification-Food101-SimpleCNN-GradCAM_Heatmap-PyTorch.ipynb (88 KB)
91c038b405 * notebooks/004_ImageClassification-Food101-SimpleCNN-EigenCAM_Heatmap-PyTorch.ipynb (93 KB)
a3ea5a68ab * notebooks/005_ImageClassification-Food101-SimpleCNN-ParameterSpaceSaliency_Heatmap-PyTorch.ipynb (313 KB)
3e11f60d5e * notebooks/006_Algorithm_Welfords-Method-for-Computing-Variance.ipynb (8.5 KB)
96bba755e0 * notebooks/007_DomainAdaptation-OfficeHome-VGG16-PyTorch.ipynb (30 KB)
b683cf3833 * notebooks/008_Sample_ModelSeparate.ipynb (58 KB)
149caf1aa6 * notebooks/009_ImageClassification-Food101-VGG16-PyTorch.ipynb (73 KB)
4aaf91047c * notebooks/010_Sample-Compare-Weights-of-VGG16-PyTorch-and-Keras.ipynb (19 KB)
b8310b6d9c * notebooks/011_ImageClassification-COCO2014-VGG16-PyTorch.ipynb (2.4 MB)
a70440fb27 * notebooks/012_Sample-PyTorch-Learning-Rate-Scheduler.ipynb (332 KB)
b8d764646c * notebooks/013_Sample-Evaluation-of-ObjectDetection-SSD-PyTorch.ipynb (1.8 MB)
65e6696bb9 * notebooks/014-2_ImageClassification-CIFAR100-SimpleCNN-SGD-PyTorch.ipynb (4.4 MB)
e930ef0708 * notebooks/014-3_ImageClassification-CIFAR100-SimpleCNN-Momentum-PyTorch.ipynb (4.5 MB)
7ffa0ddedc * notebooks/014-4_ImageClassification-CIFAR100-SimpleCNN-Adagrad-PyTorch.ipynb (4.6 MB)
5dcfdefa20 * notebooks/014-5_ImageClassification-CIFAR100-SimpleCNN-RMSProp-PyTorch.ipynb (4.6 MB)
306fb4cd72 * notebooks/014-6_ImageClassification-CIFAR100-SimpleCNN-Adadelta-PyTorch.ipynb (4.5 MB)
3921aff1a8 * notebooks/014-7_ImageClassification-CIFAR100-SimpleCNN-Adam-PyTorch.ipynb (4.7 MB)
545d7ee7f2 * notebooks/014-8_ImageClassification-CIFAR100-SimpleCNN-AdamW-PyTorch.ipynb (4.8 MB)
b3e2076ffd * notebooks/014_Sample-Comparing-Optimizers.ipynb (58 KB)
6e091fe987 * notebooks/015_Sample-Feature-Extraction-VGG16-PyTorch.ipynb (8.2 MB)
ed954fa9d4 * notebooks/016_Sample-Compare-Features-VGG16-PyTorch.ipynb (165 MB)
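Files tracked by Git LFS are stored in the repository itself as small text pointers, while the real content lives in LFS storage. As an additional check beyond git lfs ls-files, the sketch below tests whether a blob read from standard input is such a pointer. The function name is_lfs_pointer is hypothetical. It relies only on the first line of the pointer format, version https://git-lfs.github.com/spec/v1.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if stdin begins with the Git LFS pointer header.
# Usage: git cat-file -p HEAD:path/to/file | is_lfs_pointer
is_lfs_pointer() {
    local first_line=""
    # Read only the first line; a pointer file starts with the version line.
    IFS= read -r first_line || true
    [ "$first_line" = "version https://git-lfs.github.com/spec/v1" ]
}
```

For example, git cat-file -p HEAD:notebooks/016_Sample-Compare-Features-VGG16-PyTorch.ipynb | is_lfs_pointer && echo "stored as an LFS pointer" should print the message if the notebook was committed through Git LFS, since git cat-file shows the raw blob rather than the checked-out file.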
4. Closing
I do not normally manage large files with Git, but having learned about Git LFS, I decided to push this one, partly to try the service out.