
Trying to store MySQL dumps in Git

Last updated at 2016-12-31, posted at 2014-01-13

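That was the idea, but it turns out this is not recommended.

Keeping each repository within 1 GB and each file within 100 MB is recommended
Git is not suited to managing DB dump files, binary files, compressed files, and the like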
https://help.github.com/articles/what-is-my-disk-quota

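Introduction

I take a dump every day, and even compressed the data is about to reach 1 TB.
Guessing that consecutive dumps do not differ all that much, I started managing them in Git:
each new dump overwrites the previous one and is then committed.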

Preparing the Git repository

cd ~
mkdir dumpdir.git
cd dumpdir.git
git init
cd ~
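
The daily routine described in the introduction (overwrite the dump, then commit) could be sketched roughly as follows. The function name commit_dump, the file name dump.sql inside the repository, and the "no changes" message are assumptions made for illustration, not part of the original setup.

```shell
# Hypothetical helper: overwrite the dump in the repository and commit it.
# Usage: commit_dump <repo_dir> <dump_file> <label>
commit_dump() (
  repo=$1 dump=$2 label=$3
  cp "$dump" "$repo/dump.sql" || exit 1   # overwrite the previous dump
  cd "$repo" || exit 1
  git add -A || exit 1
  # Skip the commit when the new dump is identical to the previous one
  if git diff --cached --quiet; then
    echo "no changes: ${label}"
  else
    git commit -q -m "backup: ${label}"
  fi
)
```

The body runs in a subshell, so the cd does not leak into the calling shell. A call might look like commit_dump ~/dumpdir.git /tmp/dump.sql 2014-01-13.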

About diffs

Running mysqldump with the skip-extended-insert option makes the diffs much easier to read.
Be aware, however, that restoring a dump made with skip-extended-insert takes about 9 times as long.
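
As a minimal sketch, such a dump could be taken as shown below. The wrapper name dump_for_diff, the database name, and the output path are placeholders, and connection credentials are assumed to come from an option file such as ~/.my.cnf.

```shell
# Hypothetical wrapper around mysqldump.
# --skip-extended-insert writes one INSERT statement per row, so a changed
# row shows up as a single changed line in git diff.
# Usage: dump_for_diff <database> <output_file>
dump_for_diff() {
  mysqldump --skip-extended-insert "$1" > "$2"
}
# Example (placeholder names):
# dump_for_diff mydb ~/dumpdir.git/dump.sql
```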

Alternatively, breaking the lines makes diffing easy

sed -i.bak -e 's@),(@)\n,(@g' dump.sql

With the line breaks, diffing is easy and the restore is even very slightly faster (by a few ms).
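
To see what the substitution does, here is a small demonstration on a made-up INSERT statement. It assumes GNU sed, which interprets \n in the replacement as a newline; the table name and values are invented for the example.

```shell
# Sample extended INSERT (made-up table and values)
line="INSERT INTO t VALUES (1,'a'),(2,'b'),(3,'c');"

# Same substitution as the sed command above, applied to stdin instead of a file
split_rows() {
  sed -e 's@),(@)\n,(@g'
}

printf '%s\n' "$line" | split_rows
# Output with GNU sed:
# INSERT INTO t VALUES (1,'a')
# ,(2,'b')
# ,(3,'c');
```

Note that the pattern is purely textual: a string value that happens to contain ),( would be split as well, so this is only safe for data where that sequence cannot occur inside a value.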

Converting from tar.gz to Git

set -e

cd ~/archives/2013/

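# The glob is expanded once, in name order, before the loop starts
for archive in *.tar.gz
do
  dump=${archive%.tar.gz}   # archive name without the extension, used as the commit label
  echo "$dump"
  # Extract over the previous dump so that Git records only the difference
  tar xzf ~/archives/2013/"$archive" -C ~/dumpdir.git/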
  cd ~/dumpdir.git
  git add -A
  git commit -m "backup: ${dump}"
  cd ~
done

Once everything has been committed, compress the repository with git gc --prune=now.
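
To check how much the compression gains, the size of .git can be printed before and after the run. This is only a sketch; the helper name gc_report is made up for the example.

```shell
# Hypothetical helper: print the size of .git before and after git gc.
# Usage: gc_report <repo_dir>
gc_report() (
  cd "$1" || exit 1
  echo "before: $(du -sh .git | cut -f1)"
  git gc --quiet --prune=now || exit 1
  echo "after:  $(du -sh .git | cut -f1)"
)
# Example:
# gc_report ~/dumpdir.git
```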

Example data sizes

Format	Size
tar.gz	3.6G
Before git gc	4.5G
After git gc	516M

When git gc fails because it runs out of memory

Error message

$ git gc --auto
Auto packing the repository for optimum performance. You may also
run "git gc" manually. See "git help gc" for more information.
Counting objects: 10619, done.
Delta compression using up to 32 threads.
warning: suboptimal pack - out of memory 
fatal: Out of memory, malloc failed619)   
error: failed to run repack

Limit the number of threads

[pack]
	threads = 1
	windowMemory = 100m
	packSizeLimit = 100m

Memory amounting to windowMemory * 10 (the --window count) has to be available.
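
Instead of editing the config file by hand, the same kind of settings can be written with git config. The sketch below applies them to a single repository; the helper name limit_pack_memory is made up, and pack.packSizeLimit is assumed to be the key meant by the size-limit line in the [pack] section.

```shell
# Hypothetical helper: write memory-related pack settings into one
# repository's .git/config.
# Usage: limit_pack_memory <repo_dir>
limit_pack_memory() (
  cd "$1" || exit 1
  git config pack.threads 1 &&
  git config pack.windowMemory 100m &&
  git config pack.packSizeLimit 100m   # assumed key name, see the note above
)
# Example:
# limit_pack_memory ~/dumpdir.git
```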

If you forget the pack options above and run git gc --auto on a .git folder
that is too large to fit in memory, the server hangs.

This ends in a server reboot, so stick to plain git gc.

- git gc --aggressive
- git gc --prune=now
+ git gc

It is a good idea to create a separate Git repository for each year.
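
One way to arrange this, sketched under the assumption that the repositories are named dumpdir-YYYY.git (a naming scheme invented for this example), is to look up the repository by year and initialize it on first use.

```shell
# Hypothetical helper: print the path of the repository for a given year,
# creating and initializing it the first time it is requested.
# Usage: repo_for_year <base_dir> <year>
repo_for_year() {
  repo="$1/dumpdir-$2.git"
  [ -d "$repo/.git" ] || git init -q "$repo" || return 1
  printf '%s\n' "$repo"
}
# Example:
# repo=$(repo_for_year ~ "$(date +%Y)")
```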

If you want to start over

Save each rev to ../name.tar.gz

# Usage: a <rev> <name>
a() {
  rev=$1
  name=$2
  git archive -o "../$name.tar.gz" "$rev"
}

a 312f68 2014-01-01
a 22cc8f 2014-01-02
a c4bcdc 2014-01-03
...
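
Rather than listing every revision by hand, the revision and name pairs can be read from git log. The sketch below assumes that every commit subject has the form "backup: <name>", as in the conversion script, and that <name> is usable as a file name; the helper name export_all and the output directory are hypothetical.

```shell
# Hypothetical helper: archive every commit on the current branch to
# <out_dir>/<name>.tar.gz, where <name> is the commit subject without
# its "backup: " prefix. <out_dir> must be an absolute path.
# Usage: export_all <repo_dir> <out_dir>
export_all() (
  cd "$1" || exit 1
  git log --reverse --format='%h %s' | while read -r rev subject; do
    name=${subject#backup: }
    git archive -o "$2/$name.tar.gz" "$rev" || exit 1
  done
)
# Example (placeholder output directory):
# export_all ~/dumpdir.git ~/archives/redo
```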
