0. Introduction
As our EC2 fleet grew, so did the number and total size of our EBS snapshots. I wanted to automate snapshot management before human error crept in, and used this article (http://qiita.com/MasaoDX/items/39624823fff337d08e6f) as a reference.
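Using the AWS CLI, the script does the following:
1. Finds the EBS volumes to back up
2. Creates snapshots
3. Attaches the retention period, VolumeID, device name, and other information to each snapshot
4. Finds the snapshots that are safe to delete and deletes them
5. Reports the volumes that were snapshotted through AWS SNS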
0. System Requirements
- An environment where bash runs
- AWSCLI https://aws.amazon.com/jp/cli/
- jq https://stedolan.github.io/jq/
- A suitable IAM Role
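A quick pre-flight check can catch a missing tool before the backup starts. The sketch below is not part of the original script; the function name `check_requirements` is made up for illustration.

```shell
# Report every command that is not found on PATH (on stderr).
# Returns 0 when all commands exist, 1 otherwise.
check_requirements() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage: abort early when a required tool is absent.
# check_requirements aws jq || exit 1
```

Putting the call at the top of the backup script keeps a half-configured host from sending a misleading report later on.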
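1. Preparing the IAM Role
Set up an IAM Role that is used only for creating and deleting snapshots. I'm told that keeping permissions to the minimum is good practice. Note that the policy below covers EC2 actions only; the SNS calls made by the script in section 4 (sns list-topics and sns publish) also require sns:ListTopics and sns:Publish.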
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "CreateDeleteSnapshot",
"Effect": "Allow",
"Action": [
"ec2:Describe*",
"ec2:CreateSnapshot",
"ec2:DeleteSnapshot",
"ec2:CreateImage",
"ec2:CreateTags",
"ec2:DescribeImages",
"ec2:DeregisterImage"
],
"Resource": [
"*"
]
}
]
}
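If you prefer the CLI to the console, the role and its instance profile could be created roughly as follows. This is a sketch under assumptions: the names `ebs-snapshot-role` and `ebs-snapshot-profile` and both function names are placeholders, and the policy JSON above is assumed to be saved to a file that you pass as the second argument.

```shell
# Hypothetical names; adjust to your environment.
ROLE_NAME='ebs-snapshot-role'
PROFILE_NAME='ebs-snapshot-profile'

# Write the trust policy that lets EC2 assume the role.
# $1: path of the file to create
write_trust_policy() {
  cat > "$1" << 'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Create the role, add the policy as an inline policy,
# and wrap the role in an instance profile.
# $1: trust policy file, $2: file holding the policy JSON shown above
create_snapshot_role() {
  aws iam create-role --role-name "${ROLE_NAME}" \
    --assume-role-policy-document "file://$1" &&
  aws iam put-role-policy --role-name "${ROLE_NAME}" \
    --policy-name "${ROLE_NAME}-policy" --policy-document "file://$2" &&
  aws iam create-instance-profile --instance-profile-name "${PROFILE_NAME}" &&
  aws iam add-role-to-instance-profile \
    --instance-profile-name "${PROFILE_NAME}" --role-name "${ROLE_NAME}"
}

# Usage:
# write_trust_policy trust.json
# create_snapshot_role trust.json policy.json
```

The commands are chained with `&&` so that a failure in one step stops the rest.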
2. Preparing an instance to run the AWS CLI
2.1 Launch Amazon Linux with the IAM Role
- Launch the instance with the IAM Role created in step 1 attached.
- An IAM Role can now also be attached to an instance that is already running, so this no longer has to be done at first launch. Good news. https://aws.amazon.com/blogs/security/new-attach-an-aws-iam-role-to-an-existing-amazon-ec2-instance-by-using-the-aws-cli/
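For an instance that is already running, attaching the role comes down to one `associate-iam-instance-profile` call. A minimal sketch, assuming an instance profile already exists; the function name, the instance ID, and the profile name are placeholders, and the CLI's default region is used.

```shell
# Attach an instance profile to an instance that is already running.
# $1: instance ID, $2: instance profile name (both placeholders here)
attach_instance_profile() {
  aws ec2 associate-iam-instance-profile \
    --instance-id "$1" \
    --iam-instance-profile "Name=$2"
}

# Example with placeholder values:
# attach_instance_profile i-0123456789abcdef0 ebs-snapshot-profile
```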
2.2 Install jq
sudo yum install jq
3. Preparing the EBS volumes
3.1 Tagging
- Tag the EBS volumes to be backed up. The script searches for this tag and backs up the matching volumes.
Key = backup_target
Value = true
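The tag can be set from the console, or from the CLI as sketched below. The two function names are made up for illustration, and the region is assumed to be the same one the backup script uses.

```shell
AWS_REGION='us-west-2'   # assumed: same region as the backup script

# Mark one or more volumes as backup targets.
# Arguments: volume IDs
tag_backup_target() {
  aws --region "${AWS_REGION}" ec2 create-tags \
    --resources "$@" \
    --tags Key=backup_target,Value=true
}

# Print the IDs of the volumes that currently carry the tag
# (tab-separated, as produced by --output text).
list_backup_targets() {
  aws --region "${AWS_REGION}" ec2 describe-volumes \
    --filters Name=tag:backup_target,Values=true \
    --query 'Volumes[].VolumeId' --output text
}

# Example with a placeholder volume ID:
# tag_backup_target vol-0123456789abcdef0
# list_backup_targets
```

Running `list_backup_targets` after tagging is a cheap way to confirm that the script will see the volumes you expect.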
4. The shell script
4.1 Variable definitions
DATE_CURRENT=`date +%Y-%m-%d`
TIME_CURRENT=`date +%Y%m%d%H%M%S`
PURGE_AFTER_DAYS=0
PURGE_AFTER=`date -d +${PURGE_AFTER_DAYS}days -u +%Y-%m-%d`
BACKUP_KEY='backup_target'
AWS_REGION='us-west-2'
SNS_REGION='us-west-2'
SNS_TOPIC_NAME='MySNSTopic'
SNS_TOPIC_ARN=`aws --region ${SNS_REGION} sns list-topics | jq -r .Topics[].TopicArn | grep ${SNS_TOPIC_NAME}`
ERROR_VOL='./error_vol.txt'
SUCCESS_VOL='./success_vol.txt'
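One thing to watch in the definitions above: `grep ${SNS_TOPIC_NAME}` matches every ARN that merely contains the name, so a second topic such as `MySNSTopic2` would end up in `SNS_TOPIC_ARN` too. A possible exact-match filter is sketched below; `pick_topic_arn` is a hypothetical helper, not part of the original script.

```shell
# Read topic ARNs on stdin (one per line) and print the first one whose
# final colon-separated field is exactly the given topic name.
# $1: topic name
pick_topic_arn() {
  awk -F: -v name="$1" '$NF == name { print; exit }'
}

# Usage with the variables defined above:
# SNS_TOPIC_ARN=`aws --region ${SNS_REGION} sns list-topics | jq -r '.Topics[].TopicArn' | pick_topic_arn "${SNS_TOPIC_NAME}"`
```

The topic name is always the last field of an SNS topic ARN, which is why comparing `$NF` is enough here.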
4.2 Creating snapshots
# 1-1. listing-up backup target volume-id
VOLUMES=`aws --region ${AWS_REGION} ec2 describe-volumes --filters Name=tag:${BACKUP_KEY},Values=true | jq -r '.Volumes[].Attachments[] | .VolumeId'`
for VOLUME in ${VOLUMES}; do
BACKUP=`aws --region ${AWS_REGION} ec2 describe-tags --filters "Name=resource-type,Values=volume" "Name=resource-id,Values=${VOLUME}" "Name=key,Values=${BACKUP_KEY}" | jq -r .Tags[].Value`
INSTANCE_ID=`aws --region ${AWS_REGION} ec2 describe-volumes --filters Name=volume-id,Values=${VOLUME} | jq -r '.Volumes[].Attachments[] | .InstanceId'`
Device_Name=`aws --region ${AWS_REGION} ec2 describe-volumes --filters Name=volume-id,Values=${VOLUME} | jq -r '.Volumes[].Attachments[] | .Device'`
INSTANCE_NAME=`aws --region ${AWS_REGION} ec2 describe-tags --filters Name=resource-id,Values=${INSTANCE_ID} Name=key,Values=Name | jq -r .Tags[].Value`
# 1-2.verify backup target volume with tag
if [ "${BACKUP}" == "true" ]; then
# 1-3.create snapshot
SNAPSHOT_ID=`aws --region ${AWS_REGION} ec2 create-snapshot --volume-id ${VOLUME} --description "${INSTANCE_NAME} ${INSTANCE_ID} ${Device_Name} ${VOLUME} ${TIME_CURRENT}" | jq -r '.SnapshotId'`
if [ -z "${SNAPSHOT_ID}" ]; then
echo ${VOLUME} >> "${ERROR_VOL}"
else
echo ${VOLUME} >> "${SUCCESS_VOL}"
# 1-4.adding tag for searching (only when the snapshot was created)
aws --region ${AWS_REGION} ec2 create-tags --resources ${SNAPSHOT_ID} --tags Key=PurgeAllow,Value=true Key=PurgeAfter,Value=${PURGE_AFTER} Key=PurgeTarget,Value=${BACKUP_KEY} Key=Name,Value="${INSTANCE_NAME} ${INSTANCE_ID} ${Device_Name} ${VOLUME} ${TIME_CURRENT}"
fi
fi
done
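As an alternative to a separate tagging call, `create-snapshot` accepts `--tag-specifications`, so the purge tags can be set when the snapshot is created. The sketch below is a variation, not the original author's method; `create_tagged_snapshot` is a made-up name, and the Name tag is left out because its value contains spaces and would need extra quoting in the shorthand syntax.

```shell
AWS_REGION='us-west-2'
BACKUP_KEY='backup_target'

# Create a snapshot and attach the purge tags in the same API call.
# $1: volume ID, $2: description, $3: PurgeAfter date (YYYY-MM-DD)
# Prints the new snapshot ID; returns non-zero if the call fails.
create_tagged_snapshot() {
  aws --region "${AWS_REGION}" ec2 create-snapshot \
    --volume-id "$1" \
    --description "$2" \
    --tag-specifications "ResourceType=snapshot,Tags=[{Key=PurgeAllow,Value=true},{Key=PurgeAfter,Value=$3},{Key=PurgeTarget,Value=${BACKUP_KEY}}]" \
    --query 'SnapshotId' --output text
}

# Usage inside the loop above:
# if SNAPSHOT_ID=`create_tagged_snapshot "${VOLUME}" "${INSTANCE_NAME} ${VOLUME}" "${PURGE_AFTER}"`; then
#   echo ${VOLUME} >> "${SUCCESS_VOL}"
# else
#   echo ${VOLUME} >> "${ERROR_VOL}"
# fi
```

Tagging at creation time means a snapshot can never exist without its purge tags, and checking the exit status replaces the empty-string test on the snapshot ID.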
4.3 Deleting snapshots
# 2-1.find delete target snapshot volume with tag
SNAPSHOT_PURGE_ALLOWED=`aws --region ${AWS_REGION} ec2 describe-tags --filters "Name=resource-type,Values=snapshot" "Name=key,Values=PurgeAllow" | jq -r .Tags[].ResourceId`
# 2-2. verify the delete target volumes with 2 tags
for SNAPSHOT_ID in ${SNAPSHOT_PURGE_ALLOWED}; do
PURGE_AFTER_DATE=`aws --region ${AWS_REGION} ec2 describe-tags --filters "Name=resource-type,Values=snapshot" "Name=resource-id,Values=${SNAPSHOT_ID}" "Name=key,Values=PurgeAfter" | jq -r .Tags[].Value`
PURGE_TARGET_CHECK=`aws --region ${AWS_REGION} ec2 describe-tags --filters "Name=resource-type,Values=snapshot" "Name=resource-id,Values=${SNAPSHOT_ID}" "Name=key,Values=PurgeTarget" | jq -r .Tags[].Value`
if [ "${PURGE_TARGET_CHECK}" == "${BACKUP_KEY}" ]; then
if [ -n "${PURGE_AFTER_DATE}" ]; then
DATE_CURRENT_EPOCH=`date -d ${DATE_CURRENT} +%s`
PURGE_AFTER_DATE_EPOCH=`date -d ${PURGE_AFTER_DATE} +%s`
if [ "${PURGE_AFTER_DATE_EPOCH}" -lt "${DATE_CURRENT_EPOCH}" ]; then
# 2-2.judge the target and delete the snapshots
aws --region ${AWS_REGION} ec2 delete-snapshot --snapshot-id ${SNAPSHOT_ID}
fi
fi
fi
done
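The expiry check can be pulled out into a small function, which makes it easy to try with fixed dates before letting the script delete anything. This is a sketch: `is_expired` is a hypothetical helper, and like the script it assumes GNU `date`. The epoch values are compared numerically with `-lt`; inside `[[ ]]`, the `<` operator would compare them as strings.

```shell
# Return 0 (true) when the PurgeAfter date lies strictly before today.
# $1: PurgeAfter date, $2: current date, both as YYYY-MM-DD
# Returns 2 when either date cannot be parsed.
is_expired() {
  purge_epoch=`date -u -d "$1" +%s` || return 2
  today_epoch=`date -u -d "$2" +%s` || return 2
  [ "${purge_epoch}" -lt "${today_epoch}" ]
}

# Usage inside the loop above:
# if is_expired "${PURGE_AFTER_DATE}" "${DATE_CURRENT}"; then
#   aws --region ${AWS_REGION} ec2 delete-snapshot --snapshot-id ${SNAPSHOT_ID}
# fi
```

Because the comparison is strict, a snapshot whose PurgeAfter date is today is kept and only becomes eligible for deletion from the next day.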
4.4 Reporting the result via SNS
#3-1. send notification mail
if [ -e "${ERROR_VOL}" ]; then
# append message footer
cat << EOF >> ${ERROR_VOL}
Snapshots could not be taken for the EBS volumes listed above. Please contact the AWS admin.
EOF
date >> ${ERROR_VOL}
aws --region ${SNS_REGION} sns publish --topic-arn ${SNS_TOPIC_ARN} --message file://${ERROR_VOL} --subject "[ERROR] EBS Backup"
# also remove the success list so it does not leak into the next run
rm -f ${ERROR_VOL} ${SUCCESS_VOL}
elif [ -e "${SUCCESS_VOL}" ]; then
# append message footer
cat << EOF >> ${SUCCESS_VOL}
EBS snapshots were successfully taken for the volumes listed above. Have a nice day!
EOF
date >> ${SUCCESS_VOL}
aws --region ${SNS_REGION} sns publish --topic-arn ${SNS_TOPIC_ARN} --message file://${SUCCESS_VOL} --subject "[SUCCESS] EBS Backup"
rm -f ${SUCCESS_VOL}
else
aws --region ${SNS_REGION} sns publish --topic-arn ${SNS_TOPIC_ARN} --message "There are no target EBS Volumes" --subject "[WARN] EBS Backup"
rm -f ${SUCCESS_VOL}
fi
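The error and success branches differ only in subject and file, so the publishing step could be factored into one function. A minimal sketch, assuming the topic ARN has already been resolved; `publish_report` is a made-up name, and the report is assembled in a temporary file so the list of volume IDs itself is not modified.

```shell
SNS_REGION='us-west-2'

# Publish the contents of a report file to an SNS topic.
# $1: topic ARN, $2: subject, $3: path to the file listing volume IDs
publish_report() {
  # Build the message in a temporary copy: volume list plus timestamp.
  report=`mktemp` || return 1
  {
    cat "$3"
    date
  } > "${report}"
  aws --region "${SNS_REGION}" sns publish \
    --topic-arn "$1" --subject "$2" --message "file://${report}"
  status=$?
  rm -f "${report}"
  return "${status}"
}

# Usage:
# publish_report "${SNS_TOPIC_ARN}" "[SUCCESS] EBS Backup" "${SUCCESS_VOL}"
```

Returning the exit status of `sns publish` lets the caller decide whether to keep the list file for a retry.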
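5. Running it periodically with cron
Have cron run the finished shell script on a schedule. For example, the entry below runs it every Sunday at 01:00.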
00 01 * * 0 ~/backup.sh >> /var/log/cron.log