Backing up and restoring etcd

Last updated at 2021-02-19, posted at 2020-06-29

Notes from trying out the procedure below.

etcd-backup-and-restore.md

Verified on a master node that was created on kodekloud.com as part of a Udemy course.

Reference articles

How to back up etcd

2. Backup

etcd can be backed up with etcdctl.
The commands used are as follows.

# Check the etcd settings beforehand (k is presumably an alias for kubectl)
$ k get pods -n kube-system etcd-master -o yaml

spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://172.17.0.16:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --initial-advertise-peer-urls=https://172.17.0.16:2380
    - --initial-cluster=master=https://172.17.0.16:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://172.17.0.16:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://172.17.0.16:2380
    - --name=master
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: k8s.gcr.io/etcd:3.3.15-0

・・・


# Use --help to check which options need to be specified
$ ETCDCTL_API=3 etcdctl snapshot save --help
NAME:
        snapshot save - Stores an etcd node backend snapshot to a given file

USAGE:
        etcdctl snapshot save <filename>

GLOBAL OPTIONS:
      --cacert=""                               verify certificates of TLS-enabled secure servers using this CA bundle
      --cert=""                                 identify secure client using this TLS certificate file
      --command-timeout=5s                      timeout for short running command (excluding dial timeout)
      --debug[=false]                           enable client-side debug logging
      --dial-timeout=2s                         dial timeout for client connections
  -d, --discovery-srv=""                        domain name to query for SRV records describing cluster endpoints
      --endpoints=[127.0.0.1:2379]              gRPC endpoints
      --hex[=false]                             print byte strings as hex encoded strings
      --insecure-discovery[=true]               accept insecure SRV records describing cluster endpoints
      --insecure-skip-tls-verify[=false]        skip server certificate verification
      --insecure-transport[=true]               disable transport security for client connections
      --keepalive-time=2s                       keepalive time for client connections
      --keepalive-timeout=6s                    keepalive timeout for client connections
      --key=""                                  identify secure client using this TLS key file
      --user=""                                 username[:password] for authentication (prompt if password is not supplied)
  -w, --write-out="simple"                      set the output format (fields, json, protobuf, simple, table)

# Take the backup using the values obtained from the Pod
$ ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
     --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key \
     snapshot save /tmp/snapshot-pre-boot.db
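
The certificate paths passed to etcdctl above correspond to the --trusted-ca-file, --cert-file, and --key-file flags of the etcd Pod. As a rough sketch, they could be read out of a saved copy of the Pod spec instead of being copied by hand. The helper names etcd_flag and print_snapshot_cmd are hypothetical, the endpoint is hard-coded to the value used in this note, and the command is only printed, not executed.

```shell
# Sketch only: etcd_flag and print_snapshot_cmd are hypothetical helpers.

# Print the value of an etcd flag (e.g. "cert-file") from a manifest file.
# Assumes one flag per line in the form "    - --flag=value".
etcd_flag() {
  local manifest="$1" flag="$2"
  # The pattern is anchored on "- --<flag>=", so "cert-file" does not
  # match "peer-cert-file".
  sed -n "s/^[[:space:]]*-[[:space:]]*--${flag}=\(.*\)$/\1/p" "$manifest" | head -n 1
}

# Print (do not run) a snapshot save command built from the manifest.
print_snapshot_cmd() {
  local manifest="$1" out="$2"
  local cacert cert key
  cacert="$(etcd_flag "$manifest" trusted-ca-file)"
  cert="$(etcd_flag "$manifest" cert-file)"
  key="$(etcd_flag "$manifest" key-file)"
  printf 'ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=%s --cert=%s --key=%s snapshot save %s\n' \
    "$cacert" "$cert" "$key" "$out"
}
```

Given a file containing the Pod spec shown earlier, print_snapshot_cmd with /tmp/snapshot-pre-boot.db as the second argument should print a command equivalent to the one above. Printing rather than executing leaves room to review the paths first.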

After taking the snapshot, it is probably a good idea to verify it with the etcdctl snapshot status command.

$ ETCDCTL_API=3 etcdctl --write-out=table snapshot status /tmp/snapshot-pre-boot.db
+----------+----------+------------+------------+
|   HASH   | REVISION | TOTAL KEYS | TOTAL SIZE |
+----------+----------+------------+------------+
| 5b70c885 |    24239 |       1343 |     3.4 MB |
+----------+----------+------------+------------+
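
If this check is to be scripted, the table can be parsed to confirm that the snapshot actually contains keys. The following is a minimal sketch that assumes the table layout shown above (HASH, REVISION, TOTAL KEYS, TOTAL SIZE); status_field and snapshot_has_keys are hypothetical helper names, and both read the table from standard input.

```shell
# Sketch only: status_field and snapshot_has_keys are hypothetical helpers.

# Print column N (1-based) of the data row of a snapshot status table.
status_field() {
  local col="$1"
  # Data rows start with "|"; the header row is skipped because its
  # first cell contains HASH. Border rows start with "+".
  awk -F'|' -v c="$col" '
    substr($0, 1, 1) == "|" && $2 !~ /HASH/ {
      v = $(c + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      print v
      exit
    }'
}

# Succeed only if the TOTAL KEYS column is a number greater than zero.
snapshot_has_keys() {
  local keys
  keys="$(status_field 3)"
  case "$keys" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$keys" -gt 0 ]
}
```

The output of the status command above could be piped into snapshot_has_keys, which would succeed for the table shown here because TOTAL KEYS is 1343.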

Restoring etcd

Restoring broadly requires two steps.

Running the etcdctl snapshot restore command

Restore the snapshot taken earlier into a directory of your choice with the etcdctl snapshot restore command.

3. Restore ETCD Snapshot to a new folder

# Use --help to check which options can be specified
$ ETCDCTL_API=3 etcdctl snapshot restore --help
NAME:
        snapshot restore - Restores an etcd member snapshot to an etcd directory

USAGE:
        etcdctl snapshot restore <filename> [options]

OPTIONS:
      --data-dir=""                                             Path to the data directory
      --initial-advertise-peer-urls="http://localhost:2380"     List of this member's peer URLs to advertise to the rest of the cluster
      --initial-cluster="default=http://localhost:2380"         Initial cluster configuration for restore bootstrap
      --initial-cluster-token="etcd-cluster"                    Initial cluster token for the etcd cluster during restore bootstrap
      --name="default"                                          Human-readable name for this member
      --skip-hash-check[=false]                                 Ignore snapshot integrity hash value (required if copied from data directory)
      --wal-dir=""                                              Path to the WAL directory (use --data-dir if none given)

GLOBAL OPTIONS:
      --cacert=""                               verify certificates of TLS-enabled secure servers using this CA bundle
      --cert=""                                 identify secure client using this TLS certificate file
      --command-timeout=5s                      timeout for short running command (excluding dial timeout)
      --debug[=false]                           enable client-side debug logging
      --dial-timeout=2s                         dial timeout for client connections
  -d, --discovery-srv=""                        domain name to query for SRV records describing cluster endpoints
      --endpoints=[127.0.0.1:2379]              gRPC endpoints
      --hex[=false]                             print byte strings as hex encoded strings
      --insecure-discovery[=true]               accept insecure SRV records describing cluster endpoints
      --insecure-skip-tls-verify[=false]        skip server certificate verification
      --insecure-transport[=true]               disable transport security for client connections
      --keepalive-time=2s                       keepalive time for client connections
      --keepalive-timeout=6s                    keepalive timeout for client connections
      --key=""                                  identify secure client using this TLS key file
      --user=""                                 username[:password] for authentication (prompt if password is not supplied)
  -w, --write-out="simple"                      set the output format (fields, json, protobuf, simple, table)


# Run the restore, reusing the values obtained from the Pod spec
$ ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt \
     --name=master \
     --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key \
     --data-dir /var/lib/etcd-from-backup \
     --initial-cluster=master=https://127.0.0.1:2380 \
     --initial-cluster-token=etcd-cluster-1 \
     --initial-advertise-peer-urls=https://127.0.0.1:2380 \
     snapshot restore /tmp/snapshot-pre-boot.db

For --data-dir, specify a directory different from the one the running etcd is using.
--initial-cluster-token is also added to the Pod manifest later; it appears to accept an arbitrary value.
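
The --data-dir rule can be checked before running the restore. The sketch below uses a hypothetical helper, check_restore_dir, which rejects a target that equals the live data directory or that already exists. The second check is an assumption on my part: as far as I know, etcdctl refuses to restore into an existing data directory, but that was not verified in this note.

```shell
# Sketch only: check_restore_dir is a hypothetical helper.
# Usage: check_restore_dir <live-data-dir> <restore-target-dir>
# Returns 0 only if the target differs from the live dir and does not exist.
check_restore_dir() {
  # Strip one trailing slash so /var/lib/etcd/ and /var/lib/etcd compare equal.
  local live="${1%/}" target="${2%/}"
  if [ "$target" = "$live" ]; then
    echo "restore target must differ from the live data dir: $live" >&2
    return 1
  fi
  if [ -e "$target" ]; then
    echo "restore target already exists: $target" >&2
    return 1
  fi
  return 0
}
```

With the values used here, check_restore_dir /var/lib/etcd /var/lib/etcd-from-backup would succeed as long as /var/lib/etcd-from-backup has not been created yet.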

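Modifying the etcd Pod

Next, change the configuration of the etcd Pod.

In this environment the etcd Pod runs as a Static Pod, so the steps below assume that setup.
When etcd runs as a Static Pod, you first need to find the path where the Static Pod manifest files are stored.
The following document describes how to find where Static Pod manifests are placed.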

Create static Pods

Let's check how this environment is configured.

# Check the kubelet process to find the path of its config file (--config)
$ ps aux | grep kubelet | grep config
root      2667  2.5  4.5 1345816 92684 ?       Ssl  07:20   0:55 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.2 --resolv-conf=/run/systemd/resolve/resolv.conf

# Look for staticPodPath in the config
$ cat /var/lib/kubelet/config.yaml | grep static
staticPodPath: /etc/kubernetes/manifests

# Confirm that the manifest files for etcd and the other components are in staticPodPath
master $ ls /etc/kubernetes/manifests/
etcd.yaml                     kube-controller-manager.yaml
kube-apiserver.yaml           kube-scheduler.yaml
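
The lookup above can be wrapped in a small function so that later steps do not hard-code the manifest directory. This is only a sketch: static_pod_path is a hypothetical helper, and it assumes staticPodPath appears as an unquoted top-level key, as it does in this environment.

```shell
# Sketch only: static_pod_path is a hypothetical helper.
# Print the staticPodPath value from a kubelet config file (first match only).
# Assumes an unquoted, top-level "staticPodPath: <dir>" line.
static_pod_path() {
  local config="$1"
  sed -n 's/^staticPodPath:[[:space:]]*//p' "$config" | head -n 1
}
```

On this master, static_pod_path /var/lib/kubelet/config.yaml would be expected to print /etc/kubernetes/manifests, matching the grep result above.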

Now that the location of the Static Pod manifests is known, edit /etc/kubernetes/manifests/etcd.yaml.

4. Modify /etc/kubernetes/manifests/etcd.yaml

Editing the manifest causes the kubelet to recreate the Static Pod, and the data is restored.
The changes are as follows.

master $ diff etcd.yaml etcd-backup.yaml
17c17
<     - --data-dir=/var/lib/etcd-from-backup
---
>     - --data-dir=/var/lib/etcd
31d30
<     - --initial-cluster-token=etcd-cluster-1
46c45
<     - mountPath: /var/lib/etcd-from-backup
---
>     - mountPath: /var/lib/etcd
58c57
<       path: /var/lib/etcd-from-backup
---
>       path: /var/lib/etcd
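
The same edits could be produced mechanically rather than by hand. The following is a minimal sketch, not the procedure used in this note: rewrite_etcd_manifest is a hypothetical helper that writes a modified copy of the manifest to standard output, replacing the data directory where it appears as a whole value and adding --initial-cluster-token after the --initial-cluster line. Where the token line goes is my assumption, and the function is not idempotent: running it on an already modified manifest would add a second token line.

```shell
# Sketch only: rewrite_etcd_manifest is a hypothetical helper.
# Usage: rewrite_etcd_manifest <manifest> <old-dir> <new-dir> <token>
# Prints the modified manifest to stdout; the input file is not changed.
rewrite_etcd_manifest() {
  local manifest="$1" old_dir="$2" new_dir="$3" token="$4"
  awk -v old="$old_dir" -v new="$new_dir" -v token="$token" '
    {
      line = $0
      n = length(line); m = length(old)
      # Replace only when the line ends with the old path, so that
      # /var/lib/etcd does not match inside /var/lib/etcd-from-backup.
      if (n >= m && substr(line, n - m + 1) == old) {
        prev = substr(line, n - m, 1)
        if (prev == "=" || prev == " ") line = substr(line, 1, n - m) new
      }
      print line
      # Add the token flag right after --initial-cluster, same indentation.
      if (line ~ /^[ ]*- --initial-cluster=/) {
        indent = line
        sub(/-.*/, "", indent)
        print indent "- --initial-cluster-token=" token
      }
    }' "$manifest"
}
```

Writing the result to a temporary file and comparing it with diff, as above, before replacing /etc/kubernetes/manifests/etcd.yaml seems safer than editing the live manifest directly.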

Note that in my testing the restore worked even without --initial-cluster-token, but judging from the --help output it seems better to include it.
