
EDB Cloud Native PostgreSQL on OpenShift: High-Availability Verification

Last updated at 2023-01-17. Posted at 2022-01-04.

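1 Introduction

Managed databases (such as RDS) are the usual choice for an RDB solution on a public cloud. As an RDB solution for hybrid and multi-cloud environments, however, I tested whether the containerized edition of EDB PostgreSQL (Cloud Native PostgreSQL) can meet enterprise high-availability requirements.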

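2 Environment

The test uses managed OpenShift (ROKS) v4.8 on IBM Cloud, spread across the three AZs of the Tokyo region.
Each instance of the EDB PostgreSQL cluster dynamically provisions Block Storage, and ICOS (IBM Cloud Object Storage) is used for backups.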

3 Installation

EDB's Cloud Native PostgreSQL (CNP) is provided as an Operator (Capability Level: 5).
I install it from OperatorHub in the OpenShift web console.

3.1 Installing the Operator

Search for EDB Cloud Native PostgreSQL in OperatorHub and click it.

[Install]

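3.2 Configuring the CNP PostgreSQL cluster

Once the Operator is installed, open the [Cloud Native PostgreSQL] page under Installed Operators and choose Create instance for [Cluster].

Specify any name and the number of instances (3 here, so that one instance is placed on each of the 3 worker nodes across the 3 AZs).

Open [Storage] and choose a size and a Storage Class.

Open [Backup] and specify the bucket created in advance on the ICOS side.
Destination Path: s3://<bucket>/
Object Storage Endpoint: https://<bucket>.<ICOS endpoint URL>
Note: if the cluster is created without backup configured, an instance (Pod) that was running on a failed node cannot rejoin the PostgreSQL cluster after a node failure.

For [S3 Credentials], create a Secret from the ICOS credentials and set the Key and ID.

Open [Pod Affinity] and set the Topology Key to kubernetes.io/zone so that the instances are spread across AZs as far as possible.

Leave all other settings at their defaults for this test and click [Create].

The PostgreSQL cluster was ready in a few minutes.
Three instance Pods and three PVCs were created.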

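Several Services and Secrets were created as well.

Multiple Kubernetes Service resources are provided, such as one for read-write and one for read-only access, and they appear to use selectors to identify the target instance Pods.

Reference: https://www.enterprisedb.com/docs/kubernetes/cloud_native_postgresql/architecture/

A command-line tool, the cnp plugin for kubectl, is also provided and shows the status of the PostgreSQL cluster.
Here, the instance Pod cluster-edb-project1-k-1 is the Primary, and Pods 2 and 3 are Replicas.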

% kubectl cnp status cluster-edb-project1-k
Cluster in healthy state
Name:              cluster-edb-project1-k
Namespace:         edb-poc
PostgreSQL Image:  quay.io/enterprisedb/postgresql:14.1
Primary instance:  cluster-edb-project1-k-1
Instances:         3
Ready instances:   3
Current Timeline:  1
Current WAL file:  00000001000000000000000F

Continuous Backup status
First Point of Recoverability:  Not Available
Working WAL archiving:          OK
Last Archived WAL:              00000001000000000000000F   @   2022-01-04T10:50:52.760873Z

Instances status
Manager Version  Pod name                  Current LSN  Received LSN  Replay LSN  System ID            Primary  Replicating  Replay paused  Pending restart  Status
---------------  --------                  -----------  ------------  ----------  ---------            -------  -----------  -------------  ---------------  ------
1.11.0           cluster-edb-project1-k-1  0/10000000                             7045121885460729875  ✓        ✗            ✗              ✗                OK
1.11.0           cluster-edb-project1-k-2               0/10000000    0/10000000  7045121885460729875  ✗        ✓            ✗              ✗                OK
1.11.0           cluster-edb-project1-k-3               0/10000000    0/10000000  7045121885460729875  ✗        ✓            ✗              ✗                OK

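In addition, assuming a high-availability configuration in which no log loss is acceptable, replication within the PostgreSQL cluster is set to synchronous replication mode.

Reference: https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/replication/#synchronous-replication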
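
The settings chosen in section 3.2, plus the synchronous replication mode, can also be written as a Cluster manifest instead of being clicked through in the web console. The following is a minimal sketch rather than the configuration actually used in this test. The function name, the Secret key names (ACCESS_KEY_ID, ACCESS_SECRET_KEY) and the minSyncReplicas/maxSyncReplicas values are assumptions, since the article does not state them. The default topology key is the value entered in the console above; the well-known Kubernetes zone label is topology.kubernetes.io/zone, so check the labels on your nodes before relying on it.

```python
import json


def build_cluster_manifest(name, namespace, bucket, endpoint_url,
                           secret_name, storage_class, size="10Gi",
                           topology_key="kubernetes.io/zone"):
    """Return a CNP Cluster manifest as a plain dict.

    All values are illustrative; adjust them to your environment.
    """
    return {
        "apiVersion": "postgresql.k8s.enterprisedb.io/v1",
        "kind": "Cluster",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # One instance per AZ, as in section 3.2.
            "instances": 3,
            # Assumed values: at least 1 and at most 2 synchronous replicas.
            "minSyncReplicas": 1,
            "maxSyncReplicas": 2,
            "storage": {"size": size, "storageClass": storage_class},
            "affinity": {
                "enablePodAntiAffinity": True,
                "topologyKey": topology_key,
            },
            "backup": {
                "barmanObjectStore": {
                    "destinationPath": f"s3://{bucket}/",
                    "endpointURL": endpoint_url,
                    "s3Credentials": {
                        # Key names inside the Secret are assumptions.
                        "accessKeyId": {"name": secret_name, "key": "ACCESS_KEY_ID"},
                        "secretAccessKey": {"name": secret_name, "key": "ACCESS_SECRET_KEY"},
                    },
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = build_cluster_manifest(
        name="cluster-edb-project1-k",
        namespace="edb-poc",
        bucket="<bucket>",
        endpoint_url="https://<bucket>.<ICOS endpoint URL>",
        secret_name="<secret name>",
        storage_class="<storage class>",
    )
    print(json.dumps(manifest, indent=2))
```

Since kubectl and oc accept JSON as well as YAML, the printed output could be piped to `oc apply -f -` once the placeholders are filled in.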

4 Taking a Backup

Next, I try taking a backup.
The Operator provides one-time [Backups] and [Scheduled Backups].
Here I try a one-time backup.
From the [Cloud Native PostgreSQL] page under Installed Operators, choose Create instance for [Backups].

Enter any backup name, put the name of the PostgreSQL cluster created earlier in [Cluster Name], and click [Create].

Confirm Backup completed in Events. Since the database holds no data yet, it finished right away.

Checking the bucket on the ICOS side, it contains what looks like backup data.
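
The same one-time backup can be requested with a Backup resource rather than through the console form. The sketch below only builds the manifest; the function name and the backup name in the usage lines are hypothetical, and the namespace and cluster name are the ones shown in the status output earlier.

```python
import json


def build_backup_manifest(backup_name, cluster_name, namespace):
    """Return a one-time CNP Backup manifest that targets the given cluster."""
    return {
        "apiVersion": "postgresql.k8s.enterprisedb.io/v1",
        "kind": "Backup",
        "metadata": {"name": backup_name, "namespace": namespace},
        # The Backup refers to the Cluster by name, like [Cluster Name] in the form.
        "spec": {"cluster": {"name": cluster_name}},
    }


if __name__ == "__main__":
    # "backup-test" is a hypothetical backup name.
    manifest = build_backup_manifest("backup-test", "cluster-edb-project1-k", "edb-poc")
    print(json.dumps(manifest, indent=2))
```

The printed JSON could be applied with `oc apply -f -`; the backup's progress then appears in Events as described below.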

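5 Availability Tests

Now for the availability tests of EDB Cloud Native PostgreSQL (CNP).
A sample application deployed on the same OpenShift cluster connects through the rw Service (that is, it accesses only the instance with Role: Primary), and its DB update (PUT) API is called at 1-second intervals.

5.1 Primary Pod failure

Having confirmed that the Primary instance Pod is cluster-edb-project1-k-1, I delete this Pod.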

% oc get pod
NAME                                                             READY   STATUS    RESTARTS   AGE
cluster-edb-project1-k-1                                         1/1     Running   0          11d
cluster-edb-project1-k-2                                         1/1     Running   0          11d
cluster-edb-project1-k-3                                         1/1     Running   0          11d
postgresql-operator-controller-manager-1-11-0-5bbf8b54bc-dpjwn   1/1     Running   0          18d
% date +%Y%m%d%H%M%S; oc delete pod cluster-edb-project1-k-1
20220104222550
pod "cluster-edb-project1-k-1" deleted

In another terminal I watched the application access at 1-second intervals; it continued after a pause of 1 to 2 seconds.

% while true; do curl -X PUT -H "Content-Type: application/json" -d 'test11111' http://edb-sample-edb-poc.roks-xxx-0000.jp-tok.containers.appdomain.cloud/; date +%Y%m%d%H%M%S; sleep 1; done
...
true20220104222247
true20220104222248
true20220104222249
true20220104222250   # primary pod deleted
true20220104222252   # access restored
true20220104222253
true20220104222254
true20220104222255
...
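
The downtime can also be read off this output mechanically instead of by eye. The following is a small sketch, assuming each successful request appears as a line starting with `true` followed directly by the 14-digit timestamp, as in the output above. The function name is hypothetical and not part of any tool used in this article.

```python
import re
from datetime import datetime

# A successful request is logged as "true" immediately followed by YYYYmmddHHMMSS.
OK_LINE = re.compile(r"^true(\d{14})")


def longest_gap(lines):
    """Return (seconds, start, end) for the longest gap between successful requests.

    Lines that do not start with "true<timestamp>" (errors, comments) are ignored.
    Returns (0.0, None, None) when fewer than two successes are found.
    """
    ok_times = []
    for line in lines:
        match = OK_LINE.match(line.strip())
        if match:
            ok_times.append(datetime.strptime(match.group(1), "%Y%m%d%H%M%S"))

    best = (0.0, None, None)
    for prev, cur in zip(ok_times, ok_times[1:]):
        gap = (cur - prev).total_seconds()
        if gap > best[0]:
            best = (gap, prev, cur)
    return best


if __name__ == "__main__":
    sample = [
        "true20220104222249",
        "true20220104222250   # primary pod deleted",
        "true20220104222252   # access restored",
        "true20220104222253",
    ]
    seconds, start, end = longest_gap(sample)
    print(f"longest gap: {seconds:.0f}s between {start} and {end}")
```

Because the loop sleeps for one second between requests, a gap of 1 second is the normal interval. Only the excess over that interval is attributable to the failover.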

Confirm that the Primary has failed over to cluster-edb-project1-k-2.

% kubectl cnp status cluster-edb-project1-k
Cluster in healthy state
Name:              cluster-edb-project1-k
Namespace:         edb-poc
PostgreSQL Image:  quay.io/enterprisedb/postgresql:14.1
Primary instance:  cluster-edb-project1-k-2
Instances:         3
Ready instances:   3
Current Timeline:  2
Current WAL file:  000000020000000000000017

Continuous Backup status
First Point of Recoverability:  Not Available
Working WAL archiving:          OK
Last Archived WAL:              000000020000000000000016   @   2022-01-04T13:31:12.744637Z

Instances status
Manager Version  Pod name                  Current LSN  Received LSN  Replay LSN  System ID            Primary  Replicating  Replay paused  Pending restart  Status
---------------  --------                  -----------  ------------  ----------  ---------            -------  -----------  -------------  ---------------  ------
1.11.0           cluster-edb-project1-k-1               0/17000110    0/17000110  7045121885460729875  ✗        ✓            ✗              ✗                OK
1.11.0           cluster-edb-project1-k-2  0/17000110                             7045121885460729875  ✓        ✗            ✗              ✗                OK
1.11.0           cluster-edb-project1-k-3               0/17000110    0/17000110  7045121885460729875  ✗        ✓            ✗              ✗                OK
% oc get pod
NAME                                                             READY   STATUS    RESTARTS   AGE
cluster-edb-project1-k-1                                         1/1     Running   0          9m3s
cluster-edb-project1-k-2                                         1/1     Running   0          11d
cluster-edb-project1-k-3                                         1/1     Running   0          11d
postgresql-operator-controller-manager-1-11-0-5bbf8b54bc-dpjwn   1/1     Running   0          18d

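cluster-edb-project1-k-1 also came back in the Replica role within a few minutes.

5.2 Failure of the Node hosting the Primary Pod

Next, I test a failure of the Worker Node that hosts the Primary Pod. Check the NODE name of the current Primary, cluster-edb-project1-k-2 (10.244.0.7 here). Also confirm that the Operator Pod is on a different node, so that it does not go down at the same time.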

% oc get pod -o wide
NAME                                                             READY   STATUS    RESTARTS   AGE    IP             NODE            NOMINATED NODE   READINESS GATES
cluster-edb-project1-k-1                                         1/1     Running   0          17m    172.17.46.10   10.244.128.16   <none>           <none>
cluster-edb-project1-k-2                                         1/1     Running   0          11d    172.17.24.63   10.244.0.7      <none>           <none>
cluster-edb-project1-k-3                                         1/1     Running   0          11d    172.17.17.94   10.244.64.11    <none>           <none>
postgresql-operator-controller-manager-1-11-0-5bbf8b54bc-dpjwn   1/1     Running   0          18d    172.17.46.7    10.244.128.16   <none>           <none>

ROKS runs on IBM Cloud virtual servers; I hard-reboot this Worker.

% ibmcloud oc workers --cluster roks-xxx
OK
ID                                                   プライマリー IP   フレーバー   状態     状況    ゾーン     バージョン
kube-xxx-roksxxx-default-000007cc   10.244.0.7        mx2.4x32     normal   Ready   jp-tok-1   4.8.20_1536_openshift*
kube-xxx-roksxxx-default-0000089a   10.244.64.11      mx2.4x32     normal   Ready   jp-tok-2   4.8.20_1536_openshift*
kube-xxx-roksxxx-default-00000a30   10.244.128.16     mx2.4x32     normal   Ready   jp-tok-3   4.8.22_1538_openshift*

% date +%Y%m%d%H%M%S; ibmcloud oc worker reboot --hard --cluster roks-xxx --worker kube-xxx-roksxxx-default-000007cc
20220104224819
ワーカーをリブートしますか? [kube-xxx-roksxxx-default-000007cc] [y/N]> y
kube-xxx-roksxxx-default-000007cc を処理しています...
kube-xxx-roksxxx-default-000007cc の処理が完了しました。

Check the application access.
Errors appeared, but access recovered in just under 1 minute.

% while true; do curl -X PUT -H "Content-Type: application/json" -d 'test11111' http://edb-sample-edb-poc.roks-xxx-0000.jp-tok.containers.appdomain.cloud/; date +%Y%m%d%H%M%S; sleep 1; done
true20220104224830
20220104224831   # errors begin about 10 seconds after the reboot
<H1>Error Page Exception</H1>
<H4>SRVE0260E: The server cannot use the error page specified for your application to handle the Original Exception printed below.</H4>
<BR><H3>Original Exception: </H3>
<B>Error Message: </B>org.springframework.web.util.NestedServletException: Request processing failed&#59; nested exception is org.mybatis.spring.MyBatisSystemException: nested exception is org.apache.ibatis.exceptions.PersistenceException:
### Error updating database.  Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection&#59; nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 10000ms.
...
<BR>
20220104224914
true20220104224919   # application access restored just under 1 minute after the reboot
true20220104224920

The Primary role had failed over to cluster-edb-project1-k-1.

% oc get pod -o wide
NAME                                                             READY   STATUS        RESTARTS   AGE   IP             NODE            NOMINATED NODE   READINESS GATES
cluster-edb-project1-k-1                                         1/1     Running       0          32m   172.17.46.10   10.244.128.16   <none>           <none>
cluster-edb-project1-k-2                                         1/1     Terminating   0          11d   172.17.24.63   10.244.0.7      <none>           <none>
cluster-edb-project1-k-3                                         1/1     Running       0          11d   172.17.17.94   10.244.64.11    <none>           <none>
postgresql-operator-controller-manager-1-11-0-5bbf8b54bc-dpjwn   1/1     Running       0          18d   172.17.46.7    10.244.128.16   <none>           <none>
% kubectl cnp status cluster-edb-project1-k
Failing over   Failing over to cluster-edb-project1-k-1
Name:              cluster-edb-project1-k
Namespace:         edb-poc
PostgreSQL Image:  quay.io/enterprisedb/postgresql:14.1
Primary instance:  cluster-edb-project1-k-1
Instances:         3
Ready instances:   2
Current Timeline:  3
Current WAL file:  00000003000000000000001B

Continuous Backup status
First Point of Recoverability:  Not Available
Working WAL archiving:          OK
Last Archived WAL:              00000003000000000000001B   @   2022-01-04T13:59:14.365228Z

Instances status
Manager Version  Pod name                  Current LSN  Received LSN  Replay LSN  System ID            Primary  Replicating  Replay paused  Pending restart  Status
---------------  --------                  -----------  ------------  ----------  ---------            -------  -----------  -------------  ---------------  ------
1.11.0           cluster-edb-project1-k-1  0/1C000000                             7045121885460729875  ✓        ✗            ✗              ✗                OK
45               cluster-edb-project1-k-2  -            -             -           -                    -        -            -              -                pod not available
1.11.0           cluster-edb-project1-k-3               0/1C000000    0/1C000000  7045121885460729875  ✗        ✓            ✗              ✗                OK
%

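Note: perhaps because of the way I did the hard reboot, the Worker took a long time to come back, so I reloaded it. Also, perhaps because of the way the backup was set up, the Pod did not recover either, so I deleted it together with its PV and PVC. After a while a new instance was created.

# Delete the failed Pod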
% PODNAME=cluster-edb-project1-k-2
% VOLNAME=$(kubectl get pv -o json | \
  jq -r '.items[]|select(.spec.claimRef.name=='\"$PODNAME\"')|.metadata.name')
% kubectl delete pod/$PODNAME pvc/$PODNAME pv/$VOLNAME

# After a while, a new instance was created.
% oc get pod
NAME                                                             READY   STATUS    RESTARTS   AGE
cluster-edb-project1-k-1                                         1/1     Running   0          18h
cluster-edb-project1-k-3                                         1/1     Running   0          12d
cluster-edb-project1-k-4                                         1/1     Running   0          4m8s
postgresql-operator-controller-manager-1-11-0-5bbf8b54bc-dpjwn   1/1     Running   1          19d

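Reference: https://www.enterprisedb.com/docs/kubernetes/cloud_native_postgresql/troubleshooting/#replicas-out-of-sync-when-no-backup-is-configured

5.3 Network failure (AZ failure affecting one Replica Pod)

Next come the AZ network failure cases. The first question is whether application access to the Primary Pod can continue when the network of the AZ hosting one of the Replica Pods goes down. (Note: a certain RDB HA solution is limited to a single synchronous target, so it would be affected in this case.)

*On IBM Cloud ROKS I could not reproduce a network failure well with ACLs, so this case was tested on AWS ROSA.

Check which AZ each Replica Pod is in. cluster-edb-1-2 and cluster-edb-1-3 are the Replicas, and the latter is in the AZ whose subnet is 10.0.192.0/19.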

Instances status
Manager Version  Pod name         Current LSN  Received LSN  Replay LSN  System ID            Primary  Replicating  Replay paused  Pending restart  Status
---------------  --------         -----------  ------------  ----------  ---------            -------  -----------  -------------  ---------------  ------
1.13.0           cluster-edb-1-1  0/D000000                              7067095012841758740  ✓        ✗            ✗              ✗                OK
1.13.0           cluster-edb-1-2               0/D000000     0/D000000   7067095012841758740  ✗        ✓            ✗              ✗                OK
1.13.0           cluster-edb-1-3               0/D000000     0/D000000   7067095012841758740  ✗        ✓            ✗              ✗                OK


% oc get pod -o wide
NAME                                                             READY   STATUS    RESTARTS      AGE    IP             NODE                           NOMINATED NODE   READINESS GATES
cluster-edb-1-1                                                  1/1     Running   0             10d    10.129.2.20    ip-10-0-180-125.ec2.internal   <none>           <none>
cluster-edb-1-2                                                  1/1     Running   0             10d    10.128.2.21    ip-10-0-129-231.ec2.internal   <none>           <none>
cluster-edb-1-3                                                  1/1     Running   0             10d    10.131.0.30    ip-10-0-200-55.ec2.internal    <none>           <none>

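With application access running, block both inbound and outbound traffic on the AZ1 subnet with an ACL.

Application access continued with no delay or errors, and no failover of the Primary role occurred.
The Status of the Replica Pod in the blocked AZ became pod not available, but the Pod recovered once the ACL was restored. Even with one Replica Pod temporarily unavailable, synchronous replication to the Replica in the other AZ meant that application access to the Primary was unaffected and operations could continue.

5.4 Network failure (AZ failure affecting the Primary Pod)

Next, I tested a network failure in the AZ hosting the Primary Pod.
After inbound and outbound traffic on the AZ subnet was blocked with an ACL, CNP failed over and service resumed in about 1 minute 40 seconds.

5.5 Storage failure (PV failure on the Primary Pod)

I tested a storage failure on the Primary Pod's PV.
The PVs are provisioned from IBM Cloud Block Storage.
The storage attached to the Primary Pod is forcibly removed with an IBM Cloud command.

The cluster-edb-project1-b-1 Pod is the Primary; look up the volume name from the PVC bound to that Pod.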

Instances status
Manager Version  Pod name                  Current LSN  Received LSN  Replay LSN  System ID            Primary  Replicating  Replay paused  Pending restart  Status
---------------  --------                  -----------  ------------  ----------  ---------            -------  -----------  -------------  ---------------  ------
1.13.0           cluster-edb-project1-b-1  0/32000000                             7055094441461866515  ✓        ✗            ✗              ✗                OK
1.13.0           cluster-edb-project1-b-2               0/32000000    0/32000000  7055094441461866515  ✗        ✓            ✗              ✗                OK
1.13.0           cluster-edb-project1-b-3               0/32000000    0/32000000  7055094441461866515  ✗        ✓            ✗              ✗                OK
% oc get pod -o wide
NAME                                                            READY   STATUS    RESTARTS   AGE   IP             NODE           NOMINATED NODE   READINESS GATES
cluster-edb-project1-b-1                                        1/1     Running   0          8d    172.17.18.95   10.244.128.7   <none>           <none>
cluster-edb-project1-b-2                                        1/1     Running   0          8d    172.17.13.12   10.244.0.68    <none>           <none>
cluster-edb-project1-b-3                                        1/1     Running   0          12d   172.17.9.92    10.244.64.4    <none>           <none>
% oc get pvc
NAME                              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                     AGE
cluster-edb-project1-b-1          Bound    pvc-15372474-603c-48ff-b8e3-2714843a1fbf   10Gi       RWO            ibmc-vpc-block-general-purpose   41d
cluster-edb-project1-b-2          Bound    pvc-d7dfc673-de07-4ee1-b5c8-64ce87291b45   10Gi       RWO            ibmc-vpc-block-general-purpose   41d
cluster-edb-project1-b-3          Bound    pvc-691fc284-dc9b-4f4f-b115-699854d5802d   10Gi       RWO            ibmc-vpc-block-general-purpose   41d

Using the IBM Cloud CLI, find the attachment and volume ID that match the PV volume name found above, then forcibly remove the PV's attachment with the ibmcloud ks storage attachment rm command.

% ibmcloud ks storage attachment ls --cluster roks-edb --worker kube-c7jvl93t0c8s010po840-roksedb-default-0000024c
ボリューム接続をリスト中...
OK
ID                                          名前                                   状況       タイプ   ボリューム ID                               ボリューム名                               ワーカー ID
02g7-403e990d-8e13-4869-868d-50ab75654094   lisp-affix-copurify-dill               attached   data     r022-900cdee8-aaa1-4c4c-82be-e55d8a2235c2   pvc-15372474-603c-48ff-b8e3-2714843a1fbf   kube-c7jvl93t0c8s010po840-roksedb-default-0000024c
02g7-22ddf45c-cec3-4485-928d-a05365bb71eb   distincted-lively-kilometer-monotype   attached   data     r022-80609886-e7ed-4ad2-bd86-f0970d852c2d   pvc-bf632600-f297-465f-b51c-2f1730b76a04   kube-c7jvl93t0c8s010po840-roksedb-default-0000024c
02g7-9e903ea3-0036-4721-abfe-8062a1ae4a91   cordovan-volley-coexist-bless          attached   boot     r022-a08ff858-3143-4fb2-8cf9-cf8004e50757   curfew-huffy-sweep-amply                   kube-c7jvl93t0c8s010po840-roksedb-default-0000024c

% ibmcloud ks storage attachment rm --cluster roks-edb --attachment 02g7-403e990d-8e13-4869-868d-50ab75654094 --worker kube-c7jvl93t0c8s010po840-roksedb-default-0000024c
ボリューム接続を削除中...
OK

On a PV failure on the Primary, CNP failed over to a Replica (in about 20 seconds) and service continued.

true20220302222419
true20220302222420
true20220302222421
true20220302222423
true20220302222424
true20220302222425

<H1>Error Page Exception</H1>
<H4>SRVE0260E: The server cannot use the error page specified for your application to handle the Original Exception printed below.</H4>
<BR><H3>Original Exception: </H3>
<B>Error Message: </B>org.springframework.web.util.NestedServletException: Request processing failed&#59; nested exception is org.springframework.dao.DataAccessResourceFailureException:

(snip)

<BR>&nbsp;&nbsp;&nbsp;&nbsp;
<BR>
20220302222440
true20220302222446                           <- failover complete, application access succeeds (about 20 seconds)

According to the Primary Pod log, errors were emitted after about 2 seconds and the pod went down a few seconds later.

{"level":"info","ts":1646227467.2363865,"logger":"postgres","msg":"record","logging_pod":"cluster-edb-project1-b-1","record":{"log_time":"2022-03-02 13:24:27.236 UTC","user_name":"app","database_name":"app","process_id":"563","connection_from":"172.17.9.88:51772","session_id":"621f6fef.233","session_line_num":"1","command_tag":"UPDATE","session_start_time":"2022-03-02 13:23:59 UTC","virtual_transaction_id":"5/138","transaction_id":"1020","error_severity":"PANIC","sql_state_code":"58030","message":"could not fdatasync file \"0000000F000000000000003A\": Input/output error","application_name":"PostgreSQL JDBC Driver","backend_type":"client backend","query_id":"0"}}
{"level":"info","ts":1646227467.239056,"logger":"postgres","msg":"record","logging_pod":"cluster-edb-project1-b-1","record":{"log_time":"2022-03-02 13:24:27.238 UTC","process_id":"21","session_id":"621f6cfa.15","session_line_num":"11","session_start_time":"2022-03-02 13:11:22 UTC","transaction_id":"0","error_severity":"LOG","sql_state_code":"00000","message":"server process (PID 563) was terminated by signal 6: Aborted","detail":"Failed process was running: Update SampleTBL Set name = $1 WHERE id = $2","backend_type":"postmaster","query_id":"0"}}
{"level":"info","ts":1646227467.2393942,"logger":"postgres","msg":"record","logging_pod":"cluster-edb-project1-b-1","record":{"log_time":"2022-03-02 13:24:27.238 UTC","process_id":"21","session_id":"621f6cfa.15","session_line_num":"12","session_start_time":"2022-03-02 13:11:22 UTC","transaction_id":"0","error_severity":"LOG","sql_state_code":"00000","message":"terminating any other active server processes","backend_type":"postmaster","query_id":"0"}}
{"level":"info","ts":1646227467.2471857,"logger":"postgres","msg":"record","logging_pod":"cluster-edb-project1-b-1","record":{"log_time":"2022-03-02 13:24:27.246 UTC","process_id":"66","session_id":"621f6cfe.42","session_line_num":"1","session_start_time":"2022-03-02 13:11:26 UTC","transaction_id":"0","error_severity":"LOG","sql_state_code":"58P01","message":"could not open temporary statistics file \"pg_stat/global.tmp\": No such file or directory","backend_type":"stats collector","query_id":"0"}}
{"level":"info","ts":1646227467.2482495,"logger":"postgres","msg":"record","logging_pod":"cluster-edb-project1-b-1","record":{"log_time":"2022-03-02 13:24:27.248 UTC","process_id":"21","session_id":"621f6cfa.15","session_line_num":"13","session_start_time":"2022-03-02 13:11:22 UTC","transaction_id":"0","error_severity":"LOG","sql_state_code":"00000","message":"shutting down because restart_after_crash is off","backend_type":"postmaster","query_id":"0"}}
{"level":"info","ts":1646227467.2553608,"logger":"postgres","msg":"record","logging_pod":"cluster-edb-project1-b-1","record":{"log_time":"2022-03-02 13:24:27.255 UTC","process_id":"21","session_id":"621f6cfa.15","session_line_num":"14","session_start_time":"2022-03-02 13:11:22 UTC","transaction_id":"0","error_severity":"LOG","sql_state_code":"00000","message":"database system is shut down","backend_type":"postmaster","query_id":"0"}}
{"level":"error","ts":1646227467.2579408,"msg":"PostgreSQL process exited with errors","logging_pod":"cluster-edb-project1-b-1","error":"exit status 1","stacktrace":"github.com/EnterpriseDB/cloud-native-postgresql/pkg/management/log.Error\n\tpkg/management/log/log.go:155\ngithub.com/EnterpriseDB/cloud-native-postgresql/internal/cmd/manager/instance/run.runSubCommand\n\tinternal/cmd/manager/instance/run/cmd.go:175\ngithub.com/EnterpriseDB/cloud-native-postgresql/internal/cmd/manager/instance/run.NewCmd.func1\n\tinternal/cmd/manager/instance/run/cmd.go:49\ngithub.com/spf13/cobra.(*Command).execute\n\tpkg/mod/github.com/spf13/cobra@v1.3.0/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tpkg/mod/github.com/spf13/cobra@v1.3.0/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\tpkg/mod/github.com/spf13/cobra@v1.3.0/command.go:902\nmain.main\n\tcmd/manager/main.go:51\nruntime.main\n\t/opt/hostedtoolcache/go/1.17.7/x64/src/runtime/proc.go:255"}
{"level":"info","ts":1646227467.263468,"logger":"pg_controldata","msg":"pg_controldata: fatal: could not open file \"/var/lib/postgresql/data/pgdata/global/pg_control\" for reading: No such file or directory\n","pipe":"stderr","logging_pod":"cluster-edb-project1-b-1"}
{"level":"error","ts":1646227467.2634845,"msg":"Error printing the control information of this PostgreSQL instance","logging_pod":"cluster-edb-project1-b-1","error":"exit status 1","stacktrace":"github.com/EnterpriseDB/cloud-native-postgresql/pkg/management/log.Error\n\tpkg/management/log/log.go:155\ngithub.com/EnterpriseDB/cloud-native-postgresql/pkg/management/postgres.(*Instance).LogPgControldata\n\tpkg/management/postgres/instance.go:671\ngithub.com/EnterpriseDB/cloud-native-postgresql/internal/cmd/manager/instance/run.runSubCommand\n\tinternal/cmd/manager/instance/run/cmd.go:179\ngithub.com/EnterpriseDB/cloud-native-postgresql/internal/cmd/manager/instance/run.NewCmd.func1\n\tinternal/cmd/manager/instance/run/cmd.go:49\ngithub.com/spf13/cobra.(*Command).execute\n\tpkg/mod/github.com/spf13/cobra@v1.3.0/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tpkg/mod/github.com/spf13/cobra@v1.3.0/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\tpkg/mod/github.com/spf13/cobra@v1.3.0/command.go:902\nmain.main\n\tcmd/manager/main.go:51\nruntime.main\n\t/opt/hostedtoolcache/go/1.17.7/x64/src/runtime/proc.go:255"}
Error: exit status 1

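I also confirmed that the Probe can detect the storage failure even with no application access (no DB writes from the application).

5.6 Operator failure

Finally, I tested an Operator failure. By default the Operator runs as a single Pod; when it is in the same AZ as the Primary Pod, failover did not happen promptly on a failure of that AZ. (After about 8 minutes OpenShift detected that the node was unreachable and recreated the Operator Pod in another zone, after which failover completed and application access resumed.)

The Operator itself therefore needs to be made redundant.

Redundancy of the CNP Operator is supported through leader election.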

If you require high availability at the operator level, it is possible to specify multiple replicas in the Deployment configuration - given that the operator supports leader election. Also, you can take advantage of taints and tolerations to make sure that the operator does not run on the same nodes where the actual PostgreSQL clusters are running (this might even include the control plane for self-managed Kubernetes installations).

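Reference: https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/installation_upgrade/#details-about-the-deployment

To try Operator redundancy, I edited the Operator's Deployment directly, but the change was reverted immediately. This way of changing it does not work.

The Operator's CSV (ClusterServiceVersion) definition contains the replica count, and changing it there made the Operator redundant.

With the Operator redundant, I tested a failure of the AZ hosting both the Operator (leader) and the CNP Primary Pod.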
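
The article does not record how the CSV was edited. One possible way, sketched below under that caveat, is a JSON Patch against the replica count of the Operator's Deployment as embedded in the CSV. The path follows the usual OLM ClusterServiceVersion layout (spec.install.spec.deployments[].spec.replicas). The function name, the deployment index 0 and the replica count 2 are assumptions.

```python
import json


def csv_replicas_patch(replicas, deployment_index=0):
    """Build a JSON Patch that sets the replica count of one Deployment in a CSV."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return [
        {
            "op": "replace",
            # Deployment specs embedded in an OLM ClusterServiceVersion live here.
            "path": f"/spec/install/spec/deployments/{deployment_index}/spec/replicas",
            "value": replicas,
        }
    ]


if __name__ == "__main__":
    # Two Operator Pods, so that one can take over through leader election.
    print(json.dumps(csv_replicas_patch(2)))
```

The printed patch could then be passed to something like `oc patch csv <csv name> -n <namespace> --type=json -p '<patch>'`, where the CSV name and namespace are placeholders to be looked up with `oc get csv`.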

Instances status
Manager Version  Pod name         Current LSN  Received LSN  Replay LSN  System ID            Primary  Replicating  Replay paused  Pending restart  Status
---------------  --------         -----------  ------------  ----------  ---------            -------  -----------  -------------  ---------------  ------
1.15.0           cluster-edb-1-1  0/56001350                             7067095012841758740  ✓        ✗            ✗              ✗                OK
1.15.0           cluster-edb-1-2               0/56001350    0/56001350  7067095012841758740  ✗        ✓            ✗              ✗                OK
1.15.0           cluster-edb-1-3               0/56001350    0/56001350  7067095012841758740  ✗        ✓            ✗              ✗                OK

% oc get po -o wide
NAME                                                             READY   STATUS      RESTARTS        AGE     IP             NODE                           NOMINATED NODE   READINESS GATES
cluster-edb-1-1                                                  1/1     Running     1 (16m ago)     85m     10.129.5.60    ip-10-0-167-76.ec2.internal    <none>           <none>
cluster-edb-1-2                                                  1/1     Running     1 (39s ago)     18m     10.128.2.59    ip-10-0-129-231.ec2.internal   <none>           <none>
cluster-edb-1-3                                                  1/1     Running     0               33d     10.131.1.165   ip-10-0-200-55.ec2.internal    <none>           <none>
postgresql-operator-controller-manager-1-15-0-7b86778c55-jczt5   1/1     Running     3 (3m58s ago)   41m     10.128.2.57    ip-10-0-129-231.ec2.internal   <none>           <none>
postgresql-operator-controller-manager-1-15-0-7b86778c55-tlbcx   1/1     Running     0               38m     10.129.5.89    ip-10-0-167-76.ec2.internal    <none>           <none>

Check application access before and after blocking with the ACL.

20220526002638
200
20220526002639
200
20220526002641
200
20220526002642  <- network cut
504
20220526002714
400
20220526002725
200
20220526002727  <- restored after 45 seconds
200
20220526002728

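After the Operator switched over, CNP failed over, and application access recovered in about 45 seconds.

6 Summary

Summary of the failure test cases

No	Failure point	Switchover time
1	Pod failure	About 1 second
2	Node failure	About 20 seconds
3	AZ/network failure (Replica)	No impact on the application
4	AZ/network failure (Primary)	About 1 minute 40 seconds
5	Storage failure	About 20 seconds
6	Operator (single) + Primary Pod failure	About 8 minutes
6'	Operator (redundant) + Primary Pod failure	About 45 seconds

The containerized EDB PostgreSQL comes with an Operator, and installing and configuring it took only a few clicks. It can keep multiple Replicas in sync mode, and switchover, which relies on Kubernetes Services, is very fast even without tuning. Compared with a certain commercial RDB solution for which a multi-AZ high-availability configuration on the cloud is difficult, I felt this product is promising for enterprise systems with high-availability requirements.

References: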
https://www.enterprisedb.com/docs/kubernetes/cloud_native_postgresql/
https://edbpostgres.sios.jp/cnp-webinar/
https://catalog.redhat.com/software/operators/detail/5fb41c88abd2a6f7dbe1b37b
