
Exposing Pods with high availability on-premises, part 1 (NodePort + HAProxy + Pacemaker)

Last updated at 2018-07-30, posted at 2018-07-28

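Introduction

When you build Kubernetes on-premises, unlike on a public cloud where much is prepared for you in advance, there are many points you have to consider yourself. As far as I can think of at this stage, they include the following.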

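Exposing Pods with high availability
Pod network performance (I want to avoid tunneling as much as possible)
Storage (I want Dynamic Provisioning)
High availability of the Kubernetes platform itself
Monitoring
Local registry
CI/CD
YAML management
Log management
Version upgrades with zero service downtime
Backup and restore of Kubernetes itself (etcd)
Adding, removing, and replacing Kubernetes Masters/Nodes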

Of these points, this article investigates how to expose Pods with high availability.
There are several options for providing highly available Pods on-premises.

NodePort + LB(HAProxy)
MetalLB
kube-router
kube-keepalived-vip
Writing a Service LoadBalancer plugin in Go

This time I verify the first option that comes to mind, the straightforward and easy one: NodePort + LB (HAProxy).

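Architecture

I will build the following configuration. The external LB (HAProxy) is built and managed separately from the Kubernetes cluster.
In particular, there is no Kubernetes integration such as a Cloud Provider.

HAProxy is used as the LB (ignoring for now that it cannot handle UDP traffic)
Pacemaker provides high availability (Active - Standby) for HAProxy
HAProxy distributes traffic across the Kubernetes Nodes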

Creating the virtual machines for HAProxy

Assuming the Kubernetes cluster itself already exists, create HAProxy on virtual machines.

HAProxy1

virt-install \
 --connect qemu:///system \
 --name haproxy1 \
 --memory 4096 \
 --disk size=100,bus=virtio \
 --vcpus 2 \
 --graphics vnc,listen=0.0.0.0,keymap=ja \
 --cdrom=/var/combase_iso/CentOS-7-x86_64-DVD-1804.iso \
 --network virtualport_type=openvswitch,source=mgmt,source_mode=bridge,model=virtio  \
 --clock offset=utc \
 --dry-run

HAProxy2

virt-install \
 --connect qemu:///system \
 --name haproxy2 \
 --memory 4096 \
 --disk size=100,bus=virtio \
 --vcpus 2 \
 --graphics vnc,listen=0.0.0.0,keymap=ja \
 --cdrom=/var/combase_iso/CentOS-7-x86_64-DVD-1804.iso \
 --network virtualport_type=openvswitch,source=mgmt,source_mode=bridge,model=virtio  \
 --clock offset=utc \
 --dry-run

HAProxy Install

The HAProxy package in the official CentOS repository is an old 1.5 release, so install the stable 1.8.12 release from source.
Install the prerequisite packages.

yum install gcc pcre-static pcre-devel -y

Download the tar.gz with wget.

mkdir /root/haproxy
wget http://www.haproxy.org/download/1.8/src/haproxy-1.8.12.tar.gz -O /root/haproxy/haproxy.tar.gz

Extract it.

cd /root/haproxy/
tar xfvz haproxy.tar.gz

Move into the source directory.

cd /root/haproxy/haproxy-1.8.12

Compile.

make TARGET=linux2628

Install.

make install

Example run

[root@haproxy1 haproxy-1.8.12]# make install
install -d "/usr/local/sbin"
install haproxy  "/usr/local/sbin"
install -d "/usr/local/share/man"/man1
install -m 644 doc/haproxy.1 "/usr/local/share/man"/man1
install -d "/usr/local/doc/haproxy"
for x in configuration management architecture peers-v2.0 cookie-options lua WURFL-device-detection proxy-protocol linux-syn-cookies network-namespaces DeviceAtlas-device-detection 51Degrees-device-detection netscaler-client-ip-insertion-protocol peers close-options SPOE intro; do \
        install -m 644 doc/$x.txt "/usr/local/doc/haproxy" ; \
done
[root@haproxy1 haproxy-1.8.12]#

Create the required directories and files.

mkdir -p /etc/haproxy
mkdir -p /var/lib/haproxy 
touch /var/lib/haproxy/stats

Create a symbolic link so that normal users can run the haproxy command.

ln -s /usr/local/sbin/haproxy /usr/sbin/haproxy

To put HAProxy under systemd management, place the init script in /etc/init.d.

cp /root/haproxy/haproxy-1.8.12/examples/haproxy.init /etc/init.d/haproxy
chmod 755 /etc/init.d/haproxy
systemctl daemon-reload

Enable it in systemd.

systemctl enable haproxy

Add the haproxy user.

sudo useradd -r haproxy

Check the HAProxy version.

[root@haproxy1 ~]# haproxy -v
HA-Proxy version 1.8.12-8a200c7 2018/06/27
Copyright 2000-2018 Willy Tarreau <willy@haproxy.org>

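Creating the monitoring Pod that HAProxy uses for checks

When HAProxy load-balances to the Kubernetes cluster, it has to confirm that the backend servers are running.
That requires a way to confirm that each Node itself is working properly.
I create Pods with a DaemonSet and treat a Node as healthy when a TCP connection to its Pod succeeds.

Because I want to expose the Pod on a port other than 80, I built an image based on Nginx and published it on DockerHub.
See the following for details of how it was built.
https://qiita.com/sugimount/items/0d1e35c271939d6f66f1

Prepare the following manifest file.
hostNetwork is enabled because the HAProxy servers need to reach the Pod directly.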

mkdir /root/manifests/
cat <<'EOF' > /root/manifests/node-healthcheck.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: node-healthcheck
  labels:
    nsname: node-healthcheck

---

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-healthcheck
  namespace: node-healthcheck
  labels:
    app: node-healthcheck
spec:
  selector:
    matchLabels:
      app: node-healthcheck
  template:
    metadata:
      labels:
        app: node-healthcheck
    spec:
      hostNetwork: true
      containers:
      - name: node-healthcheck
        image: sugimount/node-healthcheck:0.0.1
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
        ports:
        - containerPort: 29499
          hostPort: 29499
          protocol: TCP
      restartPolicy: Always
EOF
kubectl apply -f /root/manifests/node-healthcheck.yaml

Confirm that the Pods were created.

root@ntw-k8s-master01(node-healthcheck kubernetes-admin):~# kubectl get pods -o wide
NAME                     READY     STATUS    RESTARTS   AGE       IP             NODE
node-healthcheck-99rs6   1/1       Running   0          3m        10.44.194.62   ntw-k8s-nodevm01
node-healthcheck-f4kgq   1/1       Running   0          3m        10.44.194.85   ntw-k8s-nodegpu01

From the HAProxy server, confirm with curl that the Welcome page is returned.

curl http://10.44.194.62:29499/
curl http://10.44.194.85:29499/
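HAProxy's health check only needs the TCP connection to succeed, so the same condition can be tested without fetching the page. The following is a minimal sketch, assuming bash and coreutils `timeout` are available on the HAProxy server; the function names `check_tcp` and `report_nodes` are made up for this example.

```shell
# check_tcp HOST PORT
# Return 0 when a TCP connection to HOST:PORT succeeds within 1 second.
# bash's /dev/tcp redirection is used, so curl is not required.
check_tcp() {
    timeout 1 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# report_nodes PORT HOST...
# Print "<host>: up" or "<host>: down" for every host given.
report_nodes() {
    port=$1
    shift
    for host in "$@"; do
        if check_tcp "$host" "$port"; then
            echo "$host: up"
        else
            echo "$host: down"
        fi
    done
}

# Usage with the addresses from this article:
#   report_nodes 29499 10.44.194.62 10.44.194.85
```

A Node that is powered off, or whose node-healthcheck Pod is not running, should be reported as down here, which is the same signal HAProxy acts on.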

HAProxy Settings

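Edit the configuration file.

"frontend k8s_nodeport_front" defines the frontend that listens on the NodePort range.
"backend k8s_nodeport_backend" lists the Kubernetes Nodes that serve the NodePorts. When the port number is omitted on a server line, HAProxy connects to the same port that the client used on the frontend, and this configuration relies on that behavior.
For HAProxy to judge whether a Node is healthy, the check parameter is specified together with the node-healthcheck port.
When a Node recovers, I want HAProxy to be slow to mark it healthy again, so I set "rise 15" (the Node is considered healthy after 15 consecutive successful checks). With rise set to 5 there was about 5 seconds of connection loss. (HAProxy's documented default for rise is 2.)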
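As a rough illustration of why a larger rise delays recovery: HAProxy needs rise consecutive successful checks, one per inter, before it marks a server up again. The sketch below does that arithmetic for the values used in this article's configuration (inter 5s, rise 15, fall 1). It is an approximation that ignores check timeouts and options such as fastinter, and the function name `check_delay_seconds` is made up for this example.

```shell
# check_delay_seconds COUNT INTERVAL_SECONDS
# Approximate time before HAProxy changes a server's state:
# COUNT consecutive check results, one every INTERVAL_SECONDS.
check_delay_seconds() {
    echo $(( $1 * $2 ))
}

echo "mark a node up   (rise 15, inter 5s): about $(check_delay_seconds 15 5)s"
echo "mark a node down (fall 1,  inter 5s): up to about $(check_delay_seconds 1 5)s"
```

With these values a recovered Node receives traffic again after roughly 75 seconds, while a failed Node is removed after a single failed check.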

cat <<'EOF' > /etc/haproxy/haproxy.cfg
# ---------------------------------------------------------------------
# Example configuration for a possible web application.  See the
# full configuration options online.
#
#   http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
# ---------------------------------------------------------------------

# ---------------------------------------------------------------------
# Global settings
# ---------------------------------------------------------------------
global
    # to have these messages end up in /var/log/haproxy.log you will
    # need to:
    #
    # 1) configure syslog to accept network log events.  This is done
    #    by adding the '-r' option to the SYSLOGD_OPTIONS in
    #    /etc/sysconfig/syslog
    #
    # 2) configure local2 events to go to the /var/log/haproxy.log
    #   file. A line like the following can be added to
    #   /etc/sysconfig/syslog
    #
    #    local2.*                       /var/log/haproxy.log
    #
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

# ---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
# ---------------------------------------------------------------------
defaults
    mode                    tcp
    log                     global
    # option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    5s
    timeout queue           5s
    timeout connect         5s
    timeout client          5s
    timeout server          5s
    timeout http-keep-alive 5s
    timeout check           5s
    maxconn                 3000

# ---------------------------------------------------------------------
# main frontend which proxys to the backends
# ---------------------------------------------------------------------
frontend k8s_nodeport_front
    bind *:30000-32767
    default_backend k8s_nodeport_backend

# ---------------------------------------------------------------------
# backend: Kubernetes Nodes reached through their NodePorts
# ---------------------------------------------------------------------
backend k8s_nodeport_backend
    balance     roundrobin
    server      node01 10.44.194.62 check port 29499 inter 5s rise 15 fall 1
    server      node02 10.44.194.85 check port 29499 inter 5s rise 15 fall 1
EOF
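When the number of Nodes grows, the server lines of the backend can be generated instead of typed by hand. This is only a sketch: `render_servers` is a hypothetical helper, and the check parameters are simply the ones used in the configuration above. The port is left off after each address on purpose, so that HAProxy reuses the port the client connected to.

```shell
# render_servers CHECK_PORT IP...
# Print one HAProxy "server" line per Node IP, numbered node01, node02, ...
render_servers() {
    check_port=$1
    shift
    i=1
    for ip in "$@"; do
        printf '    server      node%02d %s check port %s inter 5s rise 15 fall 1\n' \
            "$i" "$ip" "$check_port"
        i=$((i + 1))
    done
}

render_servers 29499 10.44.194.62 10.44.194.85
```

After editing the file, `haproxy -c -f /etc/haproxy/haproxy.cfg` checks the configuration without starting the proxy, which is worth running before a restart.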

Restart

systemctl restart haproxy

rsyslog settings

HAProxy does not seem to have a way to write logs directly to a file, so the log file is written through rsyslog. Configure rsyslog to accept UDP.

cat <<'EOF' > /etc/rsyslog.conf
# rsyslog configuration file

# For more information see /usr/share/doc/rsyslog-*/rsyslog_conf.html
# If you experience problems, see http://www.rsyslog.com/doc/troubleshoot.html

#### MODULES ####

# The imjournal module bellow is now used as a message source instead of imuxsock.
$ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
$ModLoad imjournal # provides access to the systemd journal
# $ModLoad imklog # reads kernel messages (the same are read from journald)
# $ModLoad immark  # provides --MARK-- message capability

# Provides UDP syslog reception
# $ModLoad imudp
$ModLoad imudp
# $UDPServerRun 514
$UDPServerRun 514

# Provides TCP syslog reception
# $ModLoad imtcp
# $InputTCPServerRun 514


#### GLOBAL DIRECTIVES ####

# Where to place auxiliary files
$WorkDirectory /var/lib/rsyslog

# Use default timestamp format
$ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat

# File syncing capability is disabled by default. This feature is usually not required,
# not useful and an extreme performance hit
# $ActionFileEnableSync on

# Include all config files in /etc/rsyslog.d/
$IncludeConfig /etc/rsyslog.d/*.conf

# Turn off message reception via local log socket;
# local messages are retrieved through imjournal now.
$OmitLocalLogging on

# File to store the position in the journal
$IMJournalStateFile imjournal.state


#### RULES ####

# Log all kernel messages to the console.
# Logging much else clutters up the screen.
# kern.*                                                 /dev/console

# Log anything (except mail) of level info or higher.
# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none                /var/log/messages

# The authpriv file has restricted access.
authpriv.*                                              /var/log/secure

# Log all the mail messages in one place.
mail.*                                                  -/var/log/maillog


# Log cron stuff
cron.*                                                  /var/log/cron

# Everybody gets emergency messages
*.emerg                                                 :omusrmsg:*

# Save news errors of level crit and higher in a special file.
uucp,news.crit                                          /var/log/spooler

# Save boot messages also to boot.log
local7.*                                                /var/log/boot.log


# ### begin forwarding rule ###
# The statement between the begin ... end define a SINGLE forwarding
# rule. They belong together, do NOT split them. If you create multiple
# forwarding rules, duplicate the whole block!
# Remote Logging (we use TCP for reliable delivery)
#
# An on-disk queue is created for this action. If the remote host is
# down, messages are spooled to disk and sent when it is up again.
# $ActionQueueFileName fwdRule1 # unique name prefix for spool files
# $ActionQueueMaxDiskSpace 1g   # 1gb space limit (use as much as possible)
# $ActionQueueSaveOnShutdown on # save messages to disk on shutdown
# $ActionQueueType LinkedList   # run asynchronously
# $ActionResumeRetryCount -1    # infinite retries if host is down
# remote host is: name/ip:port, e.g. 192.168.0.1:514, port optional
# *.* @@remote-host:514
# ### end of the forwarding rule ###
EOF

Add the rule that writes the HAProxy logs to a file.

cat <<'EOF' > /etc/rsyslog.d/haproxy.conf
local2.info                       /var/log/haproxy.log
EOF

Restart rsyslog.

systemctl restart rsyslog

logrotate settings

Configure logrotate to keep 10 days of logs.

cat <<'EOF' > /etc/logrotate.d/haproxy
/var/log/haproxy.log {
    daily
    rotate 10
    missingok
    notifempty
    compress
    sharedscripts
    postrotate
        /bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true
        /bin/kill -HUP `cat /var/run/rsyslogd.pid 2> /dev/null` 2> /dev/null || true
    endscript
}
EOF

Pacemaker Install

Pacemaker is used this time to make HAProxy redundant.

yum install -y corosync pacemaker pcs

Notes on the dependencies that were installed

==========================================================================================================================================================
 Package                                              Arch                    Version                                      Repository                Size
==========================================================================================================================================================
Installing:
 corosync                                             x86_64                  2.4.3-2.el7_5.1                              updates                  220 k
 pacemaker                                            x86_64                  1.1.18-11.el7_5.3                            updates                  456 k
 pcs                                                  x86_64                  0.9.162-5.el7.centos.1                       updates                  5.0 M
Installing for dependencies:
 avahi-libs                                           x86_64                  0.6.31-19.el7                                base                      61 k
 bc                                                   x86_64                  1.06.95-13.el7                               base                     115 k
 cifs-utils                                           x86_64                  6.2-10.el7                                   base                      85 k
 clufter-bin                                          x86_64                  0.77.0-2.el7                                 base                      25 k
 clufter-common                                       noarch                  0.77.0-2.el7                                 base                      72 k
 corosynclib                                          x86_64                  2.4.3-2.el7_5.1                              updates                  132 k
 cups-libs                                            x86_64                  1:1.6.3-35.el7                               base                     357 k
 fontpackages-filesystem                              noarch                  1.44-8.el7                                   base                     9.9 k
 gnutls                                               x86_64                  3.3.26-9.el7                                 base                     677 k
 gssproxy                                             x86_64                  0.7.0-17.el7                                 base                     108 k
 keyutils                                             x86_64                  1.5.8-3.el7                                  base                      54 k
 libbasicobjects                                      x86_64                  0.1.1-29.el7                                 base                      25 k
 libcgroup                                            x86_64                  0.41-15.el7                                  base                      65 k
 libcollection                                        x86_64                  0.7.0-29.el7                                 base                      41 k
 liberation-fonts-common                              noarch                  1:1.07.2-16.el7                              base                      27 k
 liberation-sans-fonts                                noarch                  1:1.07.2-16.el7                              base                     279 k
 libevent                                             x86_64                  2.0.21-4.el7                                 base                     214 k
 libini_config                                        x86_64                  1.3.1-29.el7                                 base                      63 k
 libldb                                               x86_64                  1.2.2-1.el7                                  base                     131 k
 libnfsidmap                                          x86_64                  0.25-19.el7                                  base                      50 k
 libpath_utils                                        x86_64                  0.2.1-29.el7                                 base                      28 k
 libqb                                                x86_64                  1.0.1-6.el7                                  base                      95 k
 libref_array                                         x86_64                  0.1.5-29.el7                                 base                      26 k
 libtalloc                                            x86_64                  2.1.10-1.el7                                 base                      33 k
 libtdb                                               x86_64                  1.3.15-1.el7                                 base                      48 k
 libtevent                                            x86_64                  0.9.33-2.el7                                 base                      37 k
 libtirpc                                             x86_64                  0.2.4-0.10.el7                               base                      88 k
 libverto-libevent                                    x86_64                  0.2.5-4.el7                                  base                     8.9 k
 libwbclient                                          x86_64                  4.7.1-6.el7                                  base                     107 k
 libxslt                                              x86_64                  1.1.28-5.el7                                 base                     242 k
 libyaml                                              x86_64                  0.1.4-11.el7_0                               base                      55 k
 net-snmp-libs                                        x86_64                  1:5.7.2-33.el7_5.2                           updates                  749 k
 net-tools                                            x86_64                  2.0-0.22.20131004git.el7                     base                     305 k
 nettle                                               x86_64                  2.7.1-8.el7                                  base                     327 k
 nfs-utils                                            x86_64                  1:1.3.0-0.54.el7                             base                     407 k
 overpass-fonts                                       noarch                  2.1-1.el7                                    base                     700 k
 pacemaker-cli                                        x86_64                  1.1.18-11.el7_5.3                            updates                  348 k
 pacemaker-cluster-libs                               x86_64                  1.1.18-11.el7_5.3                            updates                  152 k
 pacemaker-libs                                       x86_64                  1.1.18-11.el7_5.3                            updates                  620 k
 perl-TimeDate                                        noarch                  1:2.30-2.el7                                 base                      52 k
 python-backports                                     x86_64                  1.0-8.el7                                    base                     5.8 k
 python-backports-ssl_match_hostname                  noarch                  3.5.0.1-1.el7                                base                      13 k
 python-clufter                                       noarch                  0.77.0-2.el7                                 base                     320 k
 python-ipaddress                                     noarch                  1.0.16-2.el7                                 base                      34 k
 python-lxml                                          x86_64                  3.2.1-4.el7                                  base                     758 k
 python-setuptools                                    noarch                  0.9.8-7.el7                                  base                     397 k
 quota                                                x86_64                  1:4.01-17.el7                                base                     179 k
 quota-nls                                            noarch                  1:4.01-17.el7                                base                      90 k
 resource-agents                                      x86_64                  3.9.5-124.el7                                base                     398 k
 rpcbind                                              x86_64                  0.2.0-44.el7                                 base                      59 k
 ruby                                                 x86_64                  2.0.0.648-33.el7_4                           base                      71 k
 ruby-irb                                             noarch                  2.0.0.648-33.el7_4                           base                      92 k
 ruby-libs                                            x86_64                  2.0.0.648-33.el7_4                           base                     2.8 M
 rubygem-bigdecimal                                   x86_64                  1.2.0-33.el7_4                               base                      83 k
 rubygem-io-console                                   x86_64                  0.4.2-33.el7_4                               base                      54 k
 rubygem-json                                         x86_64                  1.7.7-33.el7_4                               base                      79 k
 rubygem-psych                                        x86_64                  2.0.0-33.el7_4                               base                      82 k
 rubygem-rdoc                                         noarch                  4.0.0-33.el7_4                               base                     322 k
 rubygems                                             noarch                  2.0.14.1-33.el7_4                            base                     219 k
 samba-client-libs                                    x86_64                  4.7.1-6.el7                                  base                     4.8 M
 samba-common                                         noarch                  4.7.1-6.el7                                  base                     205 k
 samba-common-libs                                    x86_64                  4.7.1-6.el7                                  base                     162 k
 tcp_wrappers                                         x86_64                  7.6-77.el7                                   base                      78 k
 trousers                                             x86_64                  0.3.14-2.el7                                 base                     289 k

Transaction Summary
==========================================================================================================================================================

Copy the bundled udpu example to corosync.conf as a starting point

cp -a /etc/corosync/corosync.conf{.example.udpu,}

Configure corosync.

cat <<'EOF' > /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
totem {
        version: 2

        # crypto_cipher and crypto_hash: Used for mutual node authentication.
        # If you choose to enable this, then do remember to create a shared
        # secret with "corosync-keygen".
        # enabling crypto_cipher, requires also enabling of crypto_hash.
        crypto_cipher: none
        crypto_hash: none

        # interface: define at least one interface to communicate
        # over. If you define more than one interface stanza, you must
        # also set rrp_mode.
        interface {
                # Rings must be consecutively numbered, starting at 0.
                ringnumber: 0
                # This is normally the *network* address of the
                # interface to bind to. This ensures that you can use
                # identical instances of this configuration file
                # across all your cluster nodes, without having to
                # modify this option.
                bindnetaddr: 10.44.194.0
                # However, if you have multiple physical network
                # interfaces configured for the same subnet, then the
                # network address alone is not sufficient to identify
                # the interface Corosync should bind to. In that case,
                # configure the *host* address of the interface
                # instead:
                # bindnetaddr: 192.168.1.0
                # When selecting a multicast address, consider RFC
                # 2365 (which, among other things, specifies that
                # 239.255.x.x addresses are left to the discretion of
                # the network administrator). Do not reuse multicast
                # addresses across multiple Corosync clusters sharing
                # the same network.
                #mcastaddr: 239.255.1.1
                # Corosync uses the port you specify here for UDP
                # messaging, and also the immediately preceding
                # port. Thus if you set this to 5405, Corosync sends
                # messages over UDP ports 5405 and 5404.
                mcastport: 5405
                # Time-to-live for cluster communication packets. The
                # number of hops (routers) that this ring will allow
                # itself to pass. Note that multicast routing must be
                # specifically enabled on most network routers.
                ttl: 1
        }
        transport: udpu
}

logging {
        # Log the source file and line where messages are being
        # generated. When in doubt, leave off. Potentially useful for
        # debugging.
        fileline: off
        # Log to standard error. When in doubt, set to no. Useful when
        # running in the foreground (when invoking "corosync -f")
        to_stderr: no
        # Log to a log file. When set to "no", the "logfile" option
        # must not be set.
        to_logfile: yes
        logfile: /var/log/cluster/corosync.log
        # Log to the system log daemon. When in doubt, set to yes.
        to_syslog: no
        # Log debug messages (very verbose). When in doubt, leave off.
        debug: off
        # Log messages with time stamps. When in doubt, set to on
        # (unless you are only logging to syslog, where double
        # timestamps can be annoying).
        timestamp: on
        logger_subsys {
                subsys: QUORUM
                debug: off
        }
}

nodelist {
        node {
                ring0_addr: 10.44.194.86
                nodeid: 1
        }

        node {
                ring0_addr: 10.44.194.87
                nodeid: 2
        }
}

quorum {
        # Enable and configure quorum subsystem (default: off)
        # see also corosync.conf.5 and votequorum.5
        provider: corosync_votequorum
        expected_votes: 2
}
EOF

Start pacemaker (corosync is started through pacemaker)

systemctl restart pacemaker
systemctl enable pacemaker

Check the pacemaker status.

[root@haproxy1 ~]# pcs status
Cluster name: 
WARNING: no stonith devices and stonith-enabled is not false
WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: haproxy2.localdomain (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sat Jul 28 23:31:24 2018
Last change: Sat Jul 28 23:31:09 2018 by hacluster via crmd on haproxy2.localdomain

2 nodes configured
0 resources configured

Online: [ haproxy1.localdomain haproxy2.localdomain ]

No resources


Daemon Status:
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: inactive/disabled

Also check the cluster state with crm_mon.

crm_mon -A

Example run

Stack: corosync
Current DC: haproxy2.localdomain (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sat Jul 28 23:32:19 2018
Last change: Sat Jul 28 23:31:09 2018 by hacluster via crmd on haproxy2.localdomain

2 nodes configured
0 resources configured

Online: [ haproxy1.localdomain haproxy2.localdomain ]

No active resources


Node Attributes:
* Node haproxy1.localdomain:
* Node haproxy2.localdomain:

Disable STONITH

pcs property set stonith-enabled=false

Configure the cluster to take no special action on loss of quorum, even when a split brain occurs

pcs property set no-quorum-policy=ignore
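The reason this setting matters in a two-node cluster can be shown with simple arithmetic. The sketch below assumes a plain majority rule, floor(n/2) + 1, with no extra votequorum options such as two_node; `quorum_threshold` is a name made up for this example.

```shell
# quorum_threshold EXPECTED_VOTES
# Votes needed for quorum under a simple majority rule: floor(n/2) + 1.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

echo "expected_votes=2 -> quorum needs $(quorum_threshold 2) votes"
echo "expected_votes=3 -> quorum needs $(quorum_threshold 3) votes"
```

With expected_votes: 2, both votes are needed, so the surviving node alone has no quorum after its peer fails. Setting no-quorum-policy=ignore lets that node keep running the VIP and HAProxy anyway.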

Check

pcs property

Example run

[root@haproxy2 ~]# pcs property
Cluster Properties:
 cluster-infrastructure: corosync
 dc-version: 1.1.18-11.el7_5.3-2b07d5c5a9
 have-watchdog: false
 no-quorum-policy: ignore
 stonith-enabled: false

Create the VIP with Pacemaker. I think it is safer to set the netmask to 32.

pcs resource create vip ocf:heartbeat:IPaddr2 ip=10.44.194.88 cidr_netmask=32 op monitor interval=5s

Check the resource

# pcs resource
 vip    (ocf::heartbeat:IPaddr2):       Started sugi-log-haproxy1.localdomain

Add the haproxy systemd unit as a clone. Adding it as a clone makes HAProxy run on both servers.
Actual access goes through the VIP, so the instance on the standby serves no traffic, but I keep both running just in case.

pcs resource create haproxy_service systemd:haproxy op monitor interval=5s clone

Check the resources

[root@haproxy1 ~]# pcs resource
 vip    (ocf::heartbeat:IPaddr2):       Started haproxy1.localdomain
 Clone Set: haproxy_service-clone [haproxy_service]
     Started: [ haproxy1.localdomain haproxy2.localdomain ]

Set the colocation dependency.

vip does not start on a node unless haproxy is running there

pcs constraint colocation add vip with haproxy_service-clone score=INFINITY

Check the dependency.

[root@haproxy1 ~]# pcs constraint colocation show
Colocation Constraints:
  vip with haproxy_service-clone (score:INFINITY)

Set the start order.

haproxy starts → then vip starts

pcs constraint order start haproxy_service-clone then start vip kind=Mandatory

Check the start order

[root@haproxy1 ~]# pcs constraint order show
Ordering Constraints:
  start haproxy_service-clone then start vip (kind:Mandatory)

Check the status

[root@haproxy1 ~]# pcs status
Cluster name: 
WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: haproxy2.localdomain (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Sat Jul 28 23:37:49 2018
Last change: Sat Jul 28 23:37:24 2018 by root via cibadmin on haproxy1.localdomain

2 nodes configured
3 resources configured

Online: [ haproxy1.localdomain haproxy2.localdomain ]

Full list of resources:

 vip    (ocf::heartbeat:IPaddr2):       Started haproxy1.localdomain
 Clone Set: haproxy_service-clone [haproxy_service]
     Started: [ haproxy1.localdomain haproxy2.localdomain ]

Daemon Status:
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: inactive/disabled
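For the failover test it is handy to extract just the node that currently holds the VIP. The following is a small sketch that parses the `pcs resource` text shown above; it assumes the output format of this pcs version, and `vip_node` is a hypothetical helper name.

```shell
# vip_node: read `pcs resource` output on stdin and print the node on which
# the resource named "vip" is reported as Started.
vip_node() {
    awk '$1 == "vip" && $3 == "Started" { print $4 }'
}

# Sample input copied from the output shown in this article.
printf '%s\n' \
    ' vip    (ocf::heartbeat:IPaddr2):       Started haproxy1.localdomain' \
    ' Clone Set: haproxy_service-clone [haproxy_service]' \
    '     Started: [ haproxy1.localdomain haproxy2.localdomain ]' | vip_node
```

On a cluster node the same function can be fed live data with `pcs resource | vip_node`.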

Access test from HAProxy (VIP) to a NodePort

The following Services are running in the Kubernetes cluster.

root@ntw-k8s-master01(default kubernetes-admin):~# kubectl get svc -o wide --all-namespaces
NAMESPACE     NAME                   TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE       SELECTOR
default       kubernetes             ClusterIP   10.96.0.1        <none>        443/TCP          1d        <none>
default       nginx-test             NodePort    10.98.232.82     <none>        8080:32003/TCP   20h       app=nginx-test
kube-system   calico-etcd            ClusterIP   10.96.232.136    <none>        6666/TCP         1d        k8s-app=calico-etcd
kube-system   kube-dns               ClusterIP   10.96.0.10       <none>        53/UDP,53/TCP    1d        k8s-app=kube-dns
kube-system   kubernetes-dashboard   NodePort    10.103.153.119   <none>        80:32002/TCP     20h       k8s-app=kubernetes-dashboard

Access the nginx NodePort through the HAProxy VIP. Note the NodePort number, 32003, and access VIP:32003.
The NodePort is reachable as expected.

# curl http://10.44.194.88:32003
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>

Failure test (HAProxy)

Next, check how client access is affected when one of the Pacemaker-managed HAProxy instances fails.

Prepare the following bash script on the client:

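cat <<'EOF' > /root/http_ping.sh
#!/bin/bash
# Send one HTTP request and print a timestamped status line.
#   $1: value for the Host header   $2: URL to request
function http_ping(){
    TARGET_HOST=$1
    TARGET_URL=$2
    MAX_TIMEOUT=1   # seconds allowed for the whole request
    CON_TIMEOUT=1   # seconds allowed for the TCP connect

    # %M is minutes (%m would print the month); the response body is discarded.
    curl -w "[$(date '+%Y/%m/%d %H:%M:%S.%3N ')] http_status=%{http_code}  total=%{time_total}\n" \
            -o /dev/null \
            -s --connect-timeout "${CON_TIMEOUT}" --max-time "${MAX_TIMEOUT}" --header "Host: ${TARGET_HOST}" "${TARGET_URL}"
}
EOF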

Leave the following loop running on the client:

source /root/http_ping.sh
while true;
do
    http_ping 10.44.194.88 http://10.44.194.88:32003
    sleep 0.1
done

HAProxy runs as a virtual machine on KVM, so force it off with the following command:

date; virsh destroy haproxy1; date;

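Example output on the client side (the minute field reads 07 throughout because the date format used for this run had %m, the month, where %M was intended):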

[2018/07/29 00:07:59.208 ] http_status=200  total=0.002
[2018/07/29 00:07:59.319 ] http_status=200  total=0.002
[2018/07/29 00:07:59.430 ] http_status=200  total=0.001
[2018/07/29 00:07:59.541 ] http_status=000  total=1.002 <------------- connection lost until the VIP fails over
[2018/07/29 00:07:00.653 ] http_status=000  total=1.001 <------------- connection lost until the VIP fails over
[2018/07/29 00:07:01.764 ] http_status=000  total=1.001 <------------- connection lost until the VIP fails over
[2018/07/29 00:07:02.875 ] http_status=200  total=0.004
[2018/07/29 00:07:02.989 ] http_status=200  total=0.002
[2018/07/29 00:07:03.102 ] http_status=200  total=0.002
[2018/07/29 00:07:03.215 ] http_status=200  total=0.002
[2018/07/29 00:07:03.328 ] http_status=200  total=0.002
[2018/07/29 00:07:03.441 ] http_status=200  total=0.001
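
The length of the outage can be estimated from this log instead of being read off by eye. The sketch below counts the failed probes and sums their `total=` times. `summarize_outage` is a hypothetical helper name, and the parsing assumes the exact line format printed by `http_ping.sh`. The sum ignores the 0.1-second sleep between probes, so it slightly underestimates the real gap.

```bash
#!/bin/bash
# Summarize failed probes in http_ping.sh output read from stdin.
# A probe counts as failed when curl reports http_status=000
# (no HTTP response was received before the timeout).
summarize_outage() {
    awk '
        /http_status=000/ {
            failed++
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^total=/) {
                    split($i, kv, "=")   # "total=1.002" -> total, 1.002
                    outage += kv[2]
                }
            }
        }
        END { printf "failed=%d outage_sec=%.3f\n", failed, outage }
    '
}

# Usage (the log file name is only an example):
#   summarize_outage < /root/http_ping.log
```

Fed the log above, this reports three failed probes and about 3 seconds of lost connectivity.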

Check Pacemaker on haproxy2. haproxy1 is now shown as Stopped, and the VIP has moved to haproxy2.

[root@haproxy2 ~]# pcs status
Cluster name: 
WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
Stack: corosync
Current DC: haproxy2.localdomain (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition WITHOUT quorum
Last updated: Sun Jul 29 00:07:37 2018
Last change: Sat Jul 28 23:37:24 2018 by root via cibadmin on haproxy1.localdomain

2 nodes configured
3 resources configured

Online: [ haproxy2.localdomain ]
OFFLINE: [ haproxy1.localdomain ]

Full list of resources:

 vip    (ocf::heartbeat:IPaddr2):       Started haproxy2.localdomain
 Clone Set: haproxy_service-clone [haproxy_service]
     Started: [ haproxy2.localdomain ]
     Stopped: [ haproxy1.localdomain ]

Daemon Status:
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: inactive/disabled
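
When the failover test is repeated, it can be handy to ask which node currently holds the VIP without scanning the whole status output. This is a minimal sketch, assuming the resource line layout printed by the pcs version used in this article (other pcs versions may format the line differently); `resource_node` is a hypothetical helper name.

```bash
#!/bin/bash
# Print the node on which a resource is started, reading `pcs status`
# text from stdin. Assumes resource lines of the form:
#   vip    (ocf::heartbeat:IPaddr2):       Started haproxy2.localdomain
# Prints nothing if the resource is not in the Started state.
resource_node() {
    local resource=$1
    awk -v res="$resource" '$1 == res && $3 == "Started" { print $4 }'
}

# Usage on a cluster node (needs pcs):
#   pcs status | resource_node vip
```

Run against the status output above, it prints haproxy2.localdomain.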

Start haproxy1 again for the next test:

virsh start haproxy1

Failure test (Node)

Next, to confirm that HAProxy's check feature works correctly, let's force-stop node01.

Install the socat command on the HAProxy host; running the command below shows the state of the backend servers:

[root@haproxy1 ~]# echo 'show stat -1 4 -1' | socat stdio unix-connect:/var/lib/haproxy/stats | grep -v -e '^#\|^$' | cut -d ',' -f 2,18 --output-delimiter=':'
node01:UP
node02:UP

Leave the following loop running on the client:

source /root/http_ping.sh
while true;
do
    http_ping 10.44.194.88 http://10.44.194.88:32003
    sleep 0.1
done

The node runs as a virtual machine on KVM, so force it off with the following command:

date; virsh destroy ntw-k8s-nodevm01; date;

Example output on the client side:

[2018/07/29 03:07:55.591 ] http_status=200  total=0.002
[2018/07/29 03:07:55.704 ] http_status=200  total=0.002
[2018/07/29 03:07:55.817 ] http_status=200  total=0.001
[2018/07/29 03:07:55.930 ] http_status=200  total=0.002
[2018/07/29 03:07:56.044 ] http_status=200  total=0.002
[2018/07/29 03:07:56.157 ] http_status=200  total=0.002
[2018/07/29 03:07:56.270 ] http_status=200  total=0.001
[2018/07/29 03:07:56.381 ] http_status=200  total=0.002
[2018/07/29 03:07:56.494 ] http_status=200  total=0.002
[2018/07/29 03:07:56.607 ] http_status=000  total=1.001 <------------- requests can fail until HAProxy marks the node DOWN
[2018/07/29 03:07:57.724 ] http_status=200  total=0.005
[2018/07/29 03:07:57.854 ] http_status=000  total=1.002 <------------- requests can fail until HAProxy marks the node DOWN
[2018/07/29 03:07:58.970 ] http_status=200  total=0.002
[2018/07/29 03:07:59.081 ] http_status=000  total=1.001 <------------- requests can fail until HAProxy marks the node DOWN
[2018/07/29 03:07:00.194 ] http_status=200  total=0.002
[2018/07/29 03:07:00.306 ] http_status=000  total=1.002 <------------- requests can fail until HAProxy marks the node DOWN
[2018/07/29 03:07:01.418 ] http_status=200  total=0.003
[2018/07/29 03:07:01.535 ] http_status=000  total=1.001 <------------- requests can fail until HAProxy marks the node DOWN
[2018/07/29 03:07:02.648 ] http_status=000  total=1.001 <------------- requests can fail until HAProxy marks the node DOWN
[2018/07/29 03:07:03.759 ] http_status=000  total=1.001 <------------- requests can fail until HAProxy marks the node DOWN
[2018/07/29 03:07:04.871 ] http_status=200  total=0.002
[2018/07/29 03:07:04.983 ] http_status=200  total=0.002
[2018/07/29 03:07:05.094 ] http_status=200  total=0.001
[2018/07/29 03:07:05.205 ] http_status=200  total=0.002
[2018/07/29 03:07:05.317 ] http_status=200  total=0.001
[2018/07/29 03:07:05.428 ] http_status=200  total=0.002
[2018/07/29 03:07:05.541 ] http_status=200  total=0.001
[2018/07/29 03:07:05.652 ] http_status=200  total=0.001
[2018/07/29 03:07:05.762 ] http_status=200  total=0.002
[2018/07/29 03:07:05.875 ] http_status=200  total=0.002
[2018/07/29 03:07:05.986 ] http_status=200  total=0.002
[2018/07/29 03:07:06.098 ] http_status=200  total=0.001
[2018/07/29 03:07:06.210 ] http_status=200  total=0.001
[2018/07/29 03:07:06.322 ] http_status=200  total=0.001
[2018/07/29 03:07:06.434 ] http_status=200  total=0.001
[2018/07/29 03:07:06.546 ] http_status=200  total=0.001
[2018/07/29 03:07:06.658 ] http_status=200  total=0.001
[2018/07/29 03:07:06.770 ] http_status=200  total=0.001
[2018/07/29 03:07:06.881 ] http_status=200  total=0.001
[2018/07/29 03:07:06.993 ] http_status=200  total=0.001
[2018/07/29 03:07:07.105 ] http_status=200  total=0.001
[2018/07/29 03:07:07.216 ] http_status=200  total=0.001
[2018/07/29 03:07:07.328 ] http_status=200  total=0.001
[2018/07/29 03:07:07.439 ] http_status=200  total=0.002
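
The failures here are intermittent because HAProxy keeps sending a share of the requests to node01 until enough health checks have failed for it to be marked DOWN. How long that takes depends on the `inter` and `fall` settings of the server's health check. The sketch below is only a rough estimate under stated assumptions: it uses HAProxy's defaults (`inter` 2000 ms, `fall` 3), the 1000 ms check timeout is an assumed value, and the settings actually used in this article's haproxy.cfg may differ. `detection_window_ms` is a hypothetical helper name.

```bash
#!/bin/bash
# Rough upper bound, in milliseconds, on how long a dead server can keep
# receiving traffic: up to `fall` check intervals until enough checks have
# failed, plus the time the last check needs to time out.
detection_window_ms() {
    local inter_ms=$1 fall=$2 check_timeout_ms=$3
    echo $(( inter_ms * fall + check_timeout_ms ))
}

# inter=2000 and fall=3 are HAProxy defaults; the 1000 ms check timeout
# is an assumption, not a value taken from this article's configuration.
detection_window_ms 2000 3 1000   # prints 7000
```

A window of several seconds is in line with the log above, where failures occur for roughly 8 seconds. Lowering `inter` or `fall` shortens the window at the cost of more check traffic and a higher chance of marking a healthy node DOWN.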

Checking the state on HAProxy shows node01 as DOWN:

[root@haproxy1 ~]# echo 'show stat -1 4 -1' | socat stdio unix-connect:/var/lib/haproxy/stats | grep -v -e '^#\|^$' | cut -d ',' -f 2,18 --output-delimiter=':'
node01:DOWN
node02:UP
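
The same stats output can drive a simple health check that fails when any backend server is not UP. This is a minimal sketch built on the CSV layout used above (server name in field 2, status in field 18); `unhealthy_servers` is a hypothetical helper name.

```bash
#!/bin/bash
# Read HAProxy "show stat" CSV from stdin and print "svname:status" for
# every row whose status (field 18) does not start with "UP".
# Header lines (starting with "#") and empty lines are skipped.
# Returns non-zero if at least one such row was found.
unhealthy_servers() {
    awk -F',' '
        /^#/ || /^$/ { next }
        $18 !~ /^UP/ { print $2 ":" $18; bad = 1 }
        END { exit (bad ? 1 : 0) }
    '
}

# Usage on the HAProxy host (same socket path as in the command above):
#   echo 'show stat -1 4 -1' | socat stdio unix-connect:/var/lib/haproxy/stats | unhealthy_servers
```

Note that a server configured without health checks reports `no check` as its status and would therefore be listed as well.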

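Summary

Using HAProxy + Pacemaker + DaemonSet, I was able to confirm a highly available way of exposing Pods.
However, NodePort means that clients access the service on a high port number allocated from the 30000-32767 range.
For on-premises use on an internal-only platform, having users connect to a high port is probably fine, but when exposing a service to the Internet you will likely want to serve it on :80, so it is better to consider another way of exposing Pods.

If HAProxy's performance is a concern, the same architecture should also work with a physical load balancer appliance in its place.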
