
Notes on building a CoreOS cluster on local KVM VMs

Last updated at 2015-03-20, posted at 2014-09-29

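There was plenty of information about Vagrant/VirtualBox and about public cloud environments such as AWS, but very little about building a cluster on local VMs, so I am summarizing what I did here. If you find mistakes or know a better way, please leave a comment.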

This article does not explain CoreOS/etcd/fleet in detail; for that, see for example mopemope's article at http://qiita.com/mopemope/items/fa9424b094aae3eac580

Build environment

On the single KVM host below, I created four VMs, installed CoreOS on them, and built an etcd/fleet cluster.

KVM host: CentOS 6.5, openQRM 5.1
CoreOS: 410.1.0 STABLE, etcd version 0.4.6, fleet version 0.6.2

Installing CoreOS

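In my environment a dhcp/pxe server managed by openQRM was already running, so Booting CoreOS via PXE was not an option. Instead, I downloaded the ISO image from Booting CoreOS from an ISO and booted the VMs from it.
Opening the console via VNC or similar lets you log in as the core user, but VNC is inconvenient in many ways, so I temporarily set a password to be able to SSH in from my PC.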

$ sudo passwd core

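At this point the system is running entirely from memory and CoreOS still has to be installed to the local disk, so, following Installing CoreOS to Disk, first create a cloud-config file.
Save the cloud-config file as cloud-config.yaml directly under the core user's home directory, then run the following command.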

$ sudo coreos-install -d /dev/vda -C stable -c ~/cloud-config.yaml

-d /dev/vda specifies the device name of the boot disk, which depends on your environment (/dev/sda, for example).
-C stable specifies the CoreOS channel to install (stable/beta/alpha).

If you clone an installed VM image, the machine-id is duplicated and fleet stops working, so you need to delete the machine-id file and reboot.

$ sudo rm /etc/machine-id
$ sudo systemctl reboot
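After cloning, it may be worth confirming that no two VMs still share a machine-id. The following is only a minimal sketch and not part of the procedure I actually ran: the helper name duplicate_machine_ids and the "hostname machine-id" input format are assumptions made for illustration, and the IDs are assumed to have been collected by hand (for example with cat /etc/machine-id on each VM).

```shell
# Hypothetical helper: reads "hostname machine-id" pairs on stdin and
# prints every machine-id that appears more than once (sorted).
duplicate_machine_ids() {
  awk 'NF >= 2 { count[$2]++ }
       END { for (id in count) if (count[id] > 1) print id }' | sort
}

# Example (made-up IDs): prints "aaa" because two hosts share it.
# printf '%s\n' 'core002 aaa' 'core003 bbb' 'core004 aaa' | duplicate_machine_ids
```

Empty output means every collected ID is unique.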

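The official documentation says something to the effect that the /usr/share/OEM/ directory should be used when creating a "golden image", but I did not manage to look into it fully.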

How to write the cloud-config file

As described in Using Cloud-Config, a cloud-config file must always contain the #cloud-config header, and it defines keys such as the following as needed.

coreos
ssh_authorized_keys
hostname
users
units
write_files
manage_etc_hosts
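Because the header is easy to get wrong, a quick check before running coreos-install can save a reinstall. This is a minimal sketch under the assumption, as far as I understand coreos-cloudinit, that the header has to be the first line of the file; the helper name has_cloud_config_header is made up for illustration.

```shell
# Hypothetical helper: succeeds only if the first line of the given file is
# exactly "#cloud-config" (trailing whitespace is ignored).
has_cloud_config_header() (
  first_line=$(head -n 1 "$1" | sed 's/[[:space:]]*$//')
  [ "$first_line" = "#cloud-config" ]
)

# Example:
# has_cloud_config_header ~/cloud-config.yaml && echo "header OK"
```

The check only looks at the header; it does not validate the YAML that follows.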

(1) SSH settings

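The keys you absolutely must set are ssh_authorized_keys and users.
CoreOS uses public key authentication for SSH by default, so unless this is configured properly the VM boots but cannot be reached, leaving you with a useless VM.
To use public key authentication, put the contents of a public key created with ssh-keygen or a similar tool into the cloud-config file as follows.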

#cloud-config

users:
  - name: core
ssh_authorized_keys:
  - ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAweXfI2dW0jrWU....

To use a password, first generate a hashed password:

core@core001 ~ $ openssl passwd -1
Password:
Verifying - Password:
$1$IlTQkYPg$fndQiTnrMN.....

Then put the hash in the cloud-config file as follows.

#cloud-config

users:
  - name: core
    passwd: $1$IlTQkYPg$fndQiTnrMN.....
ssh_authorized_keys:
  - ssh-rsa AAAAB3NzaC1yc2....

(2) etcd settings

The example in the official documentation looks like this:

#cloud-config

coreos:
    etcd:
        name: node001
        # generate a new token for each unique cluster from https://discovery.etcd.io/new
        discovery: https://discovery.etcd.io/<token>
        # multi-region and multi-cloud deployments need to use $public_ipv4
        addr: $public_ipv4:4001
        peer-addr: $private_ipv4:7001

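The documentation presents this quite casually, but I got stuck here, so I will describe it in detail.
name: Set a node name here if you want to choose one; if omitted, a name is generated automatically.
discovery: At system startup, etcd looks for the cluster members it belongs to in this order: the log data in the directory given by the data-dir setting, then the discovery setting, then the peers setting. (Cluster Finding Process)
The data-dir setting can be used when starting etcd from the command line, like this: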

$ ./etcd -name <name> [-data-dir=<path>]

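However, unless the discovery or peers setting is also configured, a newly added node cannot join an existing cluster.
The discovery: key holds the URL of a discovery service when an etcd server outside the cluster members provides one. The advantage of a discovery service is that no per-member configuration is needed when members are added to or removed from the cluster. The drawback is that it requires an etcd server outside the cluster members, so in production you need to plan for its failure. The service at https://discovery.etcd.io/ can only be used by etcd clusters that have global addresses. This time I set up a single etcd server for the discovery service and specified it as follows.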

discovery: http://192.168.0.1:4001/v2/keys/test_cluster

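test_cluster can be any key name that identifies the cluster.
The peers setting lets you specify cluster members explicitly without a discovery service, so no server outside the etcd cluster members is needed. However, every cluster member needs its own individual configuration; that is easy for a cluster of a few nodes but is probably unmanageable operationally at large scale. List the cluster members other than the node itself, joined with commas, as follows.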

peers: 192.168.0.2:7001,192.168.0.3:7001
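Since the peers value differs on every node (each node lists everyone except itself), it can be generated rather than typed. The following is a minimal sketch; the helper name peers_for is made up for illustration, and the peer port 7001 is taken from the examples in this article.

```shell
# Hypothetical helper: peers_for SELF_IP MEMBER_IP...
# Prints "ip:7001,ip:7001,..." for every member except SELF_IP.
peers_for() (
  self=$1
  shift
  peers=""
  for ip in "$@"; do
    [ "$ip" = "$self" ] && continue
    peers="${peers:+$peers,}$ip:7001"
  done
  printf '%s\n' "$peers"
)

# Example: value for the node 192.168.0.4 in a three-node cluster.
# peers_for 192.168.0.4 192.168.0.2 192.168.0.3 192.168.0.4
```

The output is meant to be pasted after the peers: key of that node's cloud-config.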

addr: and peer-addr: specify the IP address and port of the running node for etcd communication. $public_ipv4 and $private_ipv4 are variables supported only on Amazon EC2, Google Compute Engine, OpenStack, Rackspace, DigitalOcean, and Vagrant, so this time I wrote the IP addresses directly.

#cloud-config

coreos:
  etcd:
    discovery: http://192.168.0.1:4001/v2/keys/test_cluster
    addr: 192.168.0.2:4001
    peer-addr: 192.168.0.2:7001

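I would rather not change the cloud-config file for each node, but CoreOS's cloud-init does not yet support bootcmd or runcmd. It feels like it should be possible to use variables by creating a service under units and doing some tinkering there, but I could not get it to work this time.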
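One workaround that stays outside CoreOS is to generate the per-node file on the administrator's machine before installation. This is only a sketch of that idea, not something done in this article: the helper name render_cloud_config is made up, and only the etcd section is rendered here.

```shell
# Hypothetical helper: render_cloud_config NODE_IP DISCOVERY_URL
# Prints a cloud-config whose etcd section is filled in for one node.
render_cloud_config() (
  node_ip=$1
  discovery_url=$2
  cat <<EOF
#cloud-config

coreos:
  etcd:
    discovery: ${discovery_url}
    addr: ${node_ip}:4001
    peer-addr: ${node_ip}:7001
EOF
)

# Example: write one file per node.
# render_cloud_config 192.168.0.2 http://192.168.0.1:4001/v2/keys/test_cluster > cloud-config-192.168.0.2.yaml
```

The generated file still has to be copied to each VM, so this only removes the hand editing, not the per-node step itself.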

(3) fleet settings

fleet offers various options for container scheduling (Deploying fleet), but I use no special settings this time, so none are listed here.

(4) units settings

This section lists the services to start after CoreOS boots (systemd units, the equivalent of what used to be configured under /etc/init.d/).

#cloud-config

coreos:
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
    - name: docker-tcp.socket
      command: start
      enable: true
      content: |
        [Unit]
        Description=Docker Socket for the API

        [Socket]
        ListenStream=2375
        Service=docker.service
        BindIPv6Only=both

        [Install]
        WantedBy=sockets.target
    - name: settimezone.service
      command: start
      content: |
        [Unit]
        Description=Set the timezone

        [Service]
        ExecStart=/usr/bin/timedatectl set-timezone Asia/Tokyo
        RemainAfterExit=yes
        Type=oneshot

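Of the - name: entries, etcd.service, fleet.service, and docker-tcp.socket are required when building a cluster; settimezone.service is added if you want to change the default timezone (UTC). (Configuring Date and Timezone)
command:, enable:, content:, and so on are options for starting the service; see Using Cloud-Config for details.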

(5) cloud-config.yaml

The file I used this time is shown below. With it I was able to bring up a minimal cluster.

#cloud-config

coreos:
  etcd:
    discovery: http://<IP address of the discovery node>:4001/v2/keys/<cluster identifier>
    addr: <IP address of this node>:4001
    peer-addr: <IP address of this node>:7001
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
    - name: docker-tcp.socket
      command: start
      enable: true
      content: |
        [Unit]
        Description=Docker Socket for the API

        [Socket]
        ListenStream=2375
        Service=docker.service
        BindIPv6Only=both

        [Install]
        WantedBy=sockets.target
    - name: settimezone.service
      command: start
      content: |
        [Unit]
        Description=Set the timezone

        [Service]
        ExecStart=/usr/bin/timedatectl set-timezone Asia/Tokyo
        RemainAfterExit=yes
        Type=oneshot
users:
  - name: core
    passwd: $1$IlTQkYPg$fndQiTnrMN.....
ssh_authorized_keys:
  - ssh-rsa AAAAB3NzaC1yc2....

Checking the cluster

Checking the etcd status

$ systemctl status etcd
● etcd.service - etcd
   Loaded: loaded (/usr/lib64/systemd/system/etcd.service; disabled)
  Drop-In: /run/systemd/system/etcd.service.d
           └─20-cloudinit.conf
   Active: active (running) since Mon 2014-09-29 11:37:46 JST; 5h 10min ago
 Main PID: 618 (etcd)
   CGroup: /system.slice/etcd.service
           └─618 /usr/bin/etcd

Sep 29 11:37:47 core004 etcd[618]: [etcd] Sep 29 11:37:47.071 INFO      | core004: state changed from 'follower' to 'snapshotting'.
Sep 29 11:37:47 core004 etcd[618]: [etcd] Sep 29 11:37:47.130 INFO      | core004: peer added: 'core001'
Sep 29 11:37:47 core004 etcd[618]: [etcd] Sep 29 11:37:47.224 INFO      | core004: peer added: 'core003'
Sep 29 11:37:50 core004 etcd[618]: [etcd] Sep 29 11:37:50.271 INFO      | core004: snapshot of 1414179 events at index 1414179 completed

Checking the etcd configuration

$ systemctl cat etcd
# /usr/lib64/systemd/system/etcd.service
[Unit]
Description=etcd

[Service]
User=etcd
PermissionsStartOnly=true
Environment=ETCD_DATA_DIR=/var/lib/etcd ETCD_NAME=default
ExecStart=/usr/bin/etcd
Restart=always
RestartSec=10s

[Install]
WantedBy=multi-user.target

# /run/systemd/system/etcd.service.d/20-cloudinit.conf
[Service]
Environment="ETCD_ADDR=192.168.0.2:4001"
Environment="ETCD_DISCOVERY=http://192.168.0.1:4001/v2/keys/cluster1"
Environment="ETCD_NAME=core002"
Environment="ETCD_PEER_ADDR=192.168.0.2:7001"
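Comparing this drop-in with the cloud-config, each key under coreos.etcd appears to end up as an ETCD_* environment variable. The sketch below only reproduces the naming pattern visible in this output; it is an inference from the output, not a description of how coreos-cloudinit is implemented, and the helper name etcd_env_name is made up.

```shell
# Hypothetical helper: etcd_env_name KEY
# Upper-cases the key, replaces "-" with "_", and prefixes "ETCD_".
etcd_env_name() {
  printf 'ETCD_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '-' '_')"
}

# Example:
# etcd_env_name peer-addr
```

This is handy when grepping the drop-in for a particular cloud-config key.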

Checking the etcd cluster

$ curl -L http://127.0.0.1:7001/v2/admin/machines
[{"name":"core002","state":"follower","clientURL":"http://192.168.0.2:4001","peerURL":"http://192.168.0.2:7001"},{"name":"core003","state":"leader","clientURL":"http://192.168.0.3:4001","peerURL":"http://192.168.0.3:7001"},{"name":"core004","state":"follower","clientURL":"http://192.168.0.4:4001","peerURL":"http://192.168.0.4:7001"}]
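The JSON is hard to read on one line, so a small filter that prints only the leader can help. This is a crude text-based sketch that assumes the flat, single-line format shown above (a real JSON tool would be more robust); the helper name etcd_leader_name is made up.

```shell
# Hypothetical helper: reads the /v2/admin/machines JSON on stdin and prints
# the name of the member whose state is "leader".
etcd_leader_name() {
  tr '{' '\n' |
    grep '"state":"leader"' |
    sed 's/.*"name":"\([^"]*\)".*/\1/'
}

# Example:
# curl -sL http://127.0.0.1:7001/v2/admin/machines | etcd_leader_name
```

The filter splits the array into one object per line, keeps the line that contains the leader state, and extracts its name field.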

Checking the fleet members

$ fleetctl list-machines -l
MACHINE					IP		METADATA
1ece947bc2084053988c0b65a7bf43e8	192.168.0.2	-
33f37f51181d46bcbf43e7162795e447	192.168.0.3	-
3dc1196b4ccd470d9f262ab506c4649c	192.168.0.4	-
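With more nodes it becomes tedious to compare this list against the expected members by eye. The following is a minimal sketch that assumes the column layout shown above (a header line, then the IP in the second column); the helper name missing_fleet_members is made up.

```shell
# Hypothetical helper: missing_fleet_members EXPECTED_IP...
# Reads `fleetctl list-machines -l` output on stdin and prints each expected
# IP that does not appear in the IP column.
missing_fleet_members() (
  listing=$(cat)
  for ip in "$@"; do
    printf '%s\n' "$listing" |
      awk -v ip="$ip" 'NR > 1 && $2 == ip { found = 1 } END { exit !found }' ||
      printf '%s\n' "$ip"
  done
)

# Example:
# fleetctl list-machines -l | missing_fleet_members 192.168.0.2 192.168.0.3 192.168.0.4
```

Empty output means every expected node has registered with fleet.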

If you get errors, restart etcd.

$ sudo systemctl restart etcd

After CoreOS is installed, the cloud-config file is saved to /var/lib/coreos-install/user_data, so to change settings, edit and save that file and then reboot.

$ sudo vim /var/lib/coreos-install/user_data
$ sudo reboot
