IDC Frontier Inc.

Trying out etcd and fleet on CoreOS

CoreOS

Last updated at 2014-11-27, posted at 2014-11-27

I found what CoreOS provides interesting, so I tried each component in my own environment, using the latest version of each.

Environment

Mac OS X 10.9.5
Vagrant 1.6.5
VirtualBox 4.3.20
Golang 1.3.3

Environment setup

etcd

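Following the article linked above, I started etcd.
The GitHub page says "It is strongly recommended that users work with the latest 0.4.x release (0.4.6)", so I avoided the newest 0.5 and used 0.4.6.
Checking the etcd options with etcd --help showed that they had changed from what the Qiita article describes. For the address passed to -addr= I used 172.17.8.1, because once CoreOS was booted with Vagrant later on, this turned out to be the address assigned to the host-side interface vboxnet5.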

$ curl -L https://github.com/coreos/etcd/releases/download/v0.4.6/etcd-v0.4.6-darwin-amd64.zip -o etcd-v0.4.6-darwin-amd64.zip
$ unzip etcd-v0.4.6-darwin-amd64.zip
$ cd etcd-v0.4.6-darwin-amd64
$ ./etcd -addr=172.17.8.1:4001 -v
[etcd] Nov 26 07:09:58.687 WARNING   | Using the directory pochi.local.etcd as the etcd curation directory because a directory was not specified.
[etcd] Nov 26 07:09:58.690 DEBUG     | pochi.local finished load snapshot
[etcd] Nov 26 07:09:58.698 DEBUG     | URLs: /_etcd/machines:  / pochi.local (http://127.0.0.1:7001)
[etcd] Nov 26 07:09:58.698 INFO      | Peer URLs in log:  / pochi.local (http://127.0.0.1:7001)
[etcd] Nov 26 07:09:58.699 DEBUG     | pochi.local is restarting the cluster []
[etcd] Nov 26 07:09:58.700 INFO      | etcd server [name pochi.local, listen on :4001, advertised url http://172.17.8.1:4001]
[etcd] Nov 26 07:09:58.700 INFO      | peer server [name pochi.local, listen on :7001, advertised url http://127.0.0.1:7001]
[etcd] Nov 26 07:09:58.700 INFO      | pochi.local starting in peer mode
[etcd] Nov 26 07:09:58.700 INFO      | pochi.local: state changed from 'initialized' to 'follower'.
[etcd] Nov 26 07:09:58.933 INFO      | pochi.local: state changed from 'follower' to 'candidate'.
[etcd] Nov 26 07:09:58.933 INFO      | pochi.local: state changed from 'candidate' to 'leader'.
[etcd] Nov 26 07:09:58.933 INFO      | pochi.local: leader changed from '' to 'pochi.local'.
[etcd] Nov 26 07:09:59.701 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)
[etcd] Nov 26 07:10:00.703 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)
[etcd] Nov 26 07:10:01.700 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)
[etcd] Nov 26 07:10:01.704 INFO      | pochi.local: snapshot of 18908 events at index 18908 completed
[etcd] Nov 26 07:10:02.701 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)
[etcd] Nov 26 07:10:03.701 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)

$ curl http://172.17.8.1:4001/version
etcd 0.4.6
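
The CoreOS machines will later use this etcd as their discovery endpoint, so it has to be answering before vagrant up. A minimal sketch of a readiness check, assuming curl is available. The function name wait_for_etcd and the default retry count are made up for illustration and are not part of etcd:

```shell
# Hypothetical helper (not part of etcd): poll the /version endpoint
# until it answers or the retry budget is used up.
wait_for_etcd() {
  etcd_url="$1"          # e.g. http://172.17.8.1:4001
  etcd_tries="${2:-10}"  # attempts, one second apart (assumed default)
  etcd_i=0
  while [ "$etcd_i" -lt "$etcd_tries" ]; do
    # -f makes curl return non-zero on HTTP errors, -s keeps it quiet
    if curl -fs "$etcd_url/version" >/dev/null 2>&1; then
      return 0
    fi
    etcd_i=$((etcd_i + 1))
    sleep 1
  done
  return 1
}
```

For example, wait_for_etcd http://172.17.8.1:4001 && vagrant up would start the VMs only once etcd responds.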

CoreOS Vagrant

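It had been a while since I last touched Vagrant, and the version had gone from 1.3.x to 1.6.x. Unsurprisingly, it threw errors, so following the article referenced here I ran rm -r ~/.vagrant.d/plugins.json ~/.vagrant.d/gems.

Then I ran cp user-data.sample user-data and edited the discovery line in user-data. Note that the original sample uses https.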

#cloud-config

coreos:
  etcd:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    # WARNING: replace each time you 'vagrant destroy'
    discovery: http://172.17.8.1:4001/v2/keys/machines
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  fleet:
    public-ip: $public_ipv4
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
    - name: docker-tcp.socket
      command: start
      enable: true
      content: |
        [Unit]
        Description=Docker Socket for the API

        [Socket]
        ListenStream=2375
        Service=docker.service
        BindIPv6Only=both

        [Install]
        WantedBy=sockets.target
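
The edit to the discovery line can also be scripted. A minimal sketch, assuming the sample contains a single discovery: line that may be commented out with a leading #, and that the URL contains neither | nor &. The helper name set_discovery_url is hypothetical:

```shell
# Hypothetical helper: point the "discovery:" line of a cloud-config file
# at the given URL, uncommenting the line if it is commented out.
set_discovery_url() {
  ud_file="$1"   # path to user-data
  ud_url="$2"    # e.g. http://172.17.8.1:4001/v2/keys/machines
  # Write to a temporary file and move it into place, because in-place
  # editing with sed -i behaves differently on macOS and GNU sed.
  sed "s|^\([[:space:]]*\)#\{0,1\}[[:space:]]*discovery:.*\$|\1discovery: ${ud_url}|" \
    "$ud_file" > "$ud_file.tmp" && mv "$ud_file.tmp" "$ud_file"
}
```

With the values used in this article the call would be set_discovery_url user-data http://172.17.8.1:4001/v2/keys/machines. The comment line that mentions https://discovery.etcd.io/new is left alone because it does not contain the discovery: key.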

vagrant up then completed without problems.

Reference: "Upgraded Vagrant from 1.3.5 to 1.6.3 (it now uses Vagrant Cloud)"

$ vagrant up
Bringing machine 'core-01' up with 'virtualbox' provider...
Bringing machine 'core-02' up with 'virtualbox' provider...
Bringing machine 'core-03' up with 'virtualbox' provider...
==> core-01: Importing base box 'coreos-alpha'...
==> core-01: Matching MAC address for NAT networking...
==> core-01: Checking if box 'coreos-alpha' is up to date...
==> core-01: Setting the name of the VM: coreos-vagrant_core-01_1416954842606_17056
==> core-01: Clearing any previously set network interfaces...
==> core-01: Preparing network interfaces based on configuration...
    core-01: Adapter 1: nat
    core-01: Adapter 2: hostonly
==> core-01: Forwarding ports...
    core-01: 22 => 2222 (adapter 1)
==> core-01: Running 'pre-boot' VM customizations...
==> core-01: Booting VM...
==> core-01: Waiting for machine to boot. This may take a few minutes...
    core-01: SSH address: 127.0.0.1:2222
    core-01: SSH username: core
    core-01: SSH auth method: private key
    core-01: Warning: Connection timeout. Retrying...
==> core-01: Machine booted and ready!
==> core-01: Setting hostname...
==> core-01: Configuring and enabling network interfaces...
==> core-01: Running provisioner: file...
==> core-01: Running provisioner: shell...
    core-01: Running: inline script
==> core-02: Importing base box 'coreos-alpha'...
==> core-02: Matching MAC address for NAT networking...
==> core-02: Checking if box 'coreos-alpha' is up to date...
==> core-02: Setting the name of the VM: coreos-vagrant_core-02_1416954865219_3677
==> core-02: Fixed port collision for 22 => 2222. Now on port 2200.
==> core-02: Clearing any previously set network interfaces...
==> core-02: Preparing network interfaces based on configuration...
    core-02: Adapter 1: nat
    core-02: Adapter 2: hostonly
==> core-02: Forwarding ports...
    core-02: 22 => 2200 (adapter 1)
==> core-02: Running 'pre-boot' VM customizations...
==> core-02: Booting VM...
==> core-02: Waiting for machine to boot. This may take a few minutes...
    core-02: SSH address: 127.0.0.1:2200
    core-02: SSH username: core
    core-02: SSH auth method: private key
    core-02: Warning: Connection timeout. Retrying...
==> core-02: Machine booted and ready!
==> core-02: Setting hostname...
==> core-02: Configuring and enabling network interfaces...
==> core-02: Running provisioner: file...
==> core-02: Running provisioner: shell...
    core-02: Running: inline script
==> core-03: Importing base box 'coreos-alpha'...
==> core-03: Matching MAC address for NAT networking...
==> core-03: Checking if box 'coreos-alpha' is up to date...
==> core-03: Setting the name of the VM: coreos-vagrant_core-03_1416954886820_7707
==> core-03: Fixed port collision for 22 => 2222. Now on port 2201.
==> core-03: Clearing any previously set network interfaces...
==> core-03: Preparing network interfaces based on configuration...
    core-03: Adapter 1: nat
    core-03: Adapter 2: hostonly
==> core-03: Forwarding ports...
    core-03: 22 => 2201 (adapter 1)
==> core-03: Running 'pre-boot' VM customizations...
==> core-03: Booting VM...
==> core-03: Waiting for machine to boot. This may take a few minutes...
    core-03: SSH address: 127.0.0.1:2201
    core-03: SSH username: core
    core-03: SSH auth method: private key
    core-03: Warning: Connection timeout. Retrying...
==> core-03: Machine booted and ready!
==> core-03: Setting hostname...
==> core-03: Configuring and enabling network interfaces...
==> core-03: Running provisioner: file...
==> core-03: Running provisioner: shell...
    core-03: Running: inline script

Check with vagrant status:

$ vagrant status
Current machine states:

core-01                   running (virtualbox)
core-02                   running (virtualbox)
core-03                   running (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.
pochi:coreos-vagrant snumano$

SSH into the first machine and check with the fleetctl command. It looks fine.
Checking the other two machines shows the same content.

$ vagrant ssh core-01
CoreOS (alpha)
core@core-01 ~ $ fleetctl list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-
core@core-01 ~ $

$ vagrant ssh core-02
CoreOS (alpha)
core@core-02 ~ $ fleetctl list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-
core@core-02 ~ $

$ vagrant ssh core-03
CoreOS (alpha)
core@core-03 ~ $ fleetctl list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-
core@core-03 ~ $

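The CoreOS cluster is now set up.

Deploying with fleet

Reference: "Deploying to a CoreOS + Vagrant cluster with fleet"

Following the article above, I installed fleet on the Mac. From the Mac, the CoreOS cluster can be checked with the fleetctl command as shown below. The endpoint is the etcd on core-01 (172.17.8.101:4001), a member of the cluster bootstrapped through the discovery etcd configured earlier.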

$ cd <fleet dir>
$ bin/fleetctl --endpoint 'http://172.17.8.101:4001' list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-

To match the later steps, set the environment variable FLEETCTL_ENDPOINT.

$ export FLEETCTL_ENDPOINT=http://172.17.8.101:4001
$ bin/fleetctl list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-

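Adding the unit did not work when I followed the order in the Qiita article, so I followed the GitHub documentation instead.
The output below shows the service running on the first CoreOS machine (172.17.8.101).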

$ bin/fleetctl start examples/hello.service
Unit hello.service launched on 2ae5a2c5.../172.17.8.101
$ bin/fleetctl list-units
UNIT		MACHINE				ACTIVE	SUB
hello.service	2ae5a2c5.../172.17.8.101	active	running
$ bin/fleetctl list-unit-files
UNIT		HASH	DSTATE		STATE		TARGET
hello.service	e55c0ae	launched	launched	2ae5a2c5.../172.17.8.101

For the next step I followed the Vagrant part of "Remote Fleet access" in the fleet documentation.

$ vagrant ssh-config
Host core-01
  HostName 127.0.0.1
  User core
  Port 2222
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/snumano/.vagrant.d/insecure_private_key
  IdentitiesOnly yes
  LogLevel FATAL

Host core-02
  HostName 127.0.0.1
  User core
  Port 2200
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/snumano/.vagrant.d/insecure_private_key
  IdentitiesOnly yes
  LogLevel FATAL

Host core-03
  HostName 127.0.0.1
  User core
  Port 2201
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/snumano/.vagrant.d/insecure_private_key
  IdentitiesOnly yes
  LogLevel FATAL

$ vagrant ssh-config | sed -n "s/IdentityFile//gp" | xargs ssh-add
Identity added: /Users/snumano/.vagrant.d/insecure_private_key (/Users/snumano/.vagrant.d/insecure_private_key)
Identity added: /Users/snumano/.vagrant.d/insecure_private_key (/Users/snumano/.vagrant.d/insecure_private_key)
Identity added: /Users/snumano/.vagrant.d/insecure_private_key (/Users/snumano/.vagrant.d/insecure_private_key)
$ export FLEETCTL_TUNNEL="$(vagrant ssh-config | sed -n "s/[ ]*HostName[ ]*//gp"):$(vagrant ssh-config | sed -n "s/[ ]*Port[ ]*//gp")"
$ echo $FLEETCTL_TUNNEL
127.0.0.1 127.0.0.1 127.0.0.1:2222 2200 2201
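
With three machines, the value printed above joins every HostName and every Port, so it is not a single host:port pair. A sketch of one way to pick a single machine, assuming the vagrant ssh-config output format shown earlier. The helper name tunnel_for_host is hypothetical:

```shell
# Hypothetical helper: read `vagrant ssh-config` output on stdin and print
# HostName:Port for the one Host entry named in $1.
tunnel_for_host() {
  awk -v want="$1" '
    $1 == "Host"                        { current = $2 }
    $1 == "HostName" && current == want { host = $2 }
    $1 == "Port"     && current == want { port = $2 }
    END {
      # Fail if the requested host was not found in the input.
      if (host != "" && port != "") print host ":" port
      else exit 1
    }
  '
}
```

It could be used as export FLEETCTL_TUNNEL="$(vagrant ssh-config | tunnel_for_host core-01)". Running vagrant ssh-config core-01 should also limit the output to that one machine, in which case the original sed commands produce a single pair.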

Check the service:

$ bin/fleetctl status hello.service
● hello.service - Hello World
   Loaded: loaded (/run/fleet/units/hello.service; linked-runtime)
   Active: active (running) since Wed 2014-11-26 21:47:05 UTC; 3min 44s ago
 Main PID: 1335 (bash)
   CGroup: /system.slice/hello.service
           ├─1335 /bin/bash -c while true; do echo "Hello, world"; sleep 1; done
           └─1609 sleep 1

Nov 26 21:50:40 core-01 bash[1335]: Hello, world
Nov 26 21:50:41 core-01 bash[1335]: Hello, world
Nov 26 21:50:42 core-01 bash[1335]: Hello, world
Nov 26 21:50:43 core-01 bash[1335]: Hello, world
Nov 26 21:50:44 core-01 bash[1335]: Hello, world
Nov 26 21:50:45 core-01 bash[1335]: Hello, world
Nov 26 21:50:46 core-01 bash[1335]: Hello, world
Nov 26 21:50:47 core-01 bash[1335]: Hello, world
Nov 26 21:50:48 core-01 bash[1335]: Hello, world
Nov 26 21:50:49 core-01 bash[1335]: Hello, world

SSH into the CoreOS machine where the service is running.
The prompt shows that the connection goes to core-01.

$ bin/fleetctl list-unit-files
UNIT		HASH	DSTATE		STATE		TARGET
hello.service	e55c0ae	launched	launched	2ae5a2c5.../172.17.8.101
$ bin/fleetctl list-units
UNIT		MACHINE				ACTIVE	SUB
hello.service	2ae5a2c5.../172.17.8.101	active	running
$ bin/fleetctl ssh hello.service
Last login: Wed Nov 26 21:50:50 2014 from 172.17.8.1
CoreOS (alpha)
core@core-01 ~ $

Trying failover

Reference: "Checking failover on a CoreOS + Vagrant cluster with fleet"

I was able to confirm the behavior exactly as described in the article above.

$ bin/fleetctl list-machines -l
MACHINE					IP		METADATA
2ae5a2c5f8d145a2a10a5b6ca44e5ac2	172.17.8.101	-
5571847dfe604f56a241aedaef1fa134	172.17.8.102	-
60c954604b86414888ba2051bdadfe1b	172.17.8.103	-
$ bin/fleetctl list-unit-files
UNIT		HASH	DSTATE		STATE		TARGET
hello.service	e55c0ae	launched	inactive	5571847d.../172.17.8.102
$ bin/fleetctl list-units
UNIT		MACHINE				ACTIVE	SUB
hello.service	5571847d.../172.17.8.102	active	running

Suspend core-02 from another terminal:

$ vagrant suspend core-02
==> core-02: Saving VM state and suspending execution...
$ vagrant status
Current machine states:

core-01                   running (virtualbox)
core-02                   saved (virtualbox)
core-03                   running (virtualbox)

core-02 has disappeared from the machine list. The service, however, is still available: it has moved and is now running on core-01.

$ bin/fleetctl list-machines -l
MACHINE					IP		METADATA
2ae5a2c5f8d145a2a10a5b6ca44e5ac2	172.17.8.101	-
60c954604b86414888ba2051bdadfe1b	172.17.8.103	-
$ bin/fleetctl list-unit-files
UNIT		HASH	DSTATE		STATE		TARGET
hello.service	e55c0ae	launched	launched	2ae5a2c5.../172.17.8.101
$ bin/fleetctl list-units
UNIT		MACHINE				ACTIVE	SUB
hello.service	2ae5a2c5.../172.17.8.101	active	running
$ bin/fleetctl status hello.service
● hello.service - Hello World
   Loaded: loaded (/run/fleet/units/hello.service; linked-runtime)
   Active: active (running) since Wed 2014-11-26 21:55:55 UTC; 32s ago
 Main PID: 1925 (bash)
   CGroup: /system.slice/hello.service
           ├─1925 /bin/bash -c while true; do echo "Hello, world"; sleep 1; done
           └─1958 sleep 1

Nov 26 21:56:18 core-01 bash[1925]: Hello, world
Nov 26 21:56:19 core-01 bash[1925]: Hello, world
Nov 26 21:56:20 core-01 bash[1925]: Hello, world
Nov 26 21:56:21 core-01 bash[1925]: Hello, world
Nov 26 21:56:22 core-01 bash[1925]: Hello, world
Nov 26 21:56:23 core-01 bash[1925]: Hello, world
Nov 26 21:56:24 core-01 bash[1925]: Hello, world
Nov 26 21:56:25 core-01 bash[1925]: Hello, world
Nov 26 21:56:26 core-01 bash[1925]: Hello, world
Nov 26 21:56:27 core-01 bash[1925]: Hello, world
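
The move from 172.17.8.102 to 172.17.8.101 can also be checked mechanically by pulling the IP out of the fleetctl list-units output. A small sketch, assuming the column layout shown in the output above. The helper name unit_ip is hypothetical:

```shell
# Hypothetical helper: read `fleetctl list-units` output on stdin and print
# the IP of the machine running the given unit (the part after "/" in the
# MACHINE column).
unit_ip() {
  awk -v unit="$1" '$1 == unit { split($2, parts, "/"); print parts[2] }'
}
```

Running bin/fleetctl list-units | unit_ip hello.service before and after vagrant suspend core-02 and comparing the two values shows whether the unit was rescheduled.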

Gotchas

After deleting the Vagrant-created cluster with vagrant destroy and creating it again with vagrant up, the fleetctl command stopped working properly and printed errors.

$ bin/fleetctl --endpoint 'http://172.17.8.101:4001' list-machines -l
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 200ms
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 400ms
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 800ms
---snip---

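The cause is unclear, but after I deleted the directory ending in .etcd (pochi.local.etcd) that had been created inside the etcd directory, fleetctl started working normally again a few tens of seconds later.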

$ pwd
<hogehoge>/etcd-v0.4.6-darwin-amd64
$ ls
README-etcd.md		etcd			pochi.local.etcd
README-etcdctl.md	etcdctl
$ rm -fr pochi.local.etcd
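
The cause was not investigated here. One possible explanation, not verified in this article, is that registrations from the destroyed machines were still stored under the discovery key in the host etcd; the comment in user-data also warns to replace the discovery token after each vagrant destroy. Under that assumption, a less drastic sketch than deleting the whole data directory is to inspect and clear only that key through the etcd v2 keys API. The variable and function names are hypothetical, and the URL is the discovery value from the user-data above:

```shell
# Assumed value, taken from the discovery line in user-data.
DISCOVERY_URL="http://172.17.8.1:4001/v2/keys/machines"

# Show everything currently registered under the discovery key.
list_discovery_key() {
  curl -sL "${1:-$DISCOVERY_URL}?recursive=true"
}

# Remove the discovery key and everything below it.
clear_discovery_key() {
  curl -sL -X DELETE "${1:-$DISCOVERY_URL}?recursive=true"
}
```

Calling list_discovery_key first shows whether stale entries exist at all; clear_discovery_key would then be run before the next vagrant up.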
