What CoreOS provides looked interesting, so I tried it out in my own environment, using the latest version of each component.
Environment
- Mac OS X 10.9.5
- Vagrant 1.6.5
- VirtualBox 4.3.20
- Golang 1.3.3
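Environment setup
etcd
Following the reference above, I started etcd.
The GitHub page says "It is strongly recommended that users work with the latest 0.4.x release (0.4.6)", so I avoided the newest 0.5 and used 0.4.6.
Checking the options with etcd --help showed that they had changed from what the Qiita article describes. For the address passed to -addr= I used 172.17.8.1, because when I later started CoreOS with Vagrant, this turned out to be the address assigned to the host-only interface vboxnet5.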
$ curl -L https://github.com/coreos/etcd/releases/download/v0.4.6/etcd-v0.4.6-darwin-amd64.zip -o etcd-v0.4.6-darwin-amd64.zip
$ unzip etcd-v0.4.6-darwin-amd64.zip
$ cd etcd-v0.4.6-darwin-amd64
$ ./etcd -addr=172.17.8.1:4001 -v
[etcd] Nov 26 07:09:58.687 WARNING | Using the directory pochi.local.etcd as the etcd curation directory because a directory was not specified.
[etcd] Nov 26 07:09:58.690 DEBUG | pochi.local finished load snapshot
[etcd] Nov 26 07:09:58.698 DEBUG | URLs: /_etcd/machines: / pochi.local (http://127.0.0.1:7001)
[etcd] Nov 26 07:09:58.698 INFO | Peer URLs in log: / pochi.local (http://127.0.0.1:7001)
[etcd] Nov 26 07:09:58.699 DEBUG | pochi.local is restarting the cluster []
[etcd] Nov 26 07:09:58.700 INFO | etcd server [name pochi.local, listen on :4001, advertised url http://172.17.8.1:4001]
[etcd] Nov 26 07:09:58.700 INFO | peer server [name pochi.local, listen on :7001, advertised url http://127.0.0.1:7001]
[etcd] Nov 26 07:09:58.700 INFO | pochi.local starting in peer mode
[etcd] Nov 26 07:09:58.700 INFO | pochi.local: state changed from 'initialized' to 'follower'.
[etcd] Nov 26 07:09:58.933 INFO | pochi.local: state changed from 'follower' to 'candidate'.
[etcd] Nov 26 07:09:58.933 INFO | pochi.local: state changed from 'candidate' to 'leader'.
[etcd] Nov 26 07:09:58.933 INFO | pochi.local: leader changed from '' to 'pochi.local'.
[etcd] Nov 26 07:09:59.701 DEBUG | URLs: /_etcd/machines: / (pochi.local)
[etcd] Nov 26 07:10:00.703 DEBUG | URLs: /_etcd/machines: / (pochi.local)
[etcd] Nov 26 07:10:01.700 DEBUG | URLs: /_etcd/machines: / (pochi.local)
[etcd] Nov 26 07:10:01.704 INFO | pochi.local: snapshot of 18908 events at index 18908 completed
[etcd] Nov 26 07:10:02.701 DEBUG | URLs: /_etcd/machines: / (pochi.local)
[etcd] Nov 26 07:10:03.701 DEBUG | URLs: /_etcd/machines: / (pochi.local)
$ curl http://172.17.8.1:4001/version
etcd 0.4.6
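Since the 0.5 release is deliberately avoided here, it can be worth asserting that the etcd answering on 172.17.8.1 really is the intended release before bringing up the VMs. The following is only a minimal sketch: the helper name etcd_version_ok is hypothetical, and it simply compares the body returned by /version (in the format shown above) with an expected version string.

```shell
# Hypothetical helper (not part of etcd): succeed when the body returned by
# GET /version matches the expected release, e.g. "etcd 0.4.6".
etcd_version_ok() {
  body="$1"      # text returned by the /version endpoint
  expected="$2"  # version you meant to run, e.g. 0.4.6
  [ "$body" = "etcd $expected" ]
}

# Example using the response captured above. Against a live server you would
# pass "$(curl -s http://172.17.8.1:4001/version)" as the first argument.
if etcd_version_ok "etcd 0.4.6" "0.4.6"; then
  echo "etcd version OK"
else
  echo "unexpected etcd version" >&2
fi
```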
CoreOS Vagrant
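I had not touched Vagrant for a while, and in the meantime the version had gone from 1.3.x to 1.6.x. Unsurprisingly this produced errors, so following this reference I ran rm -r ~/.vagrant.d/plugins.json ~/.vagrant.d/gems.
Then I ran cp user-data.sample user-data and edited the discovery line in user-data. Note that the original sample uses https.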
#cloud-config
coreos:
  etcd:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    # WARNING: replace each time you 'vagrant destroy'
    discovery: http://172.17.8.1:4001/v2/keys/machines
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  fleet:
    public-ip: $public_ipv4
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
    - name: docker-tcp.socket
      command: start
      enable: true
      content: |
        [Unit]
        Description=Docker Socket for the API
        [Socket]
        ListenStream=2375
        Service=docker.service
        BindIPv6Only=both
        [Install]
        WantedBy=sockets.target
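The edit to the discovery line can also be scripted, which is handy if you recreate user-data from the sample often. This is a rough sketch rather than anything shipped with coreos-vagrant: set_discovery_url is a hypothetical helper, and it assumes the file contains an uncommented line starting with "discovery:" as in the config above.

```shell
# Hypothetical helper: point the "discovery:" line of a cloud-config file at
# a different URL, keeping the line's original indentation.
set_discovery_url() {
  file="$1"
  url="$2"   # assumed not to contain "|" or "&", which sed would interpret
  tmp="$(mktemp)" || return 1
  # Go through a temp file instead of "sed -i", whose syntax differs between
  # the BSD sed on OS X and GNU sed. Comment lines are left untouched because
  # they do not start with "discovery:".
  sed -E "s|^([[:space:]]*discovery:).*|\\1 ${url}|" "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return $status
}

# Usage (after cp user-data.sample user-data):
#   set_discovery_url user-data http://172.17.8.1:4001/v2/keys/machines
```

Writing the result back with cat rather than mv keeps the permissions of the original file.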
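vagrant up then completed without problems.
Reference: "Vagrantを1.3.5から1.6.3に上げたよ(Vagrant Cloudを使うようになった)" (on upgrading Vagrant from 1.3.5 to 1.6.3, which now uses Vagrant Cloud)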
$ vagrant up
Bringing machine 'core-01' up with 'virtualbox' provider...
Bringing machine 'core-02' up with 'virtualbox' provider...
Bringing machine 'core-03' up with 'virtualbox' provider...
==> core-01: Importing base box 'coreos-alpha'...
==> core-01: Matching MAC address for NAT networking...
==> core-01: Checking if box 'coreos-alpha' is up to date...
==> core-01: Setting the name of the VM: coreos-vagrant_core-01_1416954842606_17056
==> core-01: Clearing any previously set network interfaces...
==> core-01: Preparing network interfaces based on configuration...
core-01: Adapter 1: nat
core-01: Adapter 2: hostonly
==> core-01: Forwarding ports...
core-01: 22 => 2222 (adapter 1)
==> core-01: Running 'pre-boot' VM customizations...
==> core-01: Booting VM...
==> core-01: Waiting for machine to boot. This may take a few minutes...
core-01: SSH address: 127.0.0.1:2222
core-01: SSH username: core
core-01: SSH auth method: private key
core-01: Warning: Connection timeout. Retrying...
==> core-01: Machine booted and ready!
==> core-01: Setting hostname...
==> core-01: Configuring and enabling network interfaces...
==> core-01: Running provisioner: file...
==> core-01: Running provisioner: shell...
core-01: Running: inline script
==> core-02: Importing base box 'coreos-alpha'...
==> core-02: Matching MAC address for NAT networking...
==> core-02: Checking if box 'coreos-alpha' is up to date...
==> core-02: Setting the name of the VM: coreos-vagrant_core-02_1416954865219_3677
==> core-02: Fixed port collision for 22 => 2222. Now on port 2200.
==> core-02: Clearing any previously set network interfaces...
==> core-02: Preparing network interfaces based on configuration...
core-02: Adapter 1: nat
core-02: Adapter 2: hostonly
==> core-02: Forwarding ports...
core-02: 22 => 2200 (adapter 1)
==> core-02: Running 'pre-boot' VM customizations...
==> core-02: Booting VM...
==> core-02: Waiting for machine to boot. This may take a few minutes...
core-02: SSH address: 127.0.0.1:2200
core-02: SSH username: core
core-02: SSH auth method: private key
core-02: Warning: Connection timeout. Retrying...
==> core-02: Machine booted and ready!
==> core-02: Setting hostname...
==> core-02: Configuring and enabling network interfaces...
==> core-02: Running provisioner: file...
==> core-02: Running provisioner: shell...
core-02: Running: inline script
==> core-03: Importing base box 'coreos-alpha'...
==> core-03: Matching MAC address for NAT networking...
==> core-03: Checking if box 'coreos-alpha' is up to date...
==> core-03: Setting the name of the VM: coreos-vagrant_core-03_1416954886820_7707
==> core-03: Fixed port collision for 22 => 2222. Now on port 2201.
==> core-03: Clearing any previously set network interfaces...
==> core-03: Preparing network interfaces based on configuration...
core-03: Adapter 1: nat
core-03: Adapter 2: hostonly
==> core-03: Forwarding ports...
core-03: 22 => 2201 (adapter 1)
==> core-03: Running 'pre-boot' VM customizations...
==> core-03: Booting VM...
==> core-03: Waiting for machine to boot. This may take a few minutes...
core-03: SSH address: 127.0.0.1:2201
core-03: SSH username: core
core-03: SSH auth method: private key
core-03: Warning: Connection timeout. Retrying...
==> core-03: Machine booted and ready!
==> core-03: Setting hostname...
==> core-03: Configuring and enabling network interfaces...
==> core-03: Running provisioner: file...
==> core-03: Running provisioner: shell...
core-03: Running: inline script
Check the result with vagrant status.
$ vagrant status
Current machine states:
core-01 running (virtualbox)
core-02 running (virtualbox)
core-03 running (virtualbox)
This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.
pochi:coreos-vagrant snumano$
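SSH into the first machine and check the cluster with the fleetctl command. It looks OK.
Checking from the other two machines gives the same result.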
$ vagrant ssh core-01
CoreOS (alpha)
core@core-01 ~ $ fleetctl list-machines -l
MACHINE IP METADATA
6562797c17224c35acb666f52bad25e0 172.17.8.102 -
8bb4edab9b71401f9455ae2b0c63f6f1 172.17.8.103 -
c595df046ef940d1b0a384225c4d528c 172.17.8.101 -
core@core-01 ~ $
$ vagrant ssh core-02
CoreOS (alpha)
core@core-02 ~ $ fleetctl list-machines -l
MACHINE IP METADATA
6562797c17224c35acb666f52bad25e0 172.17.8.102 -
8bb4edab9b71401f9455ae2b0c63f6f1 172.17.8.103 -
c595df046ef940d1b0a384225c4d528c 172.17.8.101 -
core@core-02 ~ $
$ vagrant ssh core-03
CoreOS (alpha)
core@core-03 ~ $ fleetctl list-machines -l
MACHINE IP METADATA
6562797c17224c35acb666f52bad25e0 172.17.8.102 -
8bb4edab9b71401f9455ae2b0c63f6f1 172.17.8.103 -
c595df046ef940d1b0a384225c4d528c 172.17.8.101 -
core@core-03 ~ $
This completes the CoreOS cluster setup.
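Comparing the three listings by eye works for three nodes, but a quick count is easier to script. A small sketch, assuming the list-machines output format shown above (one header row, then one row per machine); count_machines is a hypothetical helper name.

```shell
# Hypothetical helper: count the machines in "fleetctl list-machines -l"
# output read from stdin (the header row and blank lines are skipped).
count_machines() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }'
}

# Sample input copied from the listing above. On a node you would run:
#   fleetctl list-machines -l | count_machines
count_machines <<'EOF'
MACHINE IP METADATA
6562797c17224c35acb666f52bad25e0 172.17.8.102 -
8bb4edab9b71401f9455ae2b0c63f6f1 172.17.8.103 -
c595df046ef940d1b0a384225c4d528c 172.17.8.101 -
EOF
# prints 3
```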
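# Deploying with Fleet
Following the reference above, I installed Fleet on the Mac. From the Mac, I checked the CoreOS cluster with the fleetctl command as shown below. The endpoint is the etcd on core-01 (172.17.8.101:4001), one of the machines that joined the cluster through the discovery etcd configured earlier.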
$ cd <fleet dir>
$ bin/fleetctl --endpoint 'http://172.17.8.101:4001' list-machines -l
MACHINE IP METADATA
6562797c17224c35acb666f52bad25e0 172.17.8.102 -
8bb4edab9b71401f9455ae2b0c63f6f1 172.17.8.103 -
c595df046ef940d1b0a384225c4d528c 172.17.8.101 -
To match the later steps, set the FLEETCTL_ENDPOINT environment variable.
$ export FLEETCTL_ENDPOINT=http://172.17.8.101:4001
$ bin/fleetctl list-machines -l
MACHINE IP METADATA
6562797c17224c35acb666f52bad25e0 172.17.8.102 -
8bb4edab9b71401f9455ae2b0c63f6f1 172.17.8.103 -
c595df046ef940d1b0a384225c4d528c 172.17.8.101 -
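Adding the unit did not work when I followed the order in the Qiita article, so I followed the GitHub documentation instead.
The output shows which CoreOS machine the service was started on (in the capture below, 172.17.8.101, i.e. core-01).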
$ bin/fleetctl start examples/hello.service
Unit hello.service launched on 2ae5a2c5.../172.17.8.101
$ bin/fleetctl list-units
UNIT MACHINE ACTIVE SUB
hello.service 2ae5a2c5.../172.17.8.101 active running
$ bin/fleetctl list-unit-files
UNIT HASH DSTATE STATE TARGET
hello.service e55c0ae launched launched 2ae5a2c5.../172.17.8.101
For the next step I followed the Vagrant part of the "Remote Fleet access" documentation.
$ vagrant ssh-config
Host core-01
HostName 127.0.0.1
User core
Port 2222
UserKnownHostsFile /dev/null
StrictHostKeyChecking no
PasswordAuthentication no
IdentityFile /Users/snumano/.vagrant.d/insecure_private_key
IdentitiesOnly yes
LogLevel FATAL
Host core-02
HostName 127.0.0.1
User core
Port 2200
UserKnownHostsFile /dev/null
StrictHostKeyChecking no
PasswordAuthentication no
IdentityFile /Users/snumano/.vagrant.d/insecure_private_key
IdentitiesOnly yes
LogLevel FATAL
Host core-03
HostName 127.0.0.1
User core
Port 2201
UserKnownHostsFile /dev/null
StrictHostKeyChecking no
PasswordAuthentication no
IdentityFile /Users/snumano/.vagrant.d/insecure_private_key
IdentitiesOnly yes
LogLevel FATAL
$ vagrant ssh-config | sed -n "s/IdentityFile//gp" | xargs ssh-add
Identity added: /Users/snumano/.vagrant.d/insecure_private_key (/Users/snumano/.vagrant.d/insecure_private_key)
Identity added: /Users/snumano/.vagrant.d/insecure_private_key (/Users/snumano/.vagrant.d/insecure_private_key)
Identity added: /Users/snumano/.vagrant.d/insecure_private_key (/Users/snumano/.vagrant.d/insecure_private_key)
$ export FLEETCTL_TUNNEL="$(vagrant ssh-config | sed -n "s/[ ]*HostName[ ]*//gp"):$(vagrant ssh-config | sed -n "s/[ ]*Port[ ]*//gp")"
$ echo $FLEETCTL_TUNNEL
127.0.0.1 127.0.0.1 127.0.0.1:2222 2200 2201
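With three machines, the value echoed above strings together the HostName and Port entries of every Host block, so it is not a single host:port pair. Assuming FLEETCTL_TUNNEL is meant to hold one such address, one way to build it is to pick a single Host block out of the ssh-config output. This is only a sketch, and tunnel_for_host is a hypothetical helper name.

```shell
# Hypothetical helper: read "vagrant ssh-config" output on stdin and print
# "HostName:Port" for the one Host block named by the first argument.
# Exits non-zero when the host is not found.
tunnel_for_host() {
  awk -v host="$1" '
    $1 == "Host"                         { current = $2 }
    current == host && $1 == "HostName"  { name = $2 }
    current == host && $1 == "Port"      { port = $2 }
    END {
      if (name != "" && port != "") print name ":" port
      else exit 1
    }
  '
}

# Usage from the coreos-vagrant directory:
#   export FLEETCTL_TUNNEL="$(vagrant ssh-config | tunnel_for_host core-01)"
```

With the ssh-config shown above, selecting core-01 would give 127.0.0.1:2222.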
Check the service.
$ bin/fleetctl status hello.service
● hello.service - Hello World
Loaded: loaded (/run/fleet/units/hello.service; linked-runtime)
Active: active (running) since Wed 2014-11-26 21:47:05 UTC; 3min 44s ago
Main PID: 1335 (bash)
CGroup: /system.slice/hello.service
├─1335 /bin/bash -c while true; do echo "Hello, world"; sleep 1; done
└─1609 sleep 1
Nov 26 21:50:40 core-01 bash[1335]: Hello, world
Nov 26 21:50:41 core-01 bash[1335]: Hello, world
Nov 26 21:50:42 core-01 bash[1335]: Hello, world
Nov 26 21:50:43 core-01 bash[1335]: Hello, world
Nov 26 21:50:44 core-01 bash[1335]: Hello, world
Nov 26 21:50:45 core-01 bash[1335]: Hello, world
Nov 26 21:50:46 core-01 bash[1335]: Hello, world
Nov 26 21:50:47 core-01 bash[1335]: Hello, world
Nov 26 21:50:48 core-01 bash[1335]: Hello, world
Nov 26 21:50:49 core-01 bash[1335]: Hello, world
SSH into the CoreOS machine where the service is running.
You can see that the connection goes to core-01.
$ bin/fleetctl list-unit-files
UNIT HASH DSTATE STATE TARGET
hello.service e55c0ae launched launched 2ae5a2c5.../172.17.8.101
$ bin/fleetctl list-units
UNIT MACHINE ACTIVE SUB
hello.service 2ae5a2c5.../172.17.8.101 active running
$ bin/fleetctl ssh hello.service
Last login: Wed Nov 26 21:50:50 2014 from 172.17.8.1
CoreOS (alpha)
core@core-01 ~ $
Trying failover
I was able to confirm the behavior exactly as described in the reference above.
$ bin/fleetctl list-machines -l
MACHINE IP METADATA
2ae5a2c5f8d145a2a10a5b6ca44e5ac2 172.17.8.101 -
5571847dfe604f56a241aedaef1fa134 172.17.8.102 -
60c954604b86414888ba2051bdadfe1b 172.17.8.103 -
$ bin/fleetctl list-unit-files
UNIT HASH DSTATE STATE TARGET
hello.service e55c0ae launched inactive 5571847d.../172.17.8.102
$ bin/fleetctl list-units
UNIT MACHINE ACTIVE SUB
hello.service 5571847d.../172.17.8.102 active running
Suspend core-02 from another terminal.
$ vagrant suspend core-02
==> core-02: Saving VM state and suspending execution...
$ vagrant status
Current machine states:
core-01 running (virtualbox)
core-02 saved (virtualbox)
core-03 running (virtualbox)
core-02 has disappeared from the machine list, yet the service is still running, now on core-01.
$ bin/fleetctl list-machines -l
MACHINE IP METADATA
2ae5a2c5f8d145a2a10a5b6ca44e5ac2 172.17.8.101 -
60c954604b86414888ba2051bdadfe1b 172.17.8.103 -
$ bin/fleetctl list-unit-files
UNIT HASH DSTATE STATE TARGET
hello.service e55c0ae launched launched 2ae5a2c5.../172.17.8.101
$ bin/fleetctl list-units
UNIT MACHINE ACTIVE SUB
hello.service 2ae5a2c5.../172.17.8.101 active running
$ bin/fleetctl status hello.service
● hello.service - Hello World
Loaded: loaded (/run/fleet/units/hello.service; linked-runtime)
Active: active (running) since Wed 2014-11-26 21:55:55 UTC; 32s ago
Main PID: 1925 (bash)
CGroup: /system.slice/hello.service
├─1925 /bin/bash -c while true; do echo "Hello, world"; sleep 1; done
└─1958 sleep 1
Nov 26 21:56:18 core-01 bash[1925]: Hello, world
Nov 26 21:56:19 core-01 bash[1925]: Hello, world
Nov 26 21:56:20 core-01 bash[1925]: Hello, world
Nov 26 21:56:21 core-01 bash[1925]: Hello, world
Nov 26 21:56:22 core-01 bash[1925]: Hello, world
Nov 26 21:56:23 core-01 bash[1925]: Hello, world
Nov 26 21:56:24 core-01 bash[1925]: Hello, world
Nov 26 21:56:25 core-01 bash[1925]: Hello, world
Nov 26 21:56:26 core-01 bash[1925]: Hello, world
Nov 26 21:56:27 core-01 bash[1925]: Hello, world
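To watch the unit move during a failover test without reading the whole table each time, the machine column of list-units can be reduced to just the IP. A minimal sketch, assuming the "ID.../IP" format seen in the captures above; unit_ip is a hypothetical helper name.

```shell
# Hypothetical helper: given "fleetctl list-units" output on stdin, print the
# IP of the machine that the named unit is scheduled on.
unit_ip() {
  # The MACHINE column looks like "5571847d.../172.17.8.102"; strip
  # everything up to and including the last "/".
  awk -v unit="$1" '$1 == unit { sub(/^.*\//, "", $2); print $2 }'
}

# Rows copied from the captures above: before and after suspending core-02.
echo "hello.service 5571847d.../172.17.8.102 active running" | unit_ip hello.service
echo "hello.service 2ae5a2c5.../172.17.8.101 active running" | unit_ip hello.service
# prints 172.17.8.102, then 172.17.8.101
```

In a second terminal, something like `while sleep 1; do bin/fleetctl list-units | unit_ip hello.service; done` would show the address change once the unit has been rescheduled.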
Pitfalls
After deleting the Vagrant-built cluster with vagrant destroy and then recreating it with vagrant up, the fleetctl command stopped working properly and returned errors.
$ bin/fleetctl --endpoint 'http://172.17.8.101:4001' list-machines -l
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 200ms
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 400ms
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 800ms
---snip---
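I do not know the cause, but a few tens of seconds after I deleted the .etcd directory that had been created inside the etcd directory (pochi.local.etcd here), fleetctl started working normally again.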
$ pwd
<hogehoge>/etcd-v0.4.6-darwin-amd64
$ ls
README-etcd.md etcd pochi.local.etcd
README-etcdctl.md etcdctl
$ rm -fr pochi.local.etcd
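The need for this cleanup seems consistent with the warning in user-data that the discovery value should be replaced on every vagrant destroy, although that is a guess and not something verified here. If you destroy and recreate the cluster often, the removal can be wrapped in a slightly more defensive helper. A sketch under the assumption that etcd was started without an explicit data directory, so that it uses "<name>.etcd" as in the startup log earlier; reset_etcd_data is a hypothetical name.

```shell
# Hypothetical helper: remove the "<name>.etcd" data directory that etcd
# created under the given directory, and nothing else.
reset_etcd_data() {
  etcd_dir="$1"   # e.g. the unpacked etcd-v0.4.6-darwin-amd64 directory
  name="$2"       # etcd node name, e.g. pochi.local
  target="$etcd_dir/$name.etcd"
  # Refuse to run with an empty name or when the directory does not exist.
  if [ -z "$name" ] || [ ! -d "$target" ]; then
    echo "no data directory at $target" >&2
    return 1
  fi
  rm -rf -- "$target"
}

# Usage from inside the etcd directory:
#   reset_etcd_data . pochi.local
```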