
Trying Out etcd and fleet on CoreOS

Posted at 2014-11-27

What CoreOS provides looked interesting, so I tried each component at its latest version in my own environment.

Environment

  • Mac OS X 10.9.5
  • Vagrant 1.6.5
  • VirtualBox 4.3.20
  • Golang 1.3.3

Environment Setup

etcd

etcd configuration overview

Starting etcd with the above article as a reference.
The GitHub README says "It is strongly recommended that users work with the latest 0.4.x release (0.4.6)", so I avoided the newest 0.5 and used 0.4.6.
Checking the options with etcd --help, they had changed from what the Qiita article described. For -addr= I used the address 172.17.8.1, because when I later booted CoreOS with Vagrant, that address was in use as vboxnet5.

$ curl -L https://github.com/coreos/etcd/releases/download/v0.4.6/etcd-v0.4.6-darwin-amd64.zip -o etcd-v0.4.6-darwin-amd64.zip
$ unzip etcd-v0.4.6-darwin-amd64.zip
$ cd etcd-v0.4.6-darwin-amd64
$ ./etcd -addr=172.17.8.1:4001 -v
[etcd] Nov 26 07:09:58.687 WARNING   | Using the directory pochi.local.etcd as the etcd curation directory because a directory was not specified.
[etcd] Nov 26 07:09:58.690 DEBUG     | pochi.local finished load snapshot
[etcd] Nov 26 07:09:58.698 DEBUG     | URLs: /_etcd/machines:  / pochi.local (http://127.0.0.1:7001)
[etcd] Nov 26 07:09:58.698 INFO      | Peer URLs in log:  / pochi.local (http://127.0.0.1:7001)
[etcd] Nov 26 07:09:58.699 DEBUG     | pochi.local is restarting the cluster []
[etcd] Nov 26 07:09:58.700 INFO      | etcd server [name pochi.local, listen on :4001, advertised url http://172.17.8.1:4001]
[etcd] Nov 26 07:09:58.700 INFO      | peer server [name pochi.local, listen on :7001, advertised url http://127.0.0.1:7001]
[etcd] Nov 26 07:09:58.700 INFO      | pochi.local starting in peer mode
[etcd] Nov 26 07:09:58.700 INFO      | pochi.local: state changed from 'initialized' to 'follower'.
[etcd] Nov 26 07:09:58.933 INFO      | pochi.local: state changed from 'follower' to 'candidate'.
[etcd] Nov 26 07:09:58.933 INFO      | pochi.local: state changed from 'candidate' to 'leader'.
[etcd] Nov 26 07:09:58.933 INFO      | pochi.local: leader changed from '' to 'pochi.local'.
[etcd] Nov 26 07:09:59.701 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)
[etcd] Nov 26 07:10:00.703 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)
[etcd] Nov 26 07:10:01.700 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)
[etcd] Nov 26 07:10:01.704 INFO      | pochi.local: snapshot of 18908 events at index 18908 completed
[etcd] Nov 26 07:10:02.701 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)
[etcd] Nov 26 07:10:03.701 DEBUG     | URLs: /_etcd/machines:  /  (pochi.local)
$ curl http://172.17.8.1:4001/version
etcd 0.4.6
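As a further smoke test, a key can be written and read back through the etcd v2 keys API. The key name /message is arbitrary, and the canned response below is an illustration of the v2 response shape, not captured from this cluster:

```shell
# Against the etcd started above, a round trip would look like:
#   curl -s -X PUT http://172.17.8.1:4001/v2/keys/message -d value="Hello"
#   curl -s http://172.17.8.1:4001/v2/keys/message
# A GET response looks roughly like the sample below; the value can be
# pulled out with sed.
resp='{"action":"get","node":{"key":"/message","value":"Hello","modifiedIndex":4,"createdIndex":4}}'
value=$(printf '%s' "$resp" | sed 's/.*"value":"\([^"]*\)".*/\1/')
echo "$value"   # → Hello
```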

CoreOS Vagrant

It had been a while since I'd touched Vagrant, and the version had gone from 1.3.x to 1.6.x. Sure enough, errors came up, so, referring to the article linked below, I ran rm -r ~/.vagrant.d/plugins.json ~/.vagrant.d/gems.

Then cp user-data.sample user-data and edit the discovery line in user-data. Note that the original sample uses https.
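The edit itself is a one-line substitution. A sketch, using a stand-in file in place of the real user-data (the replacement URL is the one used in this article; the sed-to-temp-file form is used because in-place -i differs between BSD and GNU sed):

```shell
# Stand-in user-data containing the sample's https discovery line.
f=$(mktemp)
printf 'discovery: https://discovery.etcd.io/TOKEN\n' > "$f"
# Replace the sample discovery URL with the local etcd endpoint.
sed 's|discovery: https://discovery.etcd.io/.*|discovery: http://172.17.8.1:4001/v2/keys/machines|' "$f" > "$f.new" && mv "$f.new" "$f"
cat "$f"   # → discovery: http://172.17.8.1:4001/v2/keys/machines
```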

#cloud-config

coreos:
  etcd:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    # WARNING: replace each time you 'vagrant destroy'
    discovery: http://172.17.8.1:4001/v2/keys/machines
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  fleet:
    public-ip: $public_ipv4
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
    - name: docker-tcp.socket
      command: start
      enable: true
      content: |
        [Unit]
        Description=Docker Socket for the API

        [Socket]
        ListenStream=2375
        Service=docker.service
        BindIPv6Only=both

        [Install]
        WantedBy=sockets.target

vagrant up succeeded without trouble.

Upgraded Vagrant from 1.3.5 to 1.6.3 (now using Vagrant Cloud)

$ vagrant up
Bringing machine 'core-01' up with 'virtualbox' provider...
Bringing machine 'core-02' up with 'virtualbox' provider...
Bringing machine 'core-03' up with 'virtualbox' provider...
==> core-01: Importing base box 'coreos-alpha'...
==> core-01: Matching MAC address for NAT networking...
==> core-01: Checking if box 'coreos-alpha' is up to date...
==> core-01: Setting the name of the VM: coreos-vagrant_core-01_1416954842606_17056
==> core-01: Clearing any previously set network interfaces...
==> core-01: Preparing network interfaces based on configuration...
    core-01: Adapter 1: nat
    core-01: Adapter 2: hostonly
==> core-01: Forwarding ports...
    core-01: 22 => 2222 (adapter 1)
==> core-01: Running 'pre-boot' VM customizations...
==> core-01: Booting VM...
==> core-01: Waiting for machine to boot. This may take a few minutes...
    core-01: SSH address: 127.0.0.1:2222
    core-01: SSH username: core
    core-01: SSH auth method: private key
    core-01: Warning: Connection timeout. Retrying...
==> core-01: Machine booted and ready!
==> core-01: Setting hostname...
==> core-01: Configuring and enabling network interfaces...
==> core-01: Running provisioner: file...
==> core-01: Running provisioner: shell...
    core-01: Running: inline script
==> core-02: Importing base box 'coreos-alpha'...
==> core-02: Matching MAC address for NAT networking...
==> core-02: Checking if box 'coreos-alpha' is up to date...
==> core-02: Setting the name of the VM: coreos-vagrant_core-02_1416954865219_3677
==> core-02: Fixed port collision for 22 => 2222. Now on port 2200.
==> core-02: Clearing any previously set network interfaces...
==> core-02: Preparing network interfaces based on configuration...
    core-02: Adapter 1: nat
    core-02: Adapter 2: hostonly
==> core-02: Forwarding ports...
    core-02: 22 => 2200 (adapter 1)
==> core-02: Running 'pre-boot' VM customizations...
==> core-02: Booting VM...
==> core-02: Waiting for machine to boot. This may take a few minutes...
    core-02: SSH address: 127.0.0.1:2200
    core-02: SSH username: core
    core-02: SSH auth method: private key
    core-02: Warning: Connection timeout. Retrying...
==> core-02: Machine booted and ready!
==> core-02: Setting hostname...
==> core-02: Configuring and enabling network interfaces...
==> core-02: Running provisioner: file...
==> core-02: Running provisioner: shell...
    core-02: Running: inline script
==> core-03: Importing base box 'coreos-alpha'...
==> core-03: Matching MAC address for NAT networking...
==> core-03: Checking if box 'coreos-alpha' is up to date...
==> core-03: Setting the name of the VM: coreos-vagrant_core-03_1416954886820_7707
==> core-03: Fixed port collision for 22 => 2222. Now on port 2201.
==> core-03: Clearing any previously set network interfaces...
==> core-03: Preparing network interfaces based on configuration...
    core-03: Adapter 1: nat
    core-03: Adapter 2: hostonly
==> core-03: Forwarding ports...
    core-03: 22 => 2201 (adapter 1)
==> core-03: Running 'pre-boot' VM customizations...
==> core-03: Booting VM...
==> core-03: Waiting for machine to boot. This may take a few minutes...
    core-03: SSH address: 127.0.0.1:2201
    core-03: SSH username: core
    core-03: SSH auth method: private key
    core-03: Warning: Connection timeout. Retrying...
==> core-03: Machine booted and ready!
==> core-03: Setting hostname...
==> core-03: Configuring and enabling network interfaces...
==> core-03: Running provisioner: file...
==> core-03: Running provisioner: shell...
    core-03: Running: inline script

Check with vagrant status.

$ vagrant status
Current machine states:

core-01                   running (virtualbox)
core-02                   running (virtualbox)
core-03                   running (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.

SSH into the first machine created and check with the fleetctl command: OK.
Checking the other two machines shows the same output.

$ vagrant ssh core-01
CoreOS (alpha)
core@core-01 ~ $ fleetctl list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-
core@core-01 ~ $
$ vagrant ssh core-02
CoreOS (alpha)
core@core-02 ~ $ fleetctl list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-
core@core-02 ~ $
$ vagrant ssh core-03
CoreOS (alpha)
core@core-03 ~ $ fleetctl list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-
core@core-03 ~ $

That completes the CoreOS cluster setup.

Deploying with fleet

Following the above, I installed fleet on the Mac. From the Mac, the CoreOS cluster can be checked with the fleetctl command as shown below; the endpoint points at etcd on one of the cluster nodes set up earlier.

$ cd <fleet dir>
$ bin/fleetctl --endpoint 'http://172.17.8.101:4001' list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-

To keep the later steps consistent, set the FLEETCTL_ENDPOINT environment variable.

$ export FLEETCTL_ENDPOINT=http://172.17.8.101:4001
$ bin/fleetctl list-machines -l
MACHINE					IP		METADATA
6562797c17224c35acb666f52bad25e0	172.17.8.102	-
8bb4edab9b71401f9455ae2b0c63f6f1	172.17.8.103	-
c595df046ef940d1b0a384225c4d528c	172.17.8.101	-

Adding a unit did not work in the order given in the Qiita article, so I followed the GitHub documentation instead.
The output shows the service running on one of the CoreOS nodes (.101).
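For reference, examples/hello.service is a minimal systemd unit along these lines. This is a sketch reconstructed from the Hello World loop visible in the fleetctl status output later in this article; the actual file ships in the fleet repository's examples/ directory:

```ini
[Unit]
Description=Hello World

[Service]
ExecStart=/bin/bash -c "while true; do echo \"Hello, world\"; sleep 1; done"
```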

$ bin/fleetctl start examples/hello.service
Unit hello.service launched on 2ae5a2c5.../172.17.8.101
$ bin/fleetctl list-units
UNIT		MACHINE				ACTIVE	SUB
hello.service	2ae5a2c5.../172.17.8.101	active	running
$ bin/fleetctl list-unit-files
UNIT		HASH	DSTATE		STATE		TARGET
hello.service	e55c0ae	launched	launched	2ae5a2c5.../172.17.8.101

I followed the Vagrant part of "Remote fleet access" here.

$ vagrant ssh-config
Host core-01
  HostName 127.0.0.1
  User core
  Port 2222
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/snumano/.vagrant.d/insecure_private_key
  IdentitiesOnly yes
  LogLevel FATAL

Host core-02
  HostName 127.0.0.1
  User core
  Port 2200
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/snumano/.vagrant.d/insecure_private_key
  IdentitiesOnly yes
  LogLevel FATAL

Host core-03
  HostName 127.0.0.1
  User core
  Port 2201
  UserKnownHostsFile /dev/null
  StrictHostKeyChecking no
  PasswordAuthentication no
  IdentityFile /Users/snumano/.vagrant.d/insecure_private_key
  IdentitiesOnly yes
  LogLevel FATAL

$ vagrant ssh-config | sed -n "s/IdentityFile//gp" | xargs ssh-add
Identity added: /Users/snumano/.vagrant.d/insecure_private_key (/Users/snumano/.vagrant.d/insecure_private_key)
Identity added: /Users/snumano/.vagrant.d/insecure_private_key (/Users/snumano/.vagrant.d/insecure_private_key)
Identity added: /Users/snumano/.vagrant.d/insecure_private_key (/Users/snumano/.vagrant.d/insecure_private_key)
$ export FLEETCTL_TUNNEL="$(vagrant ssh-config | sed -n "s/[ ]*HostName[ ]*//gp"):$(vagrant ssh-config | sed -n "s/[ ]*Port[ ]*//gp")"
$ echo $FLEETCTL_TUNNEL
127.0.0.1 127.0.0.1 127.0.0.1:2222 2200 2201
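The zipped value above comes out mangled (all HostNames first, then all Ports), and fleetctl's --tunnel expects a single SSH endpoint anyway. A sketch that pairs each HostName with its Port and takes the first pair, run here against sample ssh-config output like the above:

```shell
# Pair HostName with Port within each Host block; use the first pair.
ssh_config='Host core-01
  HostName 127.0.0.1
  Port 2222
Host core-02
  HostName 127.0.0.1
  Port 2200'
tunnel=$(printf '%s\n' "$ssh_config" | awk '
  $1 == "HostName" { host = $2 }
  $1 == "Port"     { print host ":" $2; exit }')
echo "$tunnel"   # → 127.0.0.1:2222
# In the real setup: export FLEETCTL_TUNNEL="$tunnel"
```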

Check the service.

$ bin/fleetctl status hello.service
● hello.service - Hello World
   Loaded: loaded (/run/fleet/units/hello.service; linked-runtime)
   Active: active (running) since Wed 2014-11-26 21:47:05 UTC; 3min 44s ago
 Main PID: 1335 (bash)
   CGroup: /system.slice/hello.service
           ├─1335 /bin/bash -c while true; do echo "Hello, world"; sleep 1; done
           └─1609 sleep 1

Nov 26 21:50:40 core-01 bash[1335]: Hello, world
Nov 26 21:50:41 core-01 bash[1335]: Hello, world
Nov 26 21:50:42 core-01 bash[1335]: Hello, world
Nov 26 21:50:43 core-01 bash[1335]: Hello, world
Nov 26 21:50:44 core-01 bash[1335]: Hello, world
Nov 26 21:50:45 core-01 bash[1335]: Hello, world
Nov 26 21:50:46 core-01 bash[1335]: Hello, world
Nov 26 21:50:47 core-01 bash[1335]: Hello, world
Nov 26 21:50:48 core-01 bash[1335]: Hello, world
Nov 26 21:50:49 core-01 bash[1335]: Hello, world

SSH into the CoreOS node where the service is running.
You can see it connects to core-01.

$ bin/fleetctl list-unit-files
UNIT		HASH	DSTATE		STATE		TARGET
hello.service	e55c0ae	launched	launched	2ae5a2c5.../172.17.8.101
$ bin/fleetctl list-units
UNIT		MACHINE				ACTIVE	SUB
hello.service	2ae5a2c5.../172.17.8.101	active	running
$ bin/fleetctl ssh hello.service
Last login: Wed Nov 26 21:50:50 2014 from 172.17.8.1
CoreOS (alpha)
core@core-01 ~ $

Testing failover

Following the steps above as-is, I was able to confirm the behavior.

$ bin/fleetctl list-machines -l
MACHINE					IP		METADATA
2ae5a2c5f8d145a2a10a5b6ca44e5ac2	172.17.8.101	-
5571847dfe604f56a241aedaef1fa134	172.17.8.102	-
60c954604b86414888ba2051bdadfe1b	172.17.8.103	-
$ bin/fleetctl list-unit-files
UNIT		HASH	DSTATE		STATE		TARGET
hello.service	e55c0ae	launched	inactive	5571847d.../172.17.8.102
$ bin/fleetctl list-units
UNIT		MACHINE				ACTIVE	SUB
hello.service	5571847d.../172.17.8.102	active	running

Stop core-02 from another terminal.

$ vagrant suspend core-02
==> core-02: Saving VM state and suspending execution...
$ vagrant status
Current machine states:

core-01                   running (virtualbox)
core-02                   saved (virtualbox)
core-03                   running (virtualbox)

You can see core-02 is gone from the list; the service, meanwhile, is now running on core-01.

$ bin/fleetctl list-machines -l
MACHINE					IP		METADATA
2ae5a2c5f8d145a2a10a5b6ca44e5ac2	172.17.8.101	-
60c954604b86414888ba2051bdadfe1b	172.17.8.103	-
$ bin/fleetctl list-unit-files
UNIT		HASH	DSTATE		STATE		TARGET
hello.service	e55c0ae	launched	launched	2ae5a2c5.../172.17.8.101
$ bin/fleetctl list-units
UNIT		MACHINE				ACTIVE	SUB
hello.service	2ae5a2c5.../172.17.8.101	active	running
$ bin/fleetctl status hello.service
● hello.service - Hello World
   Loaded: loaded (/run/fleet/units/hello.service; linked-runtime)
   Active: active (running) since Wed 2014-11-26 21:55:55 UTC; 32s ago
 Main PID: 1925 (bash)
   CGroup: /system.slice/hello.service
           ├─1925 /bin/bash -c while true; do echo "Hello, world"; sleep 1; done
           └─1958 sleep 1

Nov 26 21:56:18 core-01 bash[1925]: Hello, world
Nov 26 21:56:19 core-01 bash[1925]: Hello, world
Nov 26 21:56:20 core-01 bash[1925]: Hello, world
Nov 26 21:56:21 core-01 bash[1925]: Hello, world
Nov 26 21:56:22 core-01 bash[1925]: Hello, world
Nov 26 21:56:23 core-01 bash[1925]: Hello, world
Nov 26 21:56:24 core-01 bash[1925]: Hello, world
Nov 26 21:56:25 core-01 bash[1925]: Hello, world
Nov 26 21:56:26 core-01 bash[1925]: Hello, world
Nov 26 21:56:27 core-01 bash[1925]: Hello, world

Gotchas

After deleting the Vagrant-created cluster with vagrant destroy and recreating it with vagrant up, the fleet commands no longer worked and produced errors.

$ bin/fleetctl --endpoint 'http://172.17.8.101:4001' list-machines -l
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 200ms
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 400ms
2014/11/26 23:49:22 INFO client.go:291: Failed getting response from http://172.17.8.101:4001/: dial tcp 172.17.8.101:4001: connection refused
2014/11/26 23:49:22 ERROR client.go:213: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 800ms
---snip---

The cause is unknown, but after I deleted the .etcd directory that had been created inside the etcd directory and waited a few dozen seconds, fleetctl started working normally again.

$ pwd
<hogehoge>/etcd-v0.4.6-darwin-amd64
$ ls
README-etcd.md		etcd			pochi.local.etcd
README-etcdctl.md	etcdctl
$ rm -fr pochi.local.etcd
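The data directory etcd created defaults to <hostname>.etcd (pochi.local.etcd here, as the startup log above shows), so the cleanup can be scripted before recreating the cluster. A sketch, using a throwaway directory as a stand-in for the etcd-v0.4.6 directory:

```shell
# Remove stale *.etcd data directories before restarting etcd.
etcd_home=$(mktemp -d)              # stand-in for etcd-v0.4.6-darwin-amd64
mkdir "$etcd_home/pochi.local.etcd" # what a previous run leaves behind
for d in "$etcd_home"/*.etcd; do
  [ -d "$d" ] && rm -rf "$d"
done
ls -A "$etcd_home"                  # prints nothing: directory is empty
```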