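Introduction
By installing the CloudWatch Agent on on-premises servers, you can manage metrics and logs even in environments outside AWS. One common requirement is to send those metrics and logs over Site-to-Site VPN or Direct Connect rather than over the internet. Traffic sent over the internet is encrypted with HTTPS, so the internet path is not inherently a problem, but company policy sometimes rules it out.
So I verified how to install the CloudWatch Agent on an on-premises server and manage metrics and logs over Site-to-Site VPN. This article walks through the setup.
Architecture diagram
The architecture looks like this.
When the on-premises side connects over the VPN, creating the following three VPC Endpoints makes this kind of management possible.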
com.amazonaws.ap-northeast-1.monitoring
com.amazonaws.ap-northeast-1.logs
com.amazonaws.ap-northeast-1.ec2
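The three service names in the list above follow one pattern, so for another Region they can be derived rather than typed by hand. A minimal Python sketch; the helper name `endpoint_service_names` is made up for illustration, and it assumes the `com.amazonaws.<region>.<service>` naming seen in the list.

```python
# Hypothetical helper (not part of any AWS tooling): builds the three
# VPC Endpoint service names used in this article for a given Region.
SERVICES = ("monitoring", "logs", "ec2")


def endpoint_service_names(region):
    """Return the service names in the same order as the list above."""
    return [f"com.amazonaws.{region}.{service}" for service in SERVICES]


if __name__ == "__main__":
    for name in endpoint_service_names("ap-northeast-1"):
        print(name)
```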
Let's walk through the actual steps.
VPC Endpoint
Create the VPC Endpoints. Click Create endpoint.
In the Service field, select the service to create the endpoint for. The following three are required, so create them one at a time.
com.amazonaws.ap-northeast-1.monitoring
com.amazonaws.ap-northeast-1.logs
com.amazonaws.ap-northeast-1.ec2
Specify the subnets in which to create the VPC Endpoint.
Click Create endpoint.
The endpoint for Monitoring has been created.
I then created all three.
The DNS name of each endpoint looks like this.
vpce-0a7925f3ac48e6078-fatg03ru.ec2.ap-northeast-1.vpce.amazonaws.com
vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com
vpce-0f7c537db64bb1a01-75pscc0n.monitoring.ap-northeast-1.vpce.amazonaws.com
Each of these DNS names can be resolved from the on-premises side, and private IP addresses are returned.
[centos@ip-192-168-1-170 ~]$ dig vpce-0a7925f3ac48e6078-fatg03ru.ec2.ap-northeast-1.vpce.amazonaws.com A
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.13 <<>> vpce-0a7925f3ac48e6078-fatg03ru.ec2.ap-northeast-1.vpce.amazonaws.com A
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46054
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;vpce-0a7925f3ac48e6078-fatg03ru.ec2.ap-northeast-1.vpce.amazonaws.com. IN A
;; ANSWER SECTION:
vpce-0a7925f3ac48e6078-fatg03ru.ec2.ap-northeast-1.vpce.amazonaws.com. 60 IN A 10.0.0.165
vpce-0a7925f3ac48e6078-fatg03ru.ec2.ap-northeast-1.vpce.amazonaws.com. 60 IN A 10.0.102.220
vpce-0a7925f3ac48e6078-fatg03ru.ec2.ap-northeast-1.vpce.amazonaws.com. 60 IN A 10.0.101.103
;; Query time: 2 msec
;; SERVER: 192.168.0.2#53(192.168.0.2)
;; WHEN: Sat Jun 17 07:08:43 UTC 2023
;; MSG SIZE rcvd: 146
[centos@ip-192-168-1-170 ~]$ dig vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com A
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.13 <<>> vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com A
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23648
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com. IN A
;; ANSWER SECTION:
vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com. 60 IN A 10.0.102.212
vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com. 60 IN A 10.0.101.153
vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com. 60 IN A 10.0.100.105
;; Query time: 1 msec
;; SERVER: 192.168.0.2#53(192.168.0.2)
;; WHEN: Sat Jun 17 07:09:12 UTC 2023
;; MSG SIZE rcvd: 147
[centos@ip-192-168-1-170 ~]$ dig vpce-0f7c537db64bb1a01-75pscc0n.monitoring.ap-northeast-1.vpce.amazonaws.com A
; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.el7_9.13 <<>> vpce-0f7c537db64bb1a01-75pscc0n.monitoring.ap-northeast-1.vpce.amazonaws.com A
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19057
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;vpce-0f7c537db64bb1a01-75pscc0n.monitoring.ap-northeast-1.vpce.amazonaws.com. IN A
;; ANSWER SECTION:
vpce-0f7c537db64bb1a01-75pscc0n.monitoring.ap-northeast-1.vpce.amazonaws.com. 60 IN A 10.0.101.175
vpce-0f7c537db64bb1a01-75pscc0n.monitoring.ap-northeast-1.vpce.amazonaws.com. 60 IN A 10.0.100.63
vpce-0f7c537db64bb1a01-75pscc0n.monitoring.ap-northeast-1.vpce.amazonaws.com. 60 IN A 10.0.102.130
;; Query time: 2 msec
;; SERVER: 192.168.0.2#53(192.168.0.2)
;; WHEN: Sat Jun 17 07:09:24 UTC 2023
;; MSG SIZE rcvd: 153
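The same check can be scripted instead of reading dig output by eye. A minimal sketch, assuming Python 3 is available on the on-premises host; `resolve_ipv4` and `all_private` are illustrative names, not part of any AWS tooling. It resolves a name with the standard library and confirms that every returned address is private.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver gives."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})


def all_private(addresses):
    """True only if the list is non-empty and every address is private."""
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_private for address in addresses
    )
```

On the on-premises host, `all_private(resolve_ipv4("<endpoint DNS name>"))` should come back `True` when the name resolves to the endpoint's private addresses, as in the dig output above.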
Creating the IAM User
Create an IAM User so that a secret access key can be assigned to the on-premises server.
Enter any IAM User name.
Attach CloudWatchAgentServerPolicy.
Click Create user.
Now that the IAM User has been created, click Create access key.
Select Application running outside AWS.
Click Create access key.
On-premises side
As a simulated on-premises environment, I place a Linux machine (an EC2 instance on AWS) in a private subnet.
SSH into that Linux machine and download the installer with wget.
wget https://s3.amazonaws.com/amazoncloudwatch-agent/centos/amd64/latest/amazon-cloudwatch-agent.rpm
Install it.
sudo yum localinstall amazon-cloudwatch-agent.rpm
Launch the wizard to configure the CloudWatch Agent.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
Select Linux
================================================================
= Welcome to the Amazon CloudWatch Agent Configuration Manager =
= =
= CloudWatch Agent allows you to collect metrics and logs from =
= your host and send them to CloudWatch. Additional CloudWatch =
= charges may apply. =
================================================================
On which OS are you planning to use the agent?
1. linux
2. windows
3. darwin
default choice: [1]:
On-Premises
Trying to fetch the default region based on ec2 metadata...
Are you using EC2 or On-Premises hosts?
1. EC2
2. On-Premises
default choice: [1]:
2
root
Please make sure the credentials and region set correctly on your hosts.
Refer to http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html
Which user are you planning to run the agent?
1. root
2. cwagent
3. others
default choice: [1]:
1
1
1. yes
2. no
default choice: [1]:
1
default
Which port do you want StatsD daemon to listen to?
default choice: [8125]
default
What is the collect interval for StatsD daemon?
1. 10s
2. 30s
3. 60s
default choice: [1]:
default
What is the aggregation interval for metrics collected by StatsD daemon?
1. Do not aggregate
2. 10s
3. 30s
4. 60s
default choice: [4]:
2
Do you want to monitor metrics from CollectD? WARNING: CollectD must be installed or the Agent will fail to start
1. yes
2. no
default choice: [1]:
2
1
Do you want to monitor any host metrics? e.g. CPU, memory, etc.
1. yes
2. no
default choice: [1]:
1
1
Do you want to monitor cpu metrics per core?
1. yes
2. no
default choice: [1]:
1
4
Would you like to collect your metrics at high resolution (sub-minute resolution)? This enables sub-minute resolution for all metrics, but you can customize for specific metrics in the output json file.
1. 1s
2. 10s
3. 30s
4. 60s
default choice: [4]:
4
1
Which default metrics config do you want?
1. Basic
2. Standard
3. Advanced
4. None
default choice: [1]:
1
1
Are you satisfied with the above config? Note: it can be manually customized after the wizard completes to add additional items.
1. yes
2. no
default choice: [1]:
1
2
Do you have any existing CloudWatch Log Agent (http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/AgentReference.html) configuration file to import for migration?
1. yes
2. no
default choice: [2]:
2
Specify the file to forward to CloudWatch Logs
I use /var/log/messages here. Any file can be specified.
Log file path:
/var/log/messages
Leave the remaining settings at their defaults.
Log group name:
default choice: [messages]
Log stream name:
default choice: [{hostname}]
Log Group Retention in days
1. -1
2. 1
3. 3
4. 5
5. 7
6. 14
7. 30
8. 60
9. 90
10. 120
11. 150
12. 180
13. 365
14. 400
15. 545
16. 731
17. 1827
18. 2192
19. 2557
20. 2922
21. 3288
22. 3653
default choice: [1]:
Do you want to specify any additional log files to monitor?
1. yes
2. no
default choice: [1]:
2
Saved config file to /opt/aws/amazon-cloudwatch-agent/bin/config.json successfully.
Current config as follows:
{
"agent": {
"metrics_collection_interval": 60,
"run_as_user": "root"
},
"logs": {
"logs_collected": {
"files": {
"collect_list": [
{
"file_path": "/var/log/messages",
"log_group_name": "messages",
"log_stream_name": "{hostname}",
"retention_in_days": -1
}
]
}
}
},
"metrics": {
"metrics_collected": {
"cpu": {
"measurement": [
"cpu_usage_idle"
],
"metrics_collection_interval": 60,
"resources": [
"*"
],
"totalcpu": true
},
"disk": {
"measurement": [
"used_percent"
],
"metrics_collection_interval": 60,
"resources": [
"*"
]
},
"diskio": {
"measurement": [
"write_bytes",
"read_bytes",
"writes",
"reads"
],
"metrics_collection_interval": 60,
"resources": [
"*"
]
},
"mem": {
"measurement": [
"mem_used_percent"
],
"metrics_collection_interval": 60
},
"net": {
"measurement": [
"bytes_sent",
"bytes_recv",
"packets_sent",
"packets_recv"
],
"metrics_collection_interval": 60,
"resources": [
"*"
]
},
"statsd": {
"metrics_aggregation_interval": 60,
"metrics_collection_interval": 10,
"service_address": ":8125"
},
"swap": {
"measurement": [
"swap_used_percent"
],
"metrics_collection_interval": 60
}
}
}
}
Please check the above content of the config.
The config file is also located at /opt/aws/amazon-cloudwatch-agent/bin/config.json.
Edit it manually if needed.
Before changing the CloudWatch Agent configuration file by hand, take a backup first.
cp -p /opt/aws/amazon-cloudwatch-agent/bin/config.json /opt/aws/amazon-cloudwatch-agent/bin/config.json.old
Edit the file with vim.
vim /opt/aws/amazon-cloudwatch-agent/bin/config.json
Setting endpoint_override as in the following line makes the agent use a VPC Endpoint created on the AWS side.
"endpoint_override": "vpce-XXXXXXXXXXXXXXXXXXXXXXXXX.monitoring.us-east-1.vpce.amazonaws.com",
After editing, the file looks like this. An endpoint is specified in each of the following two sections:
- logs
- metrics
{
"agent": {
"metrics_collection_interval": 60,
"run_as_user": "root"
},
"logs": {
"endpoint_override": "vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com",
"logs_collected": {
"files": {
"collect_list": [
{
"file_path": "/var/log/messages",
"log_group_name": "messages",
"log_stream_name": "{hostname}",
"retention_in_days": -1
}
]
}
}
},
"metrics": {
"endpoint_override": "vpce-0f7c537db64bb1a01-75pscc0n.monitoring.ap-northeast-1.vpce.amazonaws.com",
"metrics_collected": {
"cpu": {
"measurement": [
"cpu_usage_idle"
],
"metrics_collection_interval": 60,
"resources": [
"*"
],
"totalcpu": true
},
"disk": {
"measurement": [
"used_percent"
],
"metrics_collection_interval": 60,
"resources": [
"*"
]
},
"diskio": {
"measurement": [
"write_bytes",
"read_bytes",
"writes",
"reads"
],
"metrics_collection_interval": 60,
"resources": [
"*"
]
},
"mem": {
"measurement": [
"mem_used_percent"
],
"metrics_collection_interval": 60
},
"net": {
"measurement": [
"bytes_sent",
"bytes_recv",
"packets_sent",
"packets_recv"
],
"metrics_collection_interval": 60,
"resources": [
"*"
]
},
"statsd": {
"metrics_aggregation_interval": 60,
"metrics_collection_interval": 10,
"service_address": ":8125"
},
"swap": {
"measurement": [
"swap_used_percent"
],
"metrics_collection_interval": 60
}
}
}
}
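The manual edit can also be scripted, which helps when several servers need the same change. A minimal sketch, assuming the wizard-generated config.json structure shown above; the function name `add_endpoint_overrides` is hypothetical, and the DNS names in the example are the ones from this article.

```python
import copy
import json


def add_endpoint_overrides(config, logs_endpoint, metrics_endpoint):
    """Return a copy of config with endpoint_override set where a section exists."""
    updated = copy.deepcopy(config)
    overrides = {"logs": logs_endpoint, "metrics": metrics_endpoint}
    for section, endpoint in overrides.items():
        # Sections the wizard did not generate are left out rather than created.
        if section in updated:
            updated[section]["endpoint_override"] = endpoint
    return updated


if __name__ == "__main__":
    # Trimmed-down stand-in for the wizard output; a real run would
    # json.load() the config.json file instead.
    sample = {
        "agent": {"metrics_collection_interval": 60, "run_as_user": "root"},
        "logs": {"logs_collected": {"files": {"collect_list": []}}},
        "metrics": {"metrics_collected": {}},
    }
    result = add_endpoint_overrides(
        sample,
        "vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com",
        "vpce-0f7c537db64bb1a01-75pscc0n.monitoring.ap-northeast-1.vpce.amazonaws.com",
    )
    print(json.dumps(result, indent=2))
```

The function works on a copy, so the original dictionary stays available for comparison, much like the config.json.old backup taken earlier.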
To set up credentials, install the AWS CLI.
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
Use aws configure to enter the secret key and the other values.
aws configure
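On the CloudWatch Agent side, point to the credentials file created through the AWS CLI. The path used here, /root/.aws/credentials, assumes aws configure was run as root.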
cp -p /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml.old
vi /opt/aws/amazon-cloudwatch-agent/etc/common-config.toml
[credentials]
shared_credential_profile = "default"
shared_credential_file = "/root/.aws/credentials"
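Before starting the agent, it can be worth confirming that the profile named in common-config.toml really exists in the credentials file. A minimal sketch using Python's configparser, assuming the INI layout that aws configure writes (`aws_access_key_id` and `aws_secret_access_key` under a `[default]` section); `profile_has_keys` is an illustrative name, and the key values in the sample are dummies.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def profile_has_keys(credentials_text, profile="default"):
    """True if the profile section exists and both key entries are non-empty."""
    # interpolation=None keeps any '%' in a value from being treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return False
    return all(
        parser.get(profile, key, fallback="").strip() for key in REQUIRED_KEYS
    )


if __name__ == "__main__":
    # Dummy content in the shape of a credentials file.
    sample = (
        "[default]\n"
        "aws_access_key_id = AKIAEXAMPLEEXAMPLE\n"
        "aws_secret_access_key = dummy-secret-value\n"
    )
    print(profile_has_keys(sample))
    print(profile_has_keys(sample, "other"))
```

For a real check, pass the contents of the file named in shared_credential_file, for example by reading /root/.aws/credentials as root.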
Start the CloudWatch Agent.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m onPremise -s -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
Here is example output from starting the agent.
[root@ip-192-168-1-170 .aws]# sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m onPremise -s -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
****** processing amazon-cloudwatch-agent ******
Got Home directory: /root
I! Set home dir Linux: /root
I! SDKRegionWithCredsMap region: ap-northeast-1
Successfully fetched the config and saved in /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_config.json.tmp
Start configuration validation...
2023/06/17 07:50:58 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/file_config.json.tmp ...
2023/06/17 07:50:58 I! Valid Json input schema.
I! Detecting run_as_user...
Got Home directory: /root
I! Set home dir Linux: /root
I! SDKRegionWithCredsMap region: ap-northeast-1
No csm configuration found.
Under path : /logs/ | Info : Got hostname ip-192-168-1-170.ap-northeast-1.compute.internal as log_stream_name
Configuration validation first phase succeeded
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -schematest -config /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml
Configuration validation second phase succeeded
Configuration validation succeeded
amazon-cloudwatch-agent has already been stopped
Created symlink from /etc/systemd/system/multi-user.target.wants/amazon-cloudwatch-agent.service to /etc/systemd/system/amazon-cloudwatch-agent.service.
The agent's status can now also be checked through systemd.
[root@ip-192-168-1-170 .aws]# systemctl status amazon-cloudwatch-agent
● amazon-cloudwatch-agent.service - Amazon CloudWatch Agent
Loaded: loaded (/etc/systemd/system/amazon-cloudwatch-agent.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2023-06-17 07:50:59 UTC; 37s ago
Main PID: 15965 (amazon-cloudwat)
CGroup: /system.slice/amazon-cloudwatch-agent.service
└─15965 /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent -config /opt/aws/amazon-cloudwatch-agent/...
Jun 17 07:50:59 ip-192-168-1-170.ap-northeast-1.compute.internal systemd[1]: Started Amazon CloudWatch Agent.
Jun 17 07:50:59 ip-192-168-1-170.ap-northeast-1.compute.internal start-amazon-cloudwatch-agent[15965]: /opt/aws/amazon-cl...
Jun 17 07:50:59 ip-192-168-1-170.ap-northeast-1.compute.internal start-amazon-cloudwatch-agent[15965]: I! Detecting run_a...
Hint: Some lines were ellipsized, use -l to show in full.
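Verification
With the steps above, the on-premises server can now be managed in CloudWatch. Let's look at Logs first. A Log Group named messages has been created.
Opening it shows the host name.
The log contents are being forwarded, as shown here.
Metrics such as CPU are now visible as well.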
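The console check can be complemented from the on-premises host itself. The sketch below only builds an AWS CLI command line that looks up the messages log group through the logs VPC Endpoint. It assumes the AWS CLI installed earlier, its global `--endpoint-url` option, and that the IAM User's policy allows `logs:DescribeLogGroups`; the helper name is hypothetical, and nothing is executed by the sketch.

```python
import shlex


def describe_log_groups_command(endpoint_dns, region, prefix="messages"):
    """Build (but do not run) an AWS CLI call aimed at the logs VPC Endpoint."""
    return [
        "aws", "logs", "describe-log-groups",
        "--log-group-name-prefix", prefix,
        "--region", region,
        "--endpoint-url", f"https://{endpoint_dns}",
    ]


if __name__ == "__main__":
    command = describe_log_groups_command(
        "vpce-0f44df69ed16672dd-scw1wfv7.logs.ap-northeast-1.vpce.amazonaws.com",
        "ap-northeast-1",
    )
    # Print a shell-quoted command that can be pasted into a terminal.
    print(shlex.join(command))
```

Keeping the command as a list makes it easy to hand to `subprocess.run` later without worrying about shell quoting.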
References
endpoint_override
https://docs.aws.amazon.com/ja_jp/AmazonCloudWatch/latest/monitoring/CloudWatch-Agent-Configuration-File-Details.html