More than 1 year has passed since last update.

Prometheus: Telegrafを利用してInfluxDBにデータを格納する+Grafanaで可視化する

Posted at 2023-12-03

実施環境：

Linux

[root@testhost ~]# uname -a
Linux testhost 4.18.0-448.el8.x86_64 #1 SMP Wed Jan 18 15:02:46 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
[root@testhost ~]# cat /etc/redhat-release
CentOS Stream release 8
[root@testhost ~]# yum list installed | grep telegraf
telegraf.x86_64                                    1.28.3-1                                                   @@commandline
[root@testhost ~]# yum list installed | grep influxdb
influxdb2.x86_64                                   2.7.3-1                                                    @@commandline
[root@testhost ~]# yum list installed | grep grafana
grafana-enterprise.x86_64                          10.0.3-1                                                   @@commandline

node_exporter：1.6.1.linux-amd64

0. 概要

システムのパフォーマンス情報を監視できるソフトウェアには、様々なものが存在します。
その1つが Prometheus で、クラウド環境に向いたソフトウェアとして注目を集めています。

Promtheus

しかしこの Prometheus は、長期のデータ保存にはあまり向いていません。
取得したデータを長期保存したい場合は、別のデータベースにデータを格納する必要があります。

データベースにも様々なものがありますが、その1つが時系列データに特化したデータベースである InfluxDB です。
今回はこの InfluxDB に、 Prometheus から取り込んだデータを格納しようと思います。

InfluxDB

ところが、 Prometheus と InfluxDB の接続は、古いバージョンではサポートされていたのですが、最新のバージョンではサポートされなくなってしまっています。

ではあきらめるしかないのかというと、実はいくつかの方法が残されています。
そのうちの1つが、 InfluxDB 向けに作成されたメトリクス収集エージェントである Telegraf を用いる方法です。

Telegraf

Telegraf はデータを取得しにいく先として Prometheus をサポートしているため、 Telegraf を経由することで間接的に Prometheus と InfluxDB をつなげることができてしまうのです。

今回は Telegraf を使って Prometheus で取得した情報を InfluxDB に格納し、ついでに格納したデータを Grafana で可視化することを目指します。

Grafana

1. 事前準備

Prometheus のインストールについては、以下の過去記事を参照してください。
なお、必要なのは「 node_exporter 」だけで、他のコンポーネントは不要です。

Linux: システム監視ソフト「Prometheus」を無料インストールしてみた

InfluxDB 、 Grafana のインストールについては、以下の過去記事を参照してください。

Linux: データ可視化ソフト「Grafana」を無料インストールしてみた＋「Prometheus」と連携させてみた

Linux: 時系列データに特化したDB、「InfluxDB」を無料インストールしてみた

Telegraf のインストールと InfluxDB との連携については、以下の過去記事を参照してください。

InfluxDB: Telegrafからデータを取得する

InfluxDB と Grafana との連携については、以下の過去記事を参照してください。

InfluxDB: データをGrafanaで可視化してみる

なお、本記事の前提として、全てのソフトウェアは同一のサーバ上で動作しているものとします。
また、設定についてはこれらの過去記事に準拠するものとします。

2. 設定

今回必要となるのは、 Telegraf の設定変更です。
設定変更前に、 Telegraf はいったん停止させておいてください。

Telegraf の設定ファイル /etc/telegraf/telegraf.conf の中で、 Prometheus に関する箇所は以下の通りです。

telegraf.conf

# # Read metrics from one or many prometheus clients
# [[inputs.prometheus]]
#   ## An array of urls to scrape metrics from.
#   urls = ["http://localhost:9100/metrics"]
#
#   ## Metric version controls the mapping from Prometheus metrics into Telegraf metrics.
#   ## See "Metric Format Configuration" in plugins/inputs/prometheus/README.md for details.
#   ## Valid options: 1, 2
#   # metric_version = 1
#
#   ## Url tag name (tag containing scrapped url. optional, default is "url")
#   # url_tag = "url"
#
#   ## Whether the timestamp of the scraped metrics will be ignored.
#   ## If set to true, the gather time will be used.
#   # ignore_timestamp = false
#
#   ## An array of Kubernetes services to scrape metrics from.
#   # kubernetes_services = ["http://my-service-dns.my-namespace:9100/metrics"]
#
#   ## Kubernetes config file to create client from.
#   # kube_config = "/path/to/kubernetes.config"
#
#   ## Scrape Pods
#   ## Enable scraping of k8s pods. Further settings as to which pods to scape
#   ## are determiend by the 'method' option below. When enabled, the default is
#   ## to use annotations to determine whether to scrape or not.
#   # monitor_kubernetes_pods = false
#
#   ## Scrape Pods Method
#   ## annotations: default, looks for specific pod annotations documented below
#   ## settings: only look for pods matching the settings provided, not
#   ##   annotations
#   ## settings+annotations: looks at pods that match annotations using the user
#   ##   defined settings
#   # monitor_kubernetes_pods_method = "annotations"
#
#   ## Scrape Pods 'annotations' method options
#   ## If set method is set to 'annotations' or 'settings+annotations', these
#   ## annotation flags are looked for:
#   ## - prometheus.io/scrape: Required to enable scraping for this pod. Can also
#   ##     use 'prometheus.io/scrape=false' annotation to opt-out entirely.
#   ## - prometheus.io/scheme: If the metrics endpoint is secured then you will
#   ##     need to set this to 'https' & most likely set the tls config
#   ## - prometheus.io/path: If the metrics path is not /metrics, define it with
#   ##     this annotation
#   ## - prometheus.io/port: If port is not 9102 use this annotation
#
#   ## Scrape Pods 'settings' method options
#   ## When using 'settings' or 'settings+annotations', the default values for
#   ## annotations can be modified using with the following options:
#   # monitor_kubernetes_pods_scheme = "http"
#   # monitor_kubernetes_pods_port = "9102"
#   # monitor_kubernetes_pods_path = "/metrics"
#
#   ## Get the list of pods to scrape with either the scope of
#   ## - cluster: the kubernetes watch api (default, no need to specify)
#   ## - node: the local cadvisor api; for scalability. Note that the config node_ip or the environment variable NODE_IP must be set to the host IP.
#   # pod_scrape_scope = "cluster"
#
#   ## Only for node scrape scope: node IP of the node that telegraf is running on.
#   ## Either this config or the environment variable NODE_IP must be set.
#   # node_ip = "10.180.1.1"
#
#   ## Only for node scrape scope: interval in seconds for how often to get updated pod list for scraping.
#   ## Default is 60 seconds.
#   # pod_scrape_interval = 60
#
#   ## Restricts Kubernetes monitoring to a single namespace
#   ##   ex: monitor_kubernetes_pods_namespace = "default"
#   # monitor_kubernetes_pods_namespace = ""
#   ## The name of the label for the pod that is being scraped.
#   ## Default is 'namespace' but this can conflict with metrics that have the label 'namespace'
#   # pod_namespace_label_name = "namespace"
#   # label selector to target pods which have the label
#   # kubernetes_label_selector = "env=dev,app=nginx"
#   # field selector to target pods
#   # eg. To scrape pods on a specific node
#   # kubernetes_field_selector = "spec.nodeName=$HOSTNAME"
#
#   ## Filter which pod annotations and labels will be added to metric tags
#   #
#   # pod_annotation_include = ["annotation-key-1"]
#   # pod_annotation_exclude = ["exclude-me"]
#   # pod_label_include = ["label-key-1"]
#   # pod_label_exclude = ["exclude-me"]
#
#   # cache refresh interval to set the interval for re-sync of pods list.
#   # Default is 60 minutes.
#   # cache_refresh_interval = 60
#
#   ## Scrape Services available in Consul Catalog
#   # [inputs.prometheus.consul]
#   #   enabled = true
#   #   agent = "http://localhost:8500"
#   #   query_interval = "5m"
#
#   #   [[inputs.prometheus.consul.query]]
#   #     name = "a service name"
#   #     tag = "a service tag"
#   #     url = 'http://{{if ne .ServiceAddress ""}}{{.ServiceAddress}}{{else}}{{.Address}}{{end}}:{{.ServicePort}}/{{with .ServiceMeta.metrics_path}}{{.}}{{else}}metrics{{end}}'
#   #     [inputs.prometheus.consul.query.tags]
#   #       host = "{{.Node}}"
#
#   ## Use bearer token for authorization. ('bearer_token' takes priority)
#   # bearer_token = "/path/to/bearer/token"
#   ## OR
#   # bearer_token_string = "abc_123"
#
#   ## HTTP Basic Authentication username and password. ('bearer_token' and
#   ## 'bearer_token_string' take priority)
#   # username = ""
#   # password = ""
#
#   ## Optional custom HTTP headers
#   # http_headers = {"X-Special-Header" = "Special-Value"}
#
#   ## Specify timeout duration for slower prometheus clients (default is 5s)
#   # timeout = "5s"
#
#   ## deprecated in 1.26; use the timeout option
#   # response_timeout = "5s"
#
#   ## HTTP Proxy support
#   # use_system_proxy = false
#   # http_proxy_url = ""
#
#   ## Optional TLS Config
#   # tls_ca = /path/to/cafile
#   # tls_cert = /path/to/certfile
#   # tls_key = /path/to/keyfile
#
#   ## Use TLS but skip chain & host verification
#   # insecure_skip_verify = false
#
#   ## Use the given name as the SNI server name on each URL
#   # tls_server_name = "myhost.example.org"
#
#   ## TLS renegotiation method, choose from "never", "once", "freely"
#   # tls_renegotiation_method = "never"
#
#   ## Enable/disable TLS
#   ## Set to true/false to enforce TLS being enabled/disabled. If not set,
#   ## enable TLS only if any of the other options are specified.
#   # tls_enable = true
#
#   ## Control pod scraping based on pod namespace annotations
#   ## Pass and drop here act like tagpass and tagdrop, but instead
#   ## of filtering metrics they filters pod candidates for scraping
#   #[inputs.prometheus.namespace_annotation_pass]
#   # annotation_key = ["value1", "value2"]
#   #[inputs.prometheus.namespace_annotation_drop]
#   # some_annotation_key = ["dont-scrape"]

非常に長いですが、大事なのは最初の方の2行だけです。
まずは [[inputs.prometheus]] のコメントアウトを外します。
次に urls のコメントアウトを外し、 Prometheus ( node_exporter )の URL を指定します。
今回はすべてのソフトウェアが同じサーバ上で動作しているので、デフォルト設定のままで大丈夫です。

telegraf.conf

# # Read metrics from one or many prometheus clients
[[inputs.prometheus]]
#   ## An array of urls to scrape metrics from.
urls = ["http://localhost:9100/metrics"]
#

設定を変更したら、 Telegraf を再び起動します。

3. 確認

InfluxDB の Web 画面から以下の条件で検索して、 Prometheus で取得した情報が格納されていることを確認します。

FROM : test_bucket
Filter : _measurement = node_cpu_seconds_total
検索期間 : Past 5m

情報がきちんと格納されていることが確認できました。

InfluxDB と Grafana が連携されていれば、同じ情報を Grafana からも確認できます。
「 SCRIPT EDITOR 」を押してクエリ文を確認し、 Grafana の Web 画面にコピーしてみましょう。
Grafana 上での検索期間は上と同じになるように Last 5 minutes を選択します。

クエリ文

from(bucket: "test_bucket")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "node_cpu_seconds_total")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)
  |> yield(name: "mean")

Grafana からも情報が確認できました。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up