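Configuring Stackdriver notifications with Terraform

There wasn't much third-party documentation on this, so I'm writing down what I found.

Creating a notification channel (google_monitoring_notification_channel)

Email

The email case is already shown in the sample: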

resource "google_monitoring_notification_channel" "basic" {
  display_name = "Test Notification Channel"
  type         = "email"
  labels = {
    email_address = "fake_email@blahblah.com"
  }
}

Slack

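As far as I can tell, Slack is configured as follows.
However, I could not figure out how to obtain the OAuth token that goes into auth_token. Any information would be appreciated.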

resource "google_monitoring_notification_channel" "slack" {
  display_name = "Slack"
  type         = "slack"

  labels = {
    auth_token   = "SECRET"
    channel_name = "#channel"
  }
}

Setting it up manually worked when I followed the documentation at https://cloud.google.com/monitoring/support/notification-options#slack.
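
As far as I know, newer versions of the Google provider expose a sensitive_labels block on this resource, so the token does not have to sit in labels as plain text. Below is a minimal sketch, assuming such a provider version and Terraform 0.14 or later (for sensitive = true). The variable name slack_auth_token and the resource name slack_sensitive are hypothetical, and how to obtain the token itself is still an open question.

```hcl
# Hypothetical variable; supply the value via TF_VAR_slack_auth_token
# or a tfvars file that is kept out of version control.
variable "slack_auth_token" {
  type      = string
  sensitive = true
}

resource "google_monitoring_notification_channel" "slack_sensitive" {
  display_name = "Slack"
  type         = "slack"

  labels = {
    channel_name = "#channel"
  }

  # Assumes a provider version that supports sensitive_labels.
  sensitive_labels {
    auth_token = var.slack_auth_token
  }
}
```

Keeping the token in a variable also keeps it out of the .tf files, although it will still end up in the state file.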

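Creating an alert policy (google_monitoring_alert_policy)

The easiest way to get a starting point is to create the policy by hand at

https://app.google.stackdriver.com/policies/create?project=PROJECT_ID

and then either download its configuration with the JSON button in the upper right (update: the JSON button was removed as of 2019/10), or fetch it with

$ gcloud alpha monitoring policies list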


For example, for Memorystore for Redis I was able to download the following JSON.

{
  "combiner": "OR",
  "conditions": [
    {
      "conditionThreshold": {
        "aggregations": [
          {
            "alignmentPeriod": "60s",
            "crossSeriesReducer": "REDUCE_MEAN",
            "perSeriesAligner": "ALIGN_MEAN"
          }
        ],
        "comparison": "COMPARISON_GT",
        "duration": "0s",
        "filter": "resource.type=\"redis_instance\" AND metric.type=\"redis.googleapis.com/stats/memory/usage_ratio\"",
        "thresholdValue": 50,
        "trigger": {
          "count": 1
        }
      },
      "displayName": "High Memory Usage Ratio"
    }
  ],
  "displayName": "High Redis Memory Usage",
  "enabled": true,
  "notificationChannels": [
    "projects/PROJECT_ID/notificationChannels/CHANNEL_ID"
  ]
}

Translating this into Terraform, I ended up with the following configuration.

resource "google_monitoring_alert_policy" "redis_memory_usage" {
  display_name = "High Redis Memory Usage"
  combiner     = "OR"

  conditions {
    display_name = "High Memory Usage Ratio"

    condition_threshold {
      # Terraform 0.12+ syntax: aggregations and trigger are nested blocks, not attributes.
      aggregations {
        alignment_period     = "60s"
        cross_series_reducer = "REDUCE_MEAN"
        per_series_aligner   = "ALIGN_MEAN"
      }

      comparison      = "COMPARISON_GT"
      duration        = "60s"
      filter          = "resource.type=\"redis_instance\" AND metric.type=\"redis.googleapis.com/stats/memory/usage_ratio\""
      threshold_value = 50

      trigger {
        count = 1
      }
    }
  }

  # Full resource name of the notification channel to alert.
  notification_channels = ["projects/PROJECT_ID/notificationChannels/CHANNEL_ID"]
}
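
If the notification channel is managed in the same Terraform configuration, the policy can probably refer to it directly instead of hard-coding the path. A sketch, assuming the basic email channel defined earlier, Terraform 0.12+ syntax, and that the channel's exported name attribute has the projects/.../notificationChannels/... form; the resource name redis_memory_usage_ref is made up for illustration.

```hcl
resource "google_monitoring_alert_policy" "redis_memory_usage_ref" {
  display_name = "High Redis Memory Usage"
  combiner     = "OR"

  conditions {
    display_name = "High Memory Usage Ratio"

    condition_threshold {
      filter          = "resource.type=\"redis_instance\" AND metric.type=\"redis.googleapis.com/stats/memory/usage_ratio\""
      comparison      = "COMPARISON_GT"
      duration        = "60s"
      threshold_value = 50

      aggregations {
        alignment_period     = "60s"
        cross_series_reducer = "REDUCE_MEAN"
        per_series_aligner   = "ALIGN_MEAN"
      }
    }
  }

  # Reference the channel resource rather than pasting its ID by hand.
  notification_channels = [
    google_monitoring_notification_channel.basic.name,
  ]
}
```

This way Terraform creates the channel before the policy, and the channel ID never has to be copied out of the console.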

Alerting on Stackdriver Logging

Creating a log-based metric (google_logging_metric)

Create the metric in the Web UI, then fetch its parameters with gcloud logging metrics describe to get a good idea of what to write.

$ gcloud logging metrics describe warning_k8s_pod_logging
createTime: '2019-05-09T13:07:56.249404155Z'
filter: resource.type="k8s_pod" AND severity=WARNING
metricDescriptor:
  metricKind: DELTA
  name: projects/PROJECT_ID/metricDescriptors/logging.googleapis.com/user/warning_k8s_pod_logging
  type: logging.googleapis.com/user/warning_k8s_pod_logging
  unit: '1'
  valueType: INT64
name: warning_k8s_pod_logging
updateTime: '2019-05-09T13:07:56.249404155Z'

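The key point is that metricKind and valueType, which I never specified in the Web UI, were filled in automatically. Terraform requires both attributes, so you have to copy these values into your configuration (I got stuck on this for quite a while).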

resource "google_logging_metric" "warning_k8s_pod_logging" {
  name   = "warning_k8s_pod_logging"
  filter = "resource.type=\"k8s_pod\" AND severity=WARNING"

  metric_descriptor {
    metric_kind = "DELTA" # same value the Web UI filled in by default
    value_type  = "INT64" # same value the Web UI filled in by default
  }
}

Note that this creates a metric that counts warnings in k8s pod logs. It looks like it could also be written like this.

resource "google_logging_metric" "warning_k8s_pod_logging" {
  name   = "warning_k8s_pod_logging"
  filter = "resource.type=\"k8s_pod\""
  label_extractors = {
    severity = "REGEXP_EXTRACT(severity, \"(WARNING)\")"
  }
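  metric_descriptor {
    # Terraform 0.12+ syntax: labels is a nested block.
    labels {
      key = "severity"
    }
    metric_kind = "DELTA" # same value the Web UI filled in by default
    value_type  = "INT64" # same value the Web UI filled in by default
  }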
}

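Creating an alert policy for the log-based metric

This is basically the same as alert policies for other services: the easiest approach is to create the policy in the Web UI and download its JSON.

For reference, here is the configuration I ended up with to send a notification when a warning appears in k8s pod logs.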

resource "google_monitoring_alert_policy" "warning_k8s_pod_logging" {
  display_name = "WARNING K8s Pod Logging"
  combiner     = "OR"

  conditions {
    display_name = "WARNING K8s Pod Logging"

    condition_threshold {
      # Terraform 0.12+ syntax: aggregations and trigger are nested blocks, not attributes.
      aggregations {
        alignment_period     = "60s"
        cross_series_reducer = "REDUCE_SUM"
        per_series_aligner   = "ALIGN_SUM"
      }

      comparison      = "COMPARISON_GT"
      duration        = "60s"
      filter          = "resource.type=\"k8s_pod\" AND metric.type=\"logging.googleapis.com/user/warning_k8s_pod_logging\""
      threshold_value = 0

      trigger {
        count = 1
      }
    }
  }

  notification_channels = ["projects/PROJECT_ID/notificationChannels/CHANNEL_ID"]

  # The metric must exist before the policy that filters on it (unquoted reference, Terraform 0.12+).
  depends_on = [google_logging_metric.warning_k8s_pod_logging]
}
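
Another option that I believe works is to build the filter from the metric resource's name attribute, which gives Terraform an implicit dependency and makes depends_on unnecessary. A sketch, assuming Terraform 0.12+ and the google_logging_metric.warning_k8s_pod_logging resource defined above; the resource name warning_k8s_pod_logging_ref is hypothetical.

```hcl
resource "google_monitoring_alert_policy" "warning_k8s_pod_logging_ref" {
  display_name = "WARNING K8s Pod Logging"
  combiner     = "OR"

  conditions {
    display_name = "WARNING K8s Pod Logging"

    condition_threshold {
      # Interpolating the metric name ties this policy to the metric resource,
      # so Terraform orders the two without an explicit depends_on.
      filter          = "resource.type=\"k8s_pod\" AND metric.type=\"logging.googleapis.com/user/${google_logging_metric.warning_k8s_pod_logging.name}\""
      comparison      = "COMPARISON_GT"
      duration        = "60s"
      threshold_value = 0

      aggregations {
        alignment_period     = "60s"
        cross_series_reducer = "REDUCE_SUM"
        per_series_aligner   = "ALIGN_SUM"
      }

      trigger {
        count = 1
      }
    }
  }

  notification_channels = ["projects/PROJECT_ID/notificationChannels/CHANNEL_ID"]
}
```

Renaming the metric then only requires a change in one place, since the filter follows the resource.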