Intro
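Azure Monitor for Containers, the container monitoring offering, ships with the following four workbooks out of the box.
- Disk capacity
- Disk IO
- Kubelet
- Network
In this article I mine these ready-made workbooks for Kusto queries to learn from. The default workbooks can be opened at any time on any cluster, so they work very well as a cheat sheet. There is no need at all to memorize every detail of the Kusto queries shown below.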
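What is Azure Monitor Workbooks?
Workbooks let you combine multiple log query results and charts into a single report. Reports can be shared within a team, so even when a new member does not know how to query the logs, the workbook passes on the monitoring know-how as well, along the lines of "so this is the telemetry to look at to find out X".
For example, the following is the Network workbook screen.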
Minimum Azure Monitor basics
Query-writing basics
Contents
- Basic queries
- Schema overview
- Filtering, e.g. | where hogehoge == "hugahuga"
- Sorting, e.g. | sort
- Grouping and aggregation, e.g. | summarize
- Charts
- Saving and loading queries
- Selecting and computing columns, e.g. | project column1, column2, column3
- Defining additional columns, e.g. | extend NewColumn1=substring(OriginalColumn1, 0, 5)
- Grouping by a time column (binning), e.g. | summarize avg(CounterValue) by bin(TimeGenerated, 1h)
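To make the last item concrete, here is a minimal Python sketch of what `summarize avg(CounterValue) by bin(TimeGenerated, 1h)` computes: each timestamp is rounded down to the start of its hour, and the values in each hour are averaged. The function names and sample rows are hypothetical and only illustrate the idea; this is not how Kusto is implemented.

```python
from collections import defaultdict
from datetime import datetime


def bin_hour(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its hour, like bin(TimeGenerated, 1h)."""
    return ts.replace(minute=0, second=0, microsecond=0)


def avg_by_hour(rows):
    """rows: iterable of (TimeGenerated, CounterValue). Returns {hour_start: average}."""
    groups = defaultdict(list)
    for ts, value in rows:
        groups[bin_hour(ts)].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in groups.items()}


# Hypothetical sample rows, not real telemetry.
sample = [
    (datetime(2020, 1, 1, 9, 5), 10.0),
    (datetime(2020, 1, 1, 9, 55), 20.0),
    (datetime(2020, 1, 1, 10, 1), 40.0),
]
hourly = avg_by_hour(sample)
```

The two 09:xx rows land in the same bucket and are averaged together, while the 10:01 row forms its own bucket.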
About the collected data
Telemetry such as disk capacity, disk IO, and network is collected by the InfluxData Telegraf agent and can be queried as custom metrics in the InsightsMetrics table.
(Reference: how the InfluxData Telegraf agent sends metrics https://docs.microsoft.com/ja-jp/azure/azure-monitor/platform/collect-custom-metrics-linux-telegraf)
The individual values are stored in the Tags property of each InsightsMetrics record, and filtering on Name and Namespace retrieves a specific metric. I found the details buried deep in GitHub, so I quote them here. Each entry links to the InfluxData Telegraf documentation.
- Disk metrics
| Name | Namespace | Description |
|---|---|---|
| used | container.azm.ms/disk | more info |
| free | container.azm.ms/disk | more info |
| used_percent | container.azm.ms/disk | more info |
- Disk IO metrics
| Name | Namespace | Description |
|---|---|---|
| reads | container.azm.ms/diskio | more info |
| read_bytes | container.azm.ms/diskio | more info |
| read_time | container.azm.ms/diskio | more info |
| writes | container.azm.ms/diskio | more info |
| write_bytes | container.azm.ms/diskio | more info |
| write_time | container.azm.ms/diskio | more info |
| io_time | container.azm.ms/diskio | more info |
| iops_in_progress | container.azm.ms/diskio | more info |
- Host network metrics
| Name | Namespace | Description |
|---|---|---|
| bytes_sent | container.azm.ms/net | more info |
| bytes_received | container.azm.ms/net | more info |
| err_in | container.azm.ms/net | more info |
| err_out | container.azm.ms/net | more info |
- Kubelet metrics
| Name | Namespace | Description |
|---|---|---|
| kubelet_docker_operations | container.azm.ms/prometheus | Cumulative number of Docker operations by operation type |
| kubelet_docker_operations_errors | container.azm.ms/prometheus | Cumulative number of Docker operation errors by operation type |
(Source: https://github.com/microsoft/OMS-docker/blob/vishwa/june19agentrel/docs/InsightsMetrics.md)
The other container records are described in detail in the following document.
https://docs.microsoft.com/ja-jp/azure/azure-monitor/insights/container-insights-log-search
- Host and container performance: Perf
- Container inventory: ContainerInventory
- Container logs: ContainerLog
- Container node inventory: ContainerNodeInventory
- Inventory of pods in a Kubernetes cluster: KubePodInventory
- Inventory of nodes in a Kubernetes cluster: KubeNodeInventory
- Kubernetes events: KubeEvents
- Services in a Kubernetes cluster: KubeServices
- Performance metrics for the node part of a Kubernetes cluster: Perf | where ObjectName == "K8SNode"
- Performance metrics for the container part of a Kubernetes cluster: Perf | where ObjectName == "K8SContainer"
- Custom metrics: InsightsMetrics
Prometheus support
Azure Monitor for containers can collect Prometheus metrics without a Prometheus server. Unfortunately, the workbooks do not contain queries for Prometheus metrics, so here I only point to the documentation on configuration and querying.
Configuration
https://docs.microsoft.com/ja-jp/azure/azure-monitor/insights/container-insights-prometheus-integration#query-prometheus-metrics-data
All you need to do is configure metric collection in a ConfigMap.
Querying
Query Prometheus metrics data
Filter InsightsMetrics on the prometheus namespace; as with the metrics above, the details are stored as JSON in the Tags property.
InsightsMetrics
| where Namespace == "prometheus"
| extend tags=parse_json(Tags)
| summarize count() by Name
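As a rough illustration of what the query above does, the following Python sketch filters rows by Namespace, parses the JSON held in Tags, and counts rows per Name. The sample rows, host names, and helper names are hypothetical; only the column names (Namespace, Name, Tags) and the `operation_type` / `hostName` tag keys come from the queries in this article.

```python
import json
from collections import Counter

# Hypothetical InsightsMetrics-like rows; Tags is a JSON string, as in the real table.
rows = [
    {"Namespace": "prometheus", "Name": "kubelet_docker_operations",
     "Tags": '{"operation_type": "create_container", "hostName": "node-0"}'},
    {"Namespace": "prometheus", "Name": "kubelet_docker_operations",
     "Tags": '{"operation_type": "list_images", "hostName": "node-1"}'},
    {"Namespace": "prometheus", "Name": "kubelet_docker_operations_errors",
     "Tags": '{"operation_type": "list_images", "hostName": "node-1"}'},
    {"Namespace": "container.azm.ms/disk", "Name": "used_percent",
     "Tags": '{"device": "sda1", "hostName": "node-0"}'},
]


def parse_tags(row):
    """Equivalent in spirit to `extend tags=parse_json(Tags)`."""
    return json.loads(row["Tags"])


def count_by_name(rows, namespace="prometheus"):
    """Equivalent in spirit to `where Namespace == ... | summarize count() by Name`."""
    counts = Counter(row["Name"] for row in rows if row["Namespace"] == namespace)
    return dict(counts)


counts = count_by_name(rows)
first_tags = parse_tags(rows[0])
```

Once Tags is parsed, individual labels such as `first_tags["operation_type"]` can be read like ordinary dictionary entries, which is what the workbook queries below do with `tostring(Tags.hostName)`.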
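Opening the workbooks
Select your Azure Kubernetes Service cluster, then in the left menu choose Insights, and use the View Workbooks drop-down at the top right.

Now let's go through each workbook in turn.
Disk Capacity Workbooks
You can treat the following six lines as boilerplate that the workbook needs to render its charts (they parse Tags and apply the selected node and device filters) and skip over them. They are common to every chart in the Disk Capacity workbook.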
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.device))
| extend NodeDisk = strcat(HostName, Device)
| where "*" in ('*') or HostName in ('*')
| where "*" in ('*') or Device in ('*')
| where NodeDisk in (selectedStateDisks) or '*' in (selectedStateDisks);
Points to note
- where Origin == 'container.azm.ms/telegraf'
- For disk capacity metrics: where Namespace == 'disk' or Namespace =~ 'container.azm.ms/disk'
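The `"*" in (...) or X in (...)` pattern in the boilerplate lines is easier to read once you see it as a wildcard filter: if the selection contains `*`, every row passes; otherwise only the listed values pass. The following Python sketch mirrors that logic and the way NodeDisk is built with strcat. The function names and host names are hypothetical.

```python
def node_disk(host_name: str, device: str) -> str:
    """Mirror of strcat(HostName, strcat('/dev/', device))."""
    return host_name + "/dev/" + device


def is_selected(value: str, selected) -> bool:
    """Mirror of `value in (selected) or '*' in (selected)`."""
    return "*" in selected or value in selected


# Hypothetical (hostName, device) pairs taken from Tags.
pairs = [("aks-node-0", "sda1"), ("aks-node-1", "sdb1")]

# With the wildcard, nothing is filtered out.
all_disks = [node_disk(h, d) for h, d in pairs if is_selected(node_disk(h, d), ["*"])]

# With an explicit selection, only the matching disk remains.
one_disk = [node_disk(h, d) for h, d in pairs
            if is_selected(node_disk(h, d), ["aks-node-1/dev/sdb1"])]
```

In the queries in this article the selection is always `*`, so these filter lines let every row through.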
Top 3 Disks by Used Disk %
let selectedStateDisks = dynamic(["*"]);
let data = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'disk' or Namespace =~ 'container.azm.ms/disk'
| where Name == 'used_percent'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.device))
| extend NodeDisk = strcat(HostName, Device)
| where "*" in ('*') or HostName in ('*')
| where "*" in ('*') or Device in ('*')
| where NodeDisk in (selectedStateDisks) or '*' in (selectedStateDisks);
let mostUsedDisks = data
| top-nested 3 of NodeDisk by MaxVal = max(Val);
data
| where NodeDisk in (mostUsedDisks)
| make-series ['Used Disk %'] = max(Val) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by NodeDisk
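The core of the query above is the `top-nested 3 of NodeDisk by MaxVal = max(Val)` step: take the maximum used_percent per disk, then keep the three disks with the highest maximum. A minimal Python sketch of that selection, with hypothetical disk names and values, might look like this.

```python
def top_disks(rows, n=3):
    """rows: iterable of (NodeDisk, Val). Returns the n disks with the highest max(Val)."""
    max_by_disk = {}
    for disk, val in rows:
        max_by_disk[disk] = max(val, max_by_disk.get(disk, val))
    # Sort disks by their maximum value, highest first, and keep the first n.
    return sorted(max_by_disk, key=max_by_disk.get, reverse=True)[:n]


# Hypothetical used_percent samples.
samples = [
    ("node-0/dev/sda1", 41.0), ("node-0/dev/sda1", 45.0),
    ("node-1/dev/sda1", 80.0),
    ("node-2/dev/sda1", 12.0),
    ("node-2/dev/sdb1", 63.0), ("node-2/dev/sdb1", 60.0),
]
most_used = top_disks(samples)
```

The final `where NodeDisk in (mostUsedDisks)` in the query then keeps only the time series for those disks.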
Disk Capacity Overview
Used Disk %
let selectedStateDisks = dynamic(["*"]);
let usedPercent = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'disk' or Namespace =~ 'container.azm.ms/disk'
| where Name == 'used_percent'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.device))
| extend NodeDisk = strcat(HostName, Device)
| where "*" in ('*') or HostName in ('*')
| where "*" in ('*') or Device in ('*')
| where NodeDisk in (selectedStateDisks) or '*' in (selectedStateDisks);
let row = dynamic(
{
"Kind":"Unselected"});
let worstDiskAcrossNodes = usedPercent
| summarize UsedPercent = max(Val) by NodeDisk
| top 1 by UsedPercent desc;
usedPercent
| where (row.Kind == 'Unselected') or (row.Kind == 'Node' and row.Id == HostName) or (row.Kind == 'Device' and row.Id == NodeDisk)
| make-series ['Used Disk %'] = max(Val) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by NodeDisk
| where NodeDisk contains iff(row.Kind == 'Unselected', toscalar(worstDiskAcrossNodes
| project NodeDisk), '')
Free Disk Space (GiB)
let selectedStateDisks = dynamic(["*"]);
let data = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'disk' or Namespace =~ 'container.azm.ms/disk'
| where Name == 'used_percent' or Name == 'free'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.device))
| extend NodeDisk = strcat(HostName, Device)
| where "*" in ('*') or HostName in ('*')
| where "*" in ('*') or Device in ('*')
| where NodeDisk in (selectedStateDisks) or '*' in (selectedStateDisks);
let usedPercent = data
| where Name == 'used_percent';
let free = data
| where Name == 'free'
| extend Val = Val / 1073741824;
let row = dynamic(
{
"Kind":"Unselected"});
let worstDiskAcrossNodes = usedPercent
| summarize UsedPercent = max(Val) by NodeDisk
| top 1 by UsedPercent desc;
free
| where (row.Kind == 'Unselected') or (row.Kind == 'Node' and row.Id == HostName) or (row.Kind == 'Device' and row.Id == NodeDisk)
| make-series ['Free Disk Space'] = min(Val) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by NodeDisk
| where NodeDisk contains iff(row.Kind == 'Unselected', toscalar(worstDiskAcrossNodes
| project NodeDisk), '')
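Disk IO Workbooks
You can treat the following seven lines as boilerplate that the workbook needs to render its charts (they parse Tags, apply the node and device filters, and order the rows) and skip over them. They are common to every chart in the Disk IO workbook.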
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.name))
| extend NodeDisk = strcat(HostName, Device)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Device in ('*')
| order by NodeDisk asc, TimeGenerated asc
| serialize
Points to note
- where Origin == 'container.azm.ms/telegraf'
- For disk IO metrics: where Namespace == 'container.azm.ms/diskio'
Disk IO Overview
Read Bytes/sec
let bytesReadPerSec = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/diskio'
| where Name == 'read_bytes'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.name))
| extend NodeDisk = strcat(HostName, Device)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Device in ('*')
| order by NodeDisk asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(NodeDisk) != NodeDisk, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(NodeDisk) != NodeDisk, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1), iif(PrevVal == Val, 0.0, (Val - PrevVal) / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1)))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Device, Rate;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
bytesReadPerSec
| make-series Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Device
| extend Name = strcat(HostName, Device)
| project-away HostName, Device
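The `serialize` / `prev()` / `Rate` lines in the query above turn a cumulative counter into a per-second rate. Rows are ordered by disk and time, each row is compared with the previous row for the same disk, and a value lower than the previous one is treated as a counter reset. The Python sketch below follows the same steps under a few assumptions: the time difference is truncated to whole seconds as an approximation of `datetime_diff('Second', ...)`, pairs less than one second apart are skipped to avoid dividing by zero, and all names and samples are hypothetical.

```python
from datetime import datetime, timedelta


def counter_rates(samples):
    """samples: iterable of (key, timestamp, cumulative_value).

    Returns a list of (key, timestamp, rate_per_second), one per consecutive
    pair of samples that share a key.
    """
    rates = []
    prev = {}  # key -> (timestamp, value) of the previous row for that key
    for key, ts, val in sorted(samples, key=lambda s: (s[0], s[1])):
        if key in prev:
            prev_ts, prev_val = prev[key]
            seconds = int((ts - prev_ts).total_seconds())
            if seconds > 0:
                # Counter went down: assume it was reset and count from zero.
                delta = val if prev_val > val else val - prev_val
                rates.append((key, ts, delta / seconds))
        prev[key] = (ts, val)
    return rates


t0 = datetime(2020, 1, 1, 0, 0, 0)
disk = "node-0/dev/sda"
samples = [
    (disk, t0 + timedelta(seconds=120), 1600.0),  # unchanged counter
    (disk, t0, 1000.0),
    (disk, t0 + timedelta(seconds=60), 1600.0),   # +600 bytes in 60 s
    (disk, t0 + timedelta(seconds=180), 300.0),   # counter reset
]
rates = [rate for _, _, rate in counter_rates(samples)]
```

The first sample of each disk produces no rate, which corresponds to the `where isnotnull(PrevTimeGenerated)` filter in the query.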
Write Bytes/sec
let bytesWritePerSec = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/diskio'
| where Name == 'write_bytes'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.name))
| extend NodeDisk = strcat(HostName, Device)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Device in ('*')
| order by NodeDisk asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(NodeDisk) != NodeDisk, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(NodeDisk) != NodeDisk, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1), iif(PrevVal == Val, 0.0, (Val - PrevVal) / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1)))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Device, Rate;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
bytesWritePerSec
| make-series Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Device
| extend Name = strcat(HostName, Device)
| project-away HostName, Device
Total Bytes Read (10m intervals)
let bytesReadTotal = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/diskio'
| where Name == 'read_bytes'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.name))
| extend NodeDisk = strcat(HostName, Device)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Device in ('*')
| order by NodeDisk asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(NodeDisk) != NodeDisk, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(NodeDisk) != NodeDisk, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / 1, iif(PrevVal == Val, 0.0, (Val - PrevVal) / 1))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Device, Rate;
let sum = bytesReadTotal
| make-series Val = sum(Rate) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Device
| extend Name = strcat(HostName, Device)
| project-away HostName, Device;
sum
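The `make-series ... default=0 on TimeGenerated from ... to ... step 10m` line that closes each query buckets the values into fixed 10-minute slots and fills empty slots with the default. Here is a simplified Python sketch of that behaviour, assuming buckets are aligned to the start time and the end time is exclusive; the function name and sample points are hypothetical.

```python
import math
from datetime import datetime, timedelta


def make_series(points, start, end, step, agg=sum, default=0):
    """points: iterable of (timestamp, value). Returns one aggregated value per bucket."""
    bucket_count = math.ceil((end - start) / step)
    buckets = [[] for _ in range(bucket_count)]
    for ts, val in points:
        if start <= ts < end:
            # Integer division of two timedeltas gives the bucket index.
            buckets[(ts - start) // step].append(val)
    # Empty buckets get the default instead of being dropped.
    return [agg(b) if b else default for b in buckets]


start = datetime(2020, 1, 1, 0, 0)
points = [
    (start + timedelta(minutes=1), 5),
    (start + timedelta(minutes=9), 7),
    (start + timedelta(minutes=25), 1),
]
series = make_series(points, start, start + timedelta(minutes=30), timedelta(minutes=10))
```

Passing `agg=max` or `agg=min` instead of the default `sum` gives the other aggregations that appear in the workbook queries.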
Total Bytes Written (10m intervals)
let bytesWrittenTotal = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/diskio'
| where Name == 'write_bytes'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.name))
| extend NodeDisk = strcat(HostName, Device)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Device in ('*')
| order by NodeDisk asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(NodeDisk) != NodeDisk, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(NodeDisk) != NodeDisk, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / 1, iif(PrevVal == Val, 0.0, (Val - PrevVal) / 1))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Device, Rate;
let sum = bytesWrittenTotal
| make-series Val = sum(Rate) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Device
| extend Name = strcat(HostName, Device)
| project-away HostName, Device;
sum
Milliseconds Per Bytes Read
let msPerByteRead = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/diskio'
| where Name == 'read_bytes'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.name))
| extend NodeDisk = strcat(HostName, Device)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Device in ('*')
| order by NodeDisk asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(NodeDisk) != NodeDisk, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(NodeDisk) != NodeDisk, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, pow(Val / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1000 * 1), -1), pow((Val - PrevVal) / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1000 * 1), -1))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Device, Rate;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
msPerByteRead
| make-series Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Device
| extend Name = strcat(HostName, Device)
| project-away HostName, Device
Milliseconds Per Bytes Written
let msPerByteWritten = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/diskio'
| where Name == 'write_bytes'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.name))
| extend NodeDisk = strcat(HostName, Device)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Device in ('*')
| order by NodeDisk asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(NodeDisk) != NodeDisk, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(NodeDisk) != NodeDisk, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(TimeGenerated == PrevTimeGenerated or (Val - PrevVal) == 0, 0.0, iif(PrevVal > Val, pow(Val / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1000 * 1), -1), pow((Val - PrevVal) / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1000 * 1), -1)))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Device, Rate;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
msPerByteWritten
| make-series Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Device
| extend Name = strcat(HostName, Device)
| project-away HostName, Device
IOPS In Progress
let iops = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/diskio'
| where Name == 'iops_in_progress'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.name))
| extend NodeDisk = strcat(HostName, Device)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Device in ('*')
| project TimeGenerated, HostName, Device, Val;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
iops
| make-series Val = iif(avgOn != -1, avg(Val), iif(maxOn != -1, max(Val), min(Val))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Device
| extend Name = strcat(HostName, Device)
| project-away HostName, Device
% Disk Busy
let ioTime = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/diskio'
| where Name == 'io_time'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = strcat('/dev/', tostring(Tags.name))
| extend NodeDisk = strcat(HostName, Device)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Device in ('*')
| order by NodeDisk asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(NodeDisk) != NodeDisk, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(NodeDisk) != NodeDisk, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1000), (Val - PrevVal) / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1000)) * 100
| where isnotnull(Rate)
| project TimeGenerated, NodeDisk, Rate;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
ioTime
| make-series Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by NodeDisk
| extend Name = NodeDisk
| project-away NodeDisk
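The Rate expression in this query converts the cumulative io_time counter into a busy percentage: the increase in io_time is divided by the elapsed wall-clock time in milliseconds and multiplied by 100. The sketch below assumes io_time is reported in milliseconds, which is what the `* 1000` in the query suggests; the function name is hypothetical.

```python
def percent_busy(prev_io_ms: float, cur_io_ms: float, seconds: int) -> float:
    """Share of the interval (in %) during which the disk was busy."""
    # A lower current value is treated as a counter reset, as in the query.
    delta_ms = cur_io_ms if prev_io_ms > cur_io_ms else cur_io_ms - prev_io_ms
    return delta_ms / (seconds * 1000) * 100


half_busy = percent_busy(1000.0, 31000.0, 60)   # 30 s of IO in a 60 s interval
after_reset = percent_busy(5000.0, 600.0, 60)   # counter restarted from zero
```

A value close to 100 means the device had IO in flight for almost the entire interval.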
Kubelet Workbooks
Points to note
- where Origin == 'container.azm.ms/telegraf'
- For Kubelet metrics: where Namespace == 'container.azm.ms/prometheus'
Overview By Node
let data = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/prometheus'
| where Name == 'kubelet_docker_operations' or Name == 'kubelet_docker_operations_errors'
| extend Tags = todynamic(Tags)
| extend OperationType = tostring(Tags['operation_type']), HostName = tostring(Tags.hostName)
| where '*' in ('aks-agentpool-14531005-0','aks-agentpool-14531005-1','aks-agentpool-14531005-2') or HostName in ('aks-agentpool-14531005-0','aks-agentpool-14531005-1','aks-agentpool-14531005-2')
| where '*' in ('*') or OperationType in ('*')
| extend partitionKey = strcat(HostName, '/' , Name, '/', OperationType)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val, Val - PrevVal)
| where isnotnull(Rate)
| project TimeGenerated, Name, HostName, Rate;
let operationData = data
| where Name == 'kubelet_docker_operations';
let totalOperationsByNode = operationData
| summarize Rate = sum(Rate) by HostName
| project HostName, TotalOperations = Rate;
let totalOperationsByNodeSeries = operationData
| make-series TotalOperationsSeries = sum(Rate) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by HostName
| project-away TimeGenerated;
let errorData = data
| where Name == 'kubelet_docker_operations_errors';
let totalErrorsByNode = errorData
| summarize Rate = sum(Rate) by HostName
| project HostName, TotalErrors = Rate;
let totalErrorsByNodeSeries = errorData
| make-series TotalErrorsSeries = sum(Rate) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by HostName
| project-away TimeGenerated;
totalOperationsByNode
| join kind=inner
(
totalErrorsByNode
)
on HostName
| join kind = inner
(
totalOperationsByNodeSeries
)
on HostName
| join kind = inner
(
totalErrorsByNodeSeries
)
on HostName
| project-away HostName1, HostName2, HostName3
| extend TotalSuccessfulOperationsSeries = series_subtract(TotalOperationsSeries, TotalErrorsSeries)
| extend SuccessPercentage = round(iif(TotalOperations == 0, 1.0, 1 - (TotalErrors / TotalOperations)), 4), SuccessPercentageSeries = series_divide(TotalSuccessfulOperationsSeries, TotalOperationsSeries)
| extend SeriesOfEqualLength = range(1, array_length(TotalOperationsSeries), 1)
| extend SeriesOfOneHundo = series_multiply(series_divide(SeriesOfEqualLength, SeriesOfEqualLength), 100)
| extend SuccessfulOperationsEqualsTotalOperationsSeries = series_equals(TotalSuccessfulOperationsSeries, TotalOperationsSeries)
| extend SuccessPercentageSeries = array_iff(SuccessfulOperationsEqualsTotalOperationsSeries, SeriesOfOneHundo, SuccessPercentageSeries)
| project HostName, TotalOperations, TotalErrors, SuccessPercentage, SuccessPercentageSeries
| order by SuccessPercentage asc, HostName asc
| project-rename Node = HostName, ['Total Operations'] = TotalOperations, ['Total Errors'] = TotalErrors, ['Success %'] = SuccessPercentage, ['Success % Trend'] = SuccessPercentageSeries
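Most of the query above is plumbing for joining totals and series. The number shown in the Success % column comes from one small expression: `round(iif(TotalOperations == 0, 1.0, 1 - (TotalErrors / TotalOperations)), 4)`. A Python sketch of just that expression, with a hypothetical function name, looks like this.

```python
def success_ratio(total_operations: float, total_errors: float) -> float:
    """Fraction of operations that succeeded, rounded to 4 decimal places."""
    if total_operations == 0:
        # No operations at all is reported as fully successful.
        return 1.0
    return round(1 - total_errors / total_operations, 4)


no_traffic = success_ratio(0, 0)
some_errors = success_ratio(200, 5)
all_failed = success_ratio(10, 10)
```

The value is a ratio between 0 and 1 rather than a percentage, so presumably the workbook's column formatting is what renders it as a percent.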
Overview By Operation Type
let data = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/prometheus'
| where Name == 'kubelet_docker_operations' or Name == 'kubelet_docker_operations_errors'
| extend Tags = todynamic(Tags)
| extend OperationType = tostring(Tags['operation_type']), HostName = tostring(Tags.hostName)
| where '*' in ('aks-agentpool-14531005-0','aks-agentpool-14531005-1','aks-agentpool-14531005-2') or HostName in ('aks-agentpool-14531005-0','aks-agentpool-14531005-1','aks-agentpool-14531005-2')
| where '*' in ('*') or OperationType in ('*')
| extend partitionKey = strcat(HostName, '/' , Name, '/', OperationType)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val, Val - PrevVal)
| where isnotnull(Rate)
| project TimeGenerated, Name, OperationType, Rate;
let operationData = data
| where Name == 'kubelet_docker_operations';
let totalOperationsByType = operationData
| summarize Rate = sum(Rate) by OperationType
| project OperationType, TotalOperations = Rate;
let totalOperationsByTypeSeries = operationData
| make-series TotalOperationsByTypeSeries = sum(Rate) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by OperationType
| project-away TimeGenerated;
let errorsData = data
| where Name == 'kubelet_docker_operations_errors';
let totalErrorsByType = errorsData
| summarize Rate = sum(Rate) by OperationType
| project OperationType, TotalErrors = Rate;
let totalErrorsByTypeSeries = errorsData
| make-series TotalErrorsByTypeSeries = sum(Rate) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by OperationType
| project-away TimeGenerated;
let seriesLength = toscalar( totalErrorsByTypeSeries
| extend ArrayLength = array_length(TotalErrorsByTypeSeries)
| summarize Array_Length = max(ArrayLength) );
totalOperationsByType
| join kind=leftouter
(
totalErrorsByType
)
on OperationType
| project-away OperationType1
| extend TotalErrors = iif(isempty(TotalErrors), 0.0, TotalErrors)
| join kind=leftouter
(
totalErrorsByTypeSeries
)
on OperationType
| project-away OperationType1
| extend SeriesOfEqualLength = range(1, seriesLength, 1)
| extend SeriesOfZeroes = series_subtract(SeriesOfEqualLength, SeriesOfEqualLength)
| extend SeriesOfOneHundo = series_multiply(series_divide(SeriesOfEqualLength, SeriesOfEqualLength), 100)
| extend TotalErrorsByTypeSeries = iif(isempty(TotalErrorsByTypeSeries), SeriesOfZeroes, TotalErrorsByTypeSeries)
| join kind=leftouter
(
totalOperationsByTypeSeries
)
on OperationType
| project-away OperationType1
| extend TotalSuccessfulOperationsByTypeSeries = series_subtract(TotalOperationsByTypeSeries, TotalErrorsByTypeSeries)
| extend SuccessPercentage = round(iif(TotalOperations == 0, 1.0, 1 - (TotalErrors / TotalOperations)), 4), SuccessPercentageSeries = series_divide(TotalSuccessfulOperationsByTypeSeries, TotalOperationsByTypeSeries)
| extend SuccessfulOperationsEqualsTotalOperationsSeries = series_equals(TotalSuccessfulOperationsByTypeSeries, TotalOperationsByTypeSeries)
| extend SuccessPercentageSeries = array_iff(SuccessfulOperationsEqualsTotalOperationsSeries, SeriesOfOneHundo, SuccessPercentageSeries)
| project OperationType, TotalOperations, TotalErrors, SuccessPercentage, SuccessPercentageSeries
| order by SuccessPercentage asc, OperationType asc
| project-rename ['Operation Type'] = OperationType, ['Total Operations'] = TotalOperations, ['Total Errors'] = TotalErrors, ['Success %'] = SuccessPercentage, ['Success % Trend'] = SuccessPercentageSeries
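Network Workbooks
You can treat the following seven lines as boilerplate that the workbook needs to render its charts (they parse Tags, apply the node and interface filters, and order the rows) and skip over them. They are common to every chart in the Network workbook.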
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Interface = tostring(Tags.interface)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Interface in ('*')
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
Points to note
- where Origin == 'container.azm.ms/telegraf'
- For network metrics: where Namespace == 'container.azm.ms/net'
Network Overview
(Omitted because there are many queries.)
Sent Bytes/sec
let bytesSentPerSecond = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/net'
| where Name == 'bytes_sent'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Interface = tostring(Tags.interface)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Interface in ('*')
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1), (Val - PrevVal) / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Interface, Rate;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
bytesSentPerSecond
| make-series Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Interface
| extend Name = strcat(HostName, Interface)
| project-away HostName, Interface
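The three `indexof("Average", ...)` lines look odd at first. My reading, which is an assumption based only on the query text, is that "Average" is the substituted value of a workbook parameter, and that the lines check whether that value contains 'Average', 'Max', or 'Min' in order to pick the aggregation used in make-series. Python's `str.find` returns -1 when the substring is absent, just like `indexof`, so the selection can be sketched as follows (the function name is hypothetical).

```python
from statistics import mean


def pick_aggregation(selected: str):
    """Mirror of iif(avgOn != -1, avg(...), iif(maxOn != -1, max(...), min(...)))."""
    avg_on = selected.find("Average")
    max_on = selected.find("Max")
    if avg_on != -1:
        return mean
    if max_on != -1:
        return max
    return min


rates = [1.0, 2.0, 6.0]
as_average = pick_aggregation("Average")(rates)
as_max = pick_aggregation("Max")(rates)
as_min = pick_aggregation("Min")(rates)
```

With the value "Average" that appears throughout this article, every chart therefore plots the average rate per 10-minute bucket.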
Received Bytes/sec
let bytesReceivedPerSecond = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/net'
| where Name == 'bytes_recv'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Interface = tostring(Tags.interface)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Interface in ('*')
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1), (Val - PrevVal) / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Interface, Rate;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
bytesReceivedPerSecond
| make-series Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Interface
| extend Name = strcat(HostName, Interface)
| project-away HostName, Interface
Total Bytes Sent (by 10m intervals)
let bytesSentTotal = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/net'
| where Name == 'bytes_sent'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Interface = tostring(Tags.interface)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Interface in ('*')
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / 1, (Val - PrevVal) / 1)
| where isnotnull(Rate)
| project TimeGenerated, HostName, Interface, Rate;
bytesSentTotal
| make-series Val = sum(Rate) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Interface
| extend Name = strcat(HostName, '/', Interface)
| project-away HostName, Interface
Total Bytes Received (by 10m intervals)
let bytesReceivedTotal = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/net'
| where Name == 'bytes_recv'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Interface = tostring(Tags.interface)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Interface in ('*')
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / 1, (Val - PrevVal) / 1)
| where isnotnull(Rate)
| project TimeGenerated, HostName, Rate;
let sum = bytesReceivedTotal
| make-series Val = sum(Rate) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName
| extend Name = strcat(HostName, ':', 'Sum')
| project-away HostName;
sum
Errors Out/sec
Note: the screenshot is omitted because there were no records when it was captured.
let errorsOutPerSecond = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/net'
| where Name == 'err_out'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Interface = tostring(Tags.interface)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Interface in ('*')
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / datetime_diff('Second', TimeGenerated, PrevTimeGenerated), (Val - PrevVal) / datetime_diff('Second', TimeGenerated, PrevTimeGenerated))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Interface, Rate;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
errorsOutPerSecond
| make-series Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Interface
| extend Name = strcat(HostName, Interface)
| project-away HostName, Interface
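The per-second calculation in the middle of this query is easier to follow on a handful of rows. The sketch below runs the same `prev()`-based steps against a hypothetical `datatable` (the host names, timestamps and counter values are made up for illustration and do not come from a real cluster). It assumes, as the query itself seems to, that `Val` is a cumulative counter that can drop back when it is reset.

```kusto
// Hypothetical sample data: cumulative err_out counters for two hosts.
// On node-a the counter drops from 12 to 3, which simulates a counter reset.
datatable(TimeGenerated: datetime, HostName: string, Interface: string, Val: real)
[
    datetime(2020-01-01 00:00:00), 'node-a', 'eth0', 10.0,
    datetime(2020-01-01 00:01:00), 'node-a', 'eth0', 12.0,
    datetime(2020-01-01 00:03:00), 'node-a', 'eth0', 3.0,
    datetime(2020-01-01 00:00:00), 'node-b', 'eth0', 100.0,
    datetime(2020-01-01 00:01:00), 'node-b', 'eth0', 160.0
]
// Sort so that each row sits directly after the previous sample of the same host/interface.
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
// The first row of each partition has no predecessor, so its previous timestamp is set to null.
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)),
         PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Seconds = datetime_diff('second', TimeGenerated, PrevTimeGenerated)
// If the counter went down, treat it as a reset and count from zero; otherwise use the difference.
| extend Rate = iif(PrevVal > Val, Val / Seconds, (Val - PrevVal) / Seconds)
| project TimeGenerated, partitionKey, Val, PrevVal, Seconds, Rate
```

With this sample, the first row of each partition is dropped, the ordinary rows get (Val - PrevVal) / Seconds (for example 60 / 60 on node-b), and the reset row on node-a gets Val / Seconds (3 / 120) instead of a negative rate.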
Errors In/sec
Note: the screenshot is omitted because the query returned no records when I captured it.
let errorsInPerSecond = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/net'
| where Name == 'err_in'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Interface = tostring(Tags.interface)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Interface in ('*')
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / datetime_diff('Second', TimeGenerated, PrevTimeGenerated), (Val - PrevVal) / datetime_diff('Second', TimeGenerated, PrevTimeGenerated))
| where isnotnull(Rate)
| project TimeGenerated, HostName, Interface, Rate;
let maxOn = indexof("Average", 'Max');
let avgOn = indexof("Average", 'Average');
let minOn = indexof("Average", 'Min');
errorsInPerSecond
| make-series Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) default=0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Interface
| extend Name = strcat(HostName, Interface)
| project-away HostName, Interface
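The three `let` lines with `indexof("Average", ...)` look odd at first. My reading, which is an assumption and not something confirmed here, is that `"Average"` is where the workbook substitutes the aggregation chosen in its dropdown, and the nested `iif` then picks `avg`, `max` or `min` accordingly. A minimal sketch of that selection, with a hypothetical `aggregation` variable standing in for the substituted value and made-up rates:

```kusto
// Hypothetical stand-in for the value the workbook would inject; try 'Average' or 'Min' as well.
let aggregation = 'Max';
// indexof returns -1 when the text is not found, so exactly one of these is not -1.
let maxOn = indexof(aggregation, 'Max');
let avgOn = indexof(aggregation, 'Average');
let minOn = indexof(aggregation, 'Min'); // defined for symmetry; min() is simply the fallback branch
// Hypothetical per-second rates for a single host.
datatable(HostName: string, Rate: real)
[
    'node-a', 1.0,
    'node-a', 3.0,
    'node-a', 8.0
]
| summarize Val = iif(avgOn != -1, avg(Rate), iif(maxOn != -1, max(Rate), min(Rate))) by HostName
```

The sketch uses `summarize` instead of `make-series` only to keep the result to a single row; the selection expression is the same as in the workbook query.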
Total Errors Out (by 10m intervals)
Note: the screenshot is omitted because the query returned no records when I captured it.
let totalErrorsOut = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/net'
| where Name == 'err_out'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Interface = tostring(Tags.interface)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Interface in ('*')
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val, Val - PrevVal)
| where isnotnull(Rate)
| project TimeGenerated, HostName, Interface, Rate;
totalErrorsOut
| make-series Val = sum(Rate) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Interface
| extend Name = strcat(HostName, '/', Interface)
| project-away HostName, Interface
Total Errors In (by 10m intervals)
let totalErrorsIn = InsightsMetrics
| where Origin == 'container.azm.ms/telegraf'
| where Namespace == 'container.azm.ms/net'
| where Name == 'err_in'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Interface = tostring(Tags.interface)
| where '*' in ('*') or HostName in ('*')
| where '*' in ('*') or Interface in ('*')
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val, Val - PrevVal)
| where isnotnull(Rate)
| project TimeGenerated, HostName, Interface, Rate;
totalErrorsIn
| make-series Val = sum(Rate) default = 0 on TimeGenerated from ago(21600s) to now() step 10m by HostName, Interface
| extend Name = strcat(HostName, '/', Interface)
| project-away HostName, Interface
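The "Total ... (by 10m intervals)" queries differ from the per-second ones in two places: the difference between samples is not divided by the elapsed seconds, and `make-series` sums those differences per 10-minute bin. The sketch below shows this on a hypothetical `datatable` with a fixed time range in place of `ago(21600s)` and `now()`; all values are made up for illustration.

```kusto
// Hypothetical sample data: cumulative err_in counter readings for one interface.
// The last reading (2) is lower than the one before it (11), which simulates a counter reset.
datatable(TimeGenerated: datetime, HostName: string, Interface: string, Val: real)
[
    datetime(2020-01-01 00:00:00), 'node-a', 'eth0', 5.0,
    datetime(2020-01-01 00:04:00), 'node-a', 'eth0', 7.0,
    datetime(2020-01-01 00:12:00), 'node-a', 'eth0', 11.0,
    datetime(2020-01-01 00:18:00), 'node-a', 'eth0', 2.0
]
| extend partitionKey = strcat(HostName, '/', Interface)
| order by partitionKey asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(partitionKey) != partitionKey, 0.0, prev(Val)),
         PrevTimeGenerated = iif(prev(partitionKey) != partitionKey, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
// Plain difference between consecutive samples (2, 4, and 2 after the reset); no division by seconds.
| extend Delta = iif(PrevVal > Val, Val, Val - PrevVal)
// Sum the differences into 10-minute bins; bins without samples get the default value 0.
| make-series Total = sum(Delta) default = 0 on TimeGenerated
    from datetime(2020-01-01 00:00:00) to datetime(2020-01-01 00:30:00) step 10m
    by HostName, Interface
```

Here the 00:04 difference falls into the first bin and the 00:12 and 00:18 differences are added together in the second bin, which is the kind of series the workbook chart plots.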
References
- Docs: How to enable Azure Monitor for containers
- Docs: Get started with Log Analytics in Azure Monitor
- Docs: Get started with log queries in Azure Monitor
- Docs: How to query logs from Azure Monitor for containers
- Docs: Collect custom metrics for a Linux VM with the InfluxData Telegraf agent
- Docs: Create interactive reports with Azure Monitor workbooks
- Qiita: Azure Monitor for containersが晴れてGAしました! (Azure Monitor for containers has finally reached GA!)