Cut costs, speed up incident response, and take on AI with New Relic! A chance to win an original HHKB?! by New Relic Advent Calendar 2025

Monitoring a Windows GPU with New Relic

Posted at 2025-12-28. Last updated at 2025-12-28.

Windows GPU monitoring with New Relic

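Running local LLMs on a Windows GPU has become fairly common lately, so it is worth monitoring the GPU to make sure it does not burn out. This post sets that monitoring up. Collecting the GPU data did not work at first, so what follows is the result of some trial and error.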

Step 1: Install the New Relic Infrastructure Agent

Log in to one.newrelic.com
Select + Integrations & Agents → Guided install
Select Windows, then run the displayed PowerShell command as administrator

Step 2: Create the PowerShell script

Create the folder:

New-Item -ItemType Directory -Path "C:\newrelic-scripts" -Force

Create C:\newrelic-scripts\nvidia-smi-json.ps1:

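# Query the GPU as one CSV row with no header and no units.
# Note: this assumes a single GPU and that every numeric field returns a number.
$d = nvidia-smi --query-gpu=name,driver_version,index,fan.speed,pstate,memory.total,memory.used,memory.free,utilization.gpu,utilization.memory,temperature.gpu,power.draw,power.limit,clocks.current.graphics,clocks.current.memory --format=csv,noheader,nounits
# Split the row on commas, dropping the whitespace that follows each one.
$v = $d -split ',\s*'
# Build a hashtable with typed values and print it as single-line JSON.
@{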
    gpu_name=$v[0]
    driver_version=$v[1]
    gpu_index=[int]$v[2]
    fan_speed=[int]$v[3]
    pstate=$v[4]
    memory_total=[int]$v[5]
    memory_used=[int]$v[6]
    memory_free=[int]$v[7]
    utilization_gpu=[int]$v[8]
    utilization_memory=[int]$v[9]
    temperature_gpu=[int]$v[10]
    power_draw=[float]$v[11]
    power_limit=[float]$v[12]
    clocks_graphics=[int]$v[13]
    clocks_memory=[int]$v[14]
} | ConvertTo-Json -Compress

Verify that it works:

powershell -ExecutionPolicy Bypass -File C:\newrelic-scripts\nvidia-smi-json.ps1
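
The script above assumes a single GPU whose numeric fields all return numbers. nvidia-smi may report `[N/A]` for a field the hardware does not expose (fan speed on some laptops, for example), in which case the `[int]` cast throws, and with several GPUs only the first row is used. The following is a minimal sketch of a more defensive variant, not the author's script. `ConvertTo-NullableNumber` and `ConvertFrom-NvidiaSmiLine` are hypothetical helper names, and whether Flex turns a JSON array into one sample per element is an assumption worth confirming with the dry run in Step 4.

```powershell
# Hypothetical helper: parse a number, returning $null for values such as "[N/A]".
function ConvertTo-NullableNumber {
    param([string]$Text)
    $parsed = 0.0
    $ok = [double]::TryParse($Text, [Globalization.NumberStyles]::Float, [cultureinfo]::InvariantCulture, [ref]$parsed)
    if ($ok) { $parsed } else { $null }
}

# Hypothetical helper: turn one CSV row from nvidia-smi into an ordered hashtable.
function ConvertFrom-NvidiaSmiLine {
    param([string]$Line)
    $v = $Line -split ',\s*'
    [ordered]@{
        gpu_name           = $v[0]
        driver_version     = $v[1]
        gpu_index          = ConvertTo-NullableNumber $v[2]
        fan_speed          = ConvertTo-NullableNumber $v[3]
        pstate             = $v[4]
        memory_total       = ConvertTo-NullableNumber $v[5]
        memory_used        = ConvertTo-NullableNumber $v[6]
        memory_free        = ConvertTo-NullableNumber $v[7]
        utilization_gpu    = ConvertTo-NullableNumber $v[8]
        utilization_memory = ConvertTo-NullableNumber $v[9]
        temperature_gpu    = ConvertTo-NullableNumber $v[10]
        power_draw         = ConvertTo-NullableNumber $v[11]
        power_limit        = ConvertTo-NullableNumber $v[12]
        clocks_graphics    = ConvertTo-NullableNumber $v[13]
        clocks_memory      = ConvertTo-NullableNumber $v[14]
    }
}

# Only call nvidia-smi when it is actually on the PATH.
if (Get-Command nvidia-smi -ErrorAction SilentlyContinue) {
    $fields = 'name,driver_version,index,fan.speed,pstate,memory.total,memory.used,memory.free,' +
              'utilization.gpu,utilization.memory,temperature.gpu,power.draw,power.limit,' +
              'clocks.current.graphics,clocks.current.memory'
    # One output line per GPU; empty lines are skipped.
    nvidia-smi "--query-gpu=$fields" '--format=csv,noheader,nounits' |
        Where-Object { $_ } |
        ForEach-Object { ConvertFrom-NvidiaSmiLine $_ } |
        ConvertTo-Json -Compress
}
```

With one GPU this still prints a single JSON object, so the Flex configuration in Step 3 does not need to change. Numeric values come out as doubles rather than integers, and unparsable fields become `null` instead of stopping the script.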

Step 3: Create the Flex configuration file

Create C:\Program Files\New Relic\newrelic-infra\integrations.d\nvidia-smi-gpu-monitoring.yml:

---
integrations:
  - name: nri-flex
    interval: 30s
    config:
      name: NvidiaSMI
      apis:
        - name: NvidiaGpu
          commands:
            - run: powershell -ExecutionPolicy Bypass -File C:\newrelic-scripts\nvidia-smi-json.ps1

Step 4: Verify that it works

cd "C:\Program Files\New Relic\newrelic-infra\newrelic-integrations"
.\nri-flex.exe -verbose -pretty -config_path "C:\Program Files\New Relic\newrelic-infra\integrations.d\nvidia-smi-gpu-monitoring.yml"

If gpu_name, memory_used, temperature_gpu, and the other fields appear under metrics, it is working.
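
If the Flex dry run shows no metrics, it can help to check the script's output on its own before looking at the Flex configuration. The following is a minimal sketch of such a check. `Get-MissingGpuKey` is a hypothetical helper name, and the list of expected keys is simply the three fields named above.

```powershell
# Hypothetical helper: return the expected keys that are missing from a JSON document.
function Get-MissingGpuKey {
    param(
        [string]$Json,
        [string[]]$Expected = @('gpu_name', 'memory_used', 'temperature_gpu')
    )
    $parsed = $Json | ConvertFrom-Json
    # If the JSON is an array of objects, inspect only the first one.
    $first = $parsed | Select-Object -First 1
    $present = $first.PSObject.Properties.Name
    $Expected | Where-Object { $_ -notin $present }
}

$script = 'C:\newrelic-scripts\nvidia-smi-json.ps1'
if (Test-Path $script) {
    # Join the output lines so the JSON is passed as a single string.
    $out = (powershell -ExecutionPolicy Bypass -File $script) -join "`n"
    $missing = Get-MissingGpuKey -Json $out
    if ($missing) { "Missing keys: $($missing -join ', ')" } else { 'All expected keys present' }
}
```

A parse error from `ConvertFrom-Json` here points at the script itself, for example an error message printed instead of JSON, rather than at Flex.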

Step 5: Restart the Infrastructure Agent

Restart-Service newrelic-infra
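
After the restart, it may be worth confirming that the service actually came back up before waiting for data in the UI. The following is a minimal sketch, assuming the service name `newrelic-infra` used above. `Get-ServiceStatusMessage` is a hypothetical helper name and the 30-second timeout is an arbitrary choice.

```powershell
# Hypothetical helper: report whether a Windows service reaches the Running state.
function Get-ServiceStatusMessage {
    param(
        [string]$Name = 'newrelic-infra',
        [int]$TimeoutSeconds = 30
    )
    $svc = Get-Service -Name $Name -ErrorAction SilentlyContinue
    if (-not $svc) { return "$Name service not found" }
    try {
        # Blocks until the service is Running or the timeout elapses.
        $svc.WaitForStatus('Running', [TimeSpan]::FromSeconds($TimeoutSeconds))
    } catch {
        # WaitForStatus throws on timeout; fall through and report the last known status.
    }
    $svc.Refresh()
    "$Name is $($svc.Status)"
}

Get-ServiceStatusMessage
```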

Step 6: Check the data in the New Relic UI

After a few minutes, go to one.newrelic.com → Query Your Data and run:

SELECT * FROM NvidiaGpuSample SINCE 10 minutes ago

Available metrics

Metric	Description
`gpu_name`	GPU name
`utilization_gpu`	GPU utilization (%)
`utilization_memory`	VRAM controller utilization (%)
`memory_used`	VRAM used (MiB)
`memory_total`	Total VRAM (MiB)
`memory_free`	Free VRAM (MiB)
`temperature_gpu`	GPU temperature (°C)
`power_draw`	Power draw (W)
`power_limit`	Power limit (W)
`fan_speed`	Fan speed (%)
`clocks_graphics`	GPU clock (MHz)
`clocks_memory`	Memory clock (MHz)
`pstate`	Performance state

Example query for monitoring LM Studio load

-- GPU utilization and VRAM usage over time
SELECT average(utilization_gpu) AS 'GPU %',
       average(memory_used) AS 'VRAM Used (MiB)',
       average(temperature_gpu) AS 'Temp (C)'
FROM NvidiaGpuSample
TIMESERIES AUTO

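Summary

This turned out not to be very difficult. If you have a separate PC dedicated to LLMs, monitoring it gives you some peace of mind should anything go wrong. Every now and then someone sets up monitoring on their own work PC, forgets to remove it, and it keeps showing up in the dev environment, so watch out for that as well.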
