Analyzing Windows Performance Logs with a Script

Last updated at 2024-06-24 / Posted at 2024-05-16

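Introduction

To find out whether the load on a Windows server (OS) is on the CPU, memory, or disk I/O, I sometimes review the Windows performance logs.

However, with multiple servers each writing one performance log per day, reviewing a week's or a month's worth of logs is very laborious.

So I wrote a script that analyzes the items I want to check in the Windows performance logs and writes out a consolidated summary.

For how to review performance logs, the following article may also be helpful.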

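Script overview

The script reads the CSV-format performance logs in the directories specified in the Config file, one file at a time (the file name pattern is also specified in the Config).

It parses each log, and when it finds a column whose header contains one of the strings (performance counters) specified in the Config, it collects that column's values.

Note: When Windows performance logs are exported as CSV, each performance counter name is prefixed with the host name. For that reason the script targets columns whose header contains the string (performance counter) specified in the Config, rather than requiring an exact match.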
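
As a minimal sketch of this substring matching, the snippet below selects header columns that contain a configured counter string. The host name `SERVER01`, the time zone text in the first cell, and the helper name `matching_column_indices` are made up for illustration.

```python
def matching_column_indices(header_row, columns):
    """Return indices of header cells that contain any configured counter string."""
    return [
        i for i, header in enumerate(header_row)
        if any(column in header for column in columns)
    ]

# Hypothetical header row; "SERVER01" is a made-up host name.
header_row = [
    "(PDH-CSV 4.0) (Tokyo Standard Time)(-540)",
    "\\\\SERVER01\\Memory\\Available MBytes",
    "\\\\SERVER01\\Processor(_Total)\\% Processor Time",
    "\\\\SERVER01\\Processor(0)\\% Processor Time",
]
columns = ["Memory\\Available MBytes", "Processor(_Total)\\% Processor Time"]

indices = matching_column_indices(header_row, columns)
print(indices)  # [1, 2]
```

The per-core column `Processor(0)` is not selected because it does not contain the configured string `Processor(_Total)\% Processor Time`.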

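For each column (performance counter), it writes the minimum, maximum, mean, and median values, along with the date and time at which the minimum and maximum were recorded, to the analysis result file specified in the Config.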
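
The summary for one counter can be sketched as follows. This is an illustration under assumed sample data, not the script itself; the function name `summarize` and the sample values are hypothetical.

```python
import datetime
import statistics

def summarize(samples):
    """Summarize one counter. samples is a list of (timestamp, value) pairs."""
    values = [value for _, value in samples]
    # max()/min() with a key return the first pair holding the extreme value.
    max_time, max_value = max(samples, key=lambda sample: sample[1])
    min_time, min_value = min(samples, key=lambda sample: sample[1])
    return {
        "max": max_value,
        "max_datetime": max_time,
        "min": min_value,
        "min_datetime": min_time,
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }

# Made-up samples for a single counter.
samples = [
    (datetime.datetime(2024, 5, 16, 0, 0), 10.0),
    (datetime.datetime(2024, 5, 16, 0, 1), 40.0),
    (datetime.datetime(2024, 5, 16, 0, 2), 25.0),
]
summary = summarize(samples)
print(summary["max"], summary["max_datetime"])
```

Keeping each value paired with its timestamp avoids having to look up the index of the extreme value afterwards.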

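Update (2024/05/22):
The originally published version output, across all performance log files, the minimum, maximum, mean, and median of each performance counter, along with the date and time of the minimum and maximum. I have since added a script that outputs the daily minimum, maximum, and mean of each performance counter for each log directory (server).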

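Information to check in the Windows performance logs

For this article, I targeted the following items in the Windows performance logs.
The table below lists the performance counters to specify in the Config file and what their values mean.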

Counter name specified in Config	Meaning of the value
System\Processor Queue Length	Number of threads waiting in the processor queue
Processor(_Total)\% Processor Time	Overall CPU usage (%)
PhysicalDisk(_Total)\Avg. Disk Read Queue Length	Average number of read requests waiting in the disk queue
PhysicalDisk(_Total)\Avg. Disk Write Queue Length	Average number of write requests waiting in the disk queue
Memory\Available MBytes	Available memory (MB)
LogicalDisk(C:)\% Free Space	Free space on drive C (%)
LogicalDisk(D:)\% Free Space	Free space on drive D (%)
LogicalDisk(E:)\% Free Space	Free space on drive E (%)
LogicalDisk(F:)\% Free Space	Free space on drive F (%)

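Specify the free-space counters to match the actual drive layout.
That said, specifying a performance counter in the Config that does not exist in the performance log only means that no result is output for it; it does not otherwise affect operation. If the number of CPU cores or the drive layout differs between servers, list counters in the Config for the largest expected configuration.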
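
If it is useful to know which configured counters produced no result for a given log, a small check like the one below could be run against a header row. This helper is not part of the scripts in this article; `unmatched_counters` and the sample header are hypothetical.

```python
def unmatched_counters(header_row, columns):
    """Return configured counter strings that match no header cell."""
    return [
        column for column in columns
        if not any(column in header for header in header_row)
    ]

# Hypothetical header row of a server that only has drives C and D.
header_row = [
    "(PDH-CSV 4.0) (Tokyo Standard Time)(-540)",
    "\\\\SERVER01\\LogicalDisk(C:)\\% Free Space",
    "\\\\SERVER01\\LogicalDisk(D:)\\% Free Space",
]
columns = [
    "LogicalDisk(C:)\\% Free Space",
    "LogicalDisk(D:)\\% Free Space",
    "LogicalDisk(E:)\\% Free Space",
]

missing = unmatched_counters(header_row, columns)
print(missing)
```

Here only the drive E counter is reported as missing, which matches the behavior described above: it is silently absent from the results.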

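Config file

Specify it as follows.

In Directories, specify the directories where the performance logs are located (comma-separated when specifying more than one)
In OutputFiles, specify the file paths to which the results of "analyze_win_performance_log.py" are written (if multiple Directories are specified, specify the same number of OutputFiles)
In OutputFilesPerDay, specify the file paths to which the results of "analyze_win_performance_log_per_day.py" are written (if multiple Directories are specified, specify the same number of OutputFilesPerDay)
In FileName, specify the file name pattern of the performance logs
In Columns, specify the performance counter strings to analyze (what each counter is used to check is described in the section above)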

#config.ini
[PerformanceMonitor]
Directories = C:/performance/XX/SystemA_AP#1,
              C:/performance/XX/SystemA_DB,
              C:/performance/XX/SystemB_AP#1,
              C:/performance/XX/SystemB_AP#2,
              C:/performance/XX/SystemB_DB

OutputFiles = C:/performance/XX/perfmon_result_systemA_ap#1.csv,
              C:/performance/XX/perfmon_result_systemA_db.csv,
              C:/performance/XX/perfmon_result_systemB_ap#1.csv,
              C:/performance/XX/perfmon_result_systemB_ap#2.csv,
              C:/performance/XX/perfmon_result_systemB_db.csv

OutputFilesPerDay = C:/performance/XX/perfmon_result_per_day_systemA_ap#1.csv,
                    C:/performance/XX/perfmon_result_per_day_systemA_db.csv,
                    C:/performance/XX/perfmon_result_per_day_systemB_ap#1.csv,
                    C:/performance/XX/perfmon_result_per_day_systemB_ap#2.csv,
                    C:/performance/XX/perfmon_result_per_day_systemB_db.csv

FileName = perf_day*.csv

Columns = System\Processor Queue Length,
          Processor(_Total)\% Processor Time,
          PhysicalDisk(_Total)\Avg. Disk Read Queue Length,
          PhysicalDisk(_Total)\Avg. Disk Write Queue Length,
          Memory\Available MBytes,
          LogicalDisk(C:)\% Free Space,
          LogicalDisk(D:)\% Free Space,
          LogicalDisk(E:)\% Free Space,
          LogicalDisk(F:)\% Free Space

To add or change the performance counters to analyze, edit Columns in the Config file.
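
The sketch below shows how a Config of this shape can be read with `configparser`, using a shortened in-memory copy of the file. It also illustrates why the scripts pass `interpolation=None`: counter names such as `\% Processor Time` contain a bare `%`, which the default interpolation rejects when the value is read. The names `CONFIG_TEXT` and `split_list` are introduced here for illustration only.

```python
import configparser

# Shortened, in-memory stand-in for config.ini (illustration only).
CONFIG_TEXT = r"""
[PerformanceMonitor]
Directories = C:/performance/XX/SystemA_AP#1,
              C:/performance/XX/SystemA_DB
Columns = System\Processor Queue Length,
          Processor(_Total)\% Processor Time
"""

def split_list(raw):
    """Split a comma-separated, possibly multi-line value into clean items."""
    return [item.strip() for item in raw.split(",")]

# interpolation=None keeps "%" as a literal character.
config = configparser.ConfigParser(interpolation=None)
config.read_string(CONFIG_TEXT)
section = config["PerformanceMonitor"]
directories = split_list(section["Directories"])
columns = split_list(section["Columns"])
print(directories)
print(columns)

# With the default interpolation, reading the same value raises an error.
default_config = configparser.ConfigParser()
default_config.read_string(CONFIG_TEXT)
try:
    default_config["PerformanceMonitor"]["Columns"]
    default_failed = False
except configparser.InterpolationError:
    default_failed = True
print(default_failed)
```

Indented continuation lines are joined into one value, so splitting on commas and stripping whitespace recovers the individual entries. The `#` inside a path is kept because it does not start a line.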

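Windows performance log analysis scripts

Update (2024/06/24):
In virtual environments, the performance counters in the Windows performance logs can vary from day to day, and the script failed to run in that case. I fixed this problem.

■analyze_win_performance_log.py

Outputs, across all performance log files, the minimum, maximum, mean, and median of each performance counter, along with the date and time at which the minimum and maximum were recorded.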

import csv
import glob
import os
import datetime
import configparser
import statistics

def analyze_performance_logs(log_directory, output_file, file_name, columns):
    # Samples are keyed by header name rather than by column position,
    # because the set and order of counters can differ from file to file.
    samples = {}  # header name -> list of (value, timestamp) pairs

    for file_path in sorted(glob.glob(os.path.join(log_directory, file_name))):
        print(file_path)
        with open(file_path, "r") as file:
            reader = csv.reader(file)
            header_row = next(reader)  # header row of the current file

            column_indices = [i for i, header in enumerate(header_row) if any(column in header for column in columns)]

            if not column_indices:
                raise ValueError("No columns found in the header")

            for row in reader:
                if len(row) > 0 and not row[0].startswith("("):
                    timestamp = datetime.datetime.strptime(row[0], "%m/%d/%Y %H:%M:%S.%f")
                    for i in column_indices:
                        # Skip cells that are missing or blank in this row
                        value = row[i].strip() if i < len(row) else ""
                        if value != "":
                            samples.setdefault(header_row[i], []).append((float(value), timestamp))

    results = []
    results.append(["Column", "Max", "Max DateTime", "Min", "Min DateTime", "Mean", "Median"])

    for header, pairs in samples.items():
        values = [value for value, _ in pairs]

        # max()/min() return the first pair holding the extreme value
        max_value, max_datetime = max(pairs, key=lambda pair: pair[0])
        min_value, min_datetime = min(pairs, key=lambda pair: pair[0])
        mean_value = statistics.mean(values)
        median_value = statistics.median(values)

        results.append([header, max_value, max_datetime, min_value, min_datetime, mean_value, median_value])

    with open(output_file, "w", newline="") as file:
        writer = csv.writer(file)
        writer.writerows(results)

    print(f'Analysis results saved to {output_file}.')

if __name__ == "__main__":
    config = configparser.ConfigParser(interpolation=None)
    config_file = 'C:/performance/scripts/config/config.ini'
    config.read(config_file)

    directories = [directory.strip() for directory in config["PerformanceMonitor"]["Directories"].split(",")]
    output_files = [file.strip() for file in config["PerformanceMonitor"]["OutputFiles"].split(",")]

    if len(directories) != len(output_files):
        raise ValueError("Number of log directories and output files doesn't match")

    file_name = config["PerformanceMonitor"]["FileName"]
    columns = [column.strip() for column in config["PerformanceMonitor"]["Columns"].split(",")]

    for directory, output_file in zip(directories, output_files):
        print(f'Checking directory {directory}.')
        analyze_performance_logs(directory, output_file, file_name, columns)

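Example output of analyze_win_performance_log.py

Running analyze_win_performance_log.py produces a CSV like the following, with one row per performance counter and the columns Column, Max, Max DateTime, Min, Min DateTime, Mean, and Median.
Note: This is the result (perfmon_result_systemA_ap#1.csv) of analyzing, per performance counter, all 11 Windows performance log files from one server.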
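
To see how the script reads its input, here is a minimal sketch that parses a tiny in-memory log in the same way: it skips blank cells and parses timestamps with the `%m/%d/%Y %H:%M:%S.%f` format used in the script. The sample rows, the host name `SERVER01`, and the function name `read_samples` are made up for illustration and are not real measurements.

```python
import csv
import datetime
import io

# Made-up sample in the CSV layout the script expects (not real data).
SAMPLE_LOG = '''"(PDH-CSV 4.0) (Tokyo Standard Time)(-540)","\\\\SERVER01\\Memory\\Available MBytes"
"05/16/2024 00:00:15.123"," "
"05/16/2024 00:01:15.125","2048"
"05/16/2024 00:02:15.127","1536"
'''

def read_samples(text, column_index=1):
    """Return (timestamp, value) pairs for one column, skipping blank cells."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    samples = []
    for row in reader:
        if not row or row[0].startswith("("):
            continue
        cell = row[column_index].strip()
        if cell == "":
            continue  # no value recorded for this sample
        timestamp = datetime.datetime.strptime(row[0], "%m/%d/%Y %H:%M:%S.%f")
        samples.append((timestamp, float(cell)))
    return samples

samples = read_samples(SAMPLE_LOG)
print(len(samples))
print(samples[0])
```

The first data row is dropped because its value cell is blank, so only two samples remain.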

■analyze_win_performance_log_per_day.py

Outputs the maximum, minimum, and mean of each performance counter per day (per performance log file).

import csv
import glob
import os
import datetime
import configparser
import re
import statistics

def analyze_performance_logs(log_directory, output_file, file_name, columns):
    header_row = None
    results_per_file = []

    for file_path in sorted(glob.glob(os.path.join(log_directory, file_name))):  # sorted so output rows follow file-name order
        print(file_path)
        with open(file_path, "r") as file:
            reader = csv.reader(file)
            header_row = next(reader)

            column_indices = [i for i, header in enumerate(header_row) if any(column in header for column in columns)]

            if not column_indices:
                raise ValueError("No columns found in the header")

            values_per_file = [[] for _ in column_indices]  # one list of values per matched column in this file
            datetimes_per_file = []

            for row in reader:
                if len(row) > 0 and not row[0].startswith("("):
                    for i, column_index in enumerate(column_indices):
                        if column_index >= len(row):
                            continue

                        value = float(row[column_index]) if row[column_index].strip() else None
                        if value is not None:
                            values_per_file[i].append(value)
                    datetimes_per_file.append(datetime.datetime.strptime(row[0], "%m/%d/%Y %H:%M:%S.%f"))

            max_values_per_file = [max(values) if values else 0 for values in values_per_file]
            min_values_per_file = [min(values) if values else 0 for values in values_per_file]
            mean_values_per_file = [statistics.mean(values) if values else 0 for values in values_per_file]

            file_date = min(datetimes_per_file).strftime("%Y/%m/%d") if datetimes_per_file else ""  # date of the earliest sample in this file
            result_per_file = [file_date]

            for i, column_index in enumerate(column_indices):
                result_per_file.append(max_values_per_file[i])
                result_per_file.append(min_values_per_file[i])
                result_per_file.append(mean_values_per_file[i])

            results_per_file.append(result_per_file)

    header = ["Date"]
    for column_index in column_indices:
        performance_counter_name = re.sub(r"\\\\[^\\]+\\", "", header_row[column_index])  # strip the leading host-name prefix, keeping only the counter name
        header.append(f"{performance_counter_name}(Max)")
        header.append(f"{performance_counter_name}(Min)")
        header.append(f"{performance_counter_name}(Avg)")

    results = [header] + results_per_file

    with open(output_file, "w", newline="") as file:
        writer = csv.writer(file)
        writer.writerows(results)

    print(f"Analysis results saved to {output_file}.")

if __name__ == "__main__":
    config = configparser.ConfigParser(interpolation=None)
    config_file = 'C:/performance/scripts/config/config.ini'
    config.read(config_file)

    directories = [directory.strip() for directory in config["PerformanceMonitor"]["Directories"].split(",")]
    output_files = [file.strip() for file in config["PerformanceMonitor"]["OutputFilesPerDay"].split(",")]

    if len(directories) != len(output_files):
        raise ValueError("Number of log directories and output files doesn't match")

    file_name = config["PerformanceMonitor"]["FileName"]
    columns = [column.strip() for column in config["PerformanceMonitor"]["Columns"].split(",")]

    for directory, output_file in zip(directories, output_files):
        print(f"Checking directory {directory}.")
        analyze_performance_logs(directory, output_file, file_name, columns)

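Example output of analyze_win_performance_log_per_day.py

Running analyze_win_performance_log_per_day.py produces a CSV like the following, with one row per day (file) and Max, Min, and Avg columns for each performance counter.
Note: This is the result (perfmon_result_per_day_systemA_ap#1.csv) of analyzing the 11 Windows performance log files from one server per day (per file) and per performance counter.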
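
The header row of the per-day file is built by stripping the host-name prefix from each column header and appending (Max), (Min), and (Avg). The sketch below isolates that step with the same regular expression as the script. The host name `SERVER01` and the helper names `counter_name` and `per_day_header` are hypothetical.

```python
import re

# Matches a leading "\\HOST\" prefix: two backslashes, the host name, one backslash.
HOST_PREFIX = re.compile(r"\\\\[^\\]+\\")

def counter_name(header):
    """Return the header with the host-name prefix removed."""
    return HOST_PREFIX.sub("", header)

def per_day_header(headers):
    """Build the header row: Date, then Max/Min/Avg columns per counter."""
    row = ["Date"]
    for header in headers:
        name = counter_name(header)
        row += [f"{name}(Max)", f"{name}(Min)", f"{name}(Avg)"]
    return row

# Hypothetical column header; "SERVER01" is a made-up host name.
headers = ["\\\\SERVER01\\Memory\\Available MBytes"]
header_row = per_day_header(headers)
print(header_row)
```

Removing the host name keeps the column titles identical across servers, which makes the per-day files easier to compare side by side.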

With the summary output from the scripts, analyzing Windows performance logs should be considerably more efficient than visually checking the files one by one.

Conclusion

This article summarized how to analyze Windows performance logs with a script.
