ClickHouse管理用クエリ

ClickHouse

Posted at 2019-04-18

ClickHouse管理用クエリ

ClickHouse を使う上で便利に使えそうなクエリのメモです。

MergeTreeのモニタリング

DelayedInserts の監視。こちらが増えている場合、INSERTを頻繁にしすぎです。

SELECT *
FROM system.metrics
WHERE metric = 'DelayedInserts'
FORMAT Vertical

MaxPartCountForPartition の監視。
こちらが多すぎる場合、INSERTにウェイトが入って待たされたり、
限界を超えるとINSERTが失敗します。
デフォルトでは150を超えるとウェイトが入り、300を超えるとエラーになります。
（参考：https://github.com/yandex/ClickHouse/blob/master/dbms/src/Storages/MergeTree/MergeTreeSettings.h）

SELECT *
FROM system.asynchronous_metrics
WHERE metric = 'MaxPartCountForPartition'
FORMAT Vertical

パートが多いテーブル・パーティションの確認。

SELECT
    database,
    partition,
    table,
    count(*) AS part_count
FROM system.parts
WHERE active
GROUP BY
    database,
    partition,
    table
ORDER BY count(*) DESC
LIMIT 100

query_log

コンフィグで log_queries を設定しておくと、クエリの実行結果が
system.query_log テーブルに記録されます。

コンフィグの記載箇所は以下の階層です。

<yandex>
    <profiles>
        <default>
            <log_queries>1</log_queries>
        </default>
    </profiles>
</yandex>

system.query_log は現時点ではまだドキュメントに詳しいことが記載されていません。

typeカラムについては、以下のような仕様です。
クエリの実行速度などを見るときは、type = 2 を確認します。

1 - successful start of query execution
2 - successful end of query execution
3 - exception before start of query execution
4 - exception while query execution

Each query create one or two events in query_log table:

two events with types 1 and 2: usual, successful case;
two events with types 1 and 4;
single event with type 3;

引用元：https://github.com/yandex/ClickHouse/issues/847#issuecomment-306013716

ProfileEvents.Names, ProfileEvents.Values には、プロファイルされた統計情報が入ります。
配列になっているので、この値を利用する場合は ARRAY JOIN すると便利です。

どのような項目があるかは、以下のソースが参考になります。
コメントがない項目についてはソースを読むしかありません。。。

実行回数が多いクエリ

SELECT
    query,
    COUNT(*) AS executes
FROM system.query_log
GROUP BY query
ORDER BY executes DESC
LIMIT 10
FORMAT Vertical

一部分を同一視したい場合は、適当に置き換えて集計します。

例：コメントを除外して同一視する場合はこちら。

SELECT
    replaceRegexpAll(query, '/\\*.*?\\*/', '') AS query2,
    COUNT(*) AS executes
FROM system.query_log
GROUP BY query2
ORDER BY executes DESC
LIMIT 10
FORMAT Vertical

例：数字や文字列っぽいものを同一視。

SELECT
    replaceRegexpAll(replaceRegexpAll(query, '\\d+', '0'), '\'.*?\'', '\'\'') AS query2,
    COUNT(*) AS executes
FROM system.query_log
GROUP BY query2
ORDER BY executes DESC
LIMIT 10
FORMAT Vertical

実行時間が長いもの・IOが多いもの

以下のようなプロファイル項目が利用できます。

Name	説明
RealTimeMicroseconds	Total (wall clock) time spent in processing (queries and other tasks) threads (not that this is a sum).
UserTimeMicroseconds	Total time spent in processing (queries and other tasks) threads executing CPU instructions in user space. This include time CPU pipeline was stalled due to cache misses, branch mispredictions, hyper-threading, etc.
SystemTimeMicroseconds	Total time spent in processing (queries and other tasks) threads executing CPU instructions in OS kernel space. This include time CPU pipeline was stalled due to cache misses, branch mispredictions, hyper-threading, etc.
OSIOWaitMicroseconds	Total time a thread spent waiting for a result of IO operation, from the OS point of view. This is real IO that doesn't include page cache.
OSCPUWaitMicroseconds	Total time a thread was ready for execution but waiting to be scheduled by OS, from the OS point of view.
OSCPUVirtualTimeMicroseconds	CPU time spent seen by OS. Does not include involuntary waits due to virtualization.
OSReadBytes	Number of bytes read from disks or block devices. Doesn't include bytes read from page cache. May include excessive data due to block size, readahead, etc.
OSWriteBytes	Number of bytes written to disks or block devices. Doesn't include bytes that are in page cache dirty pages. May not include data that was written by OS asynchronously.
OSReadChars	Number of bytes read from filesystem, including page cache.
OSWriteChars	Number of bytes written to filesystem, including page cache.

クエリ例。ProfileEvents.Names の条件を書き換えれば、いろいろな集計ができます。

SELECT
    event_time,
    query,
    ProfileEvents.Values
FROM system.query_log
ARRAY JOIN
    ProfileEvents.Names,
    ProfileEvents.Values
WHERE (type = 2) AND (ProfileEvents.Names = 'RealTimeMicroseconds')
ORDER BY ProfileEvents.Values DESC
LIMIT 30
FORMAT Vertical

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up