Formatting AWS ALB access logs as JSON

Posted at 2020-03-09

An AWS ALB can optionally write access logs like the following to S3.

h2 2020-03-08T23:50:58.701251Z app/xxxxxx-prod-alb/xxxxxxxxx 222.222.222.222:64202 - -1 -1 -1 302 - 1254 224 "GET https://example.com:443/action_store HTTP/2.0" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 - "Root=1-xxxxxxx-xxxxxxxxxxxx" "at.m3.com" "arn:aws:acm:ap-northeast-1:xxxxxxxxxx:certificate/xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx" 300 2020-03-08T23:50:58.701000Z "redirect" "https://example.com:443/at/action_store" "-" "-" "-"

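For serious access log analysis, you should use a tool such as Athena.

That said, sometimes you just want to take a quick look at a few specific log entries, and the space-delimited format is hard on the eyes. Converting it to JSON makes it much easier to read.

A space-delimited format is just a variant of CSV, so Python's csv module can parse it.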

#!/usr/local/bin/python
# alb_access_log_to_json.py

import fileinput
import json
import csv

# https://docs.aws.amazon.com/ja_jp/elasticloadbalancing/latest/application/load-balancer-access-logs.html#access-log-entry-format
FIELD_KEYS = """
type
timestamp
elb
client:port
target:port
request_processing_time
target_processing_time
response_processing_time
elb_status_code
target_status_code
received_bytes
sent_bytes
request
user_agent
ssl_cipher
ssl_protocol
target_group_arn
trace_id
domain_name
chosen_cert_arn
matched_rule_priority
request_creation_time
actions_executed
redirect_url
error_reason
target:port_list
target_status_code_list
""".split()

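# Each log line is space-delimited; fields that contain spaces are wrapped in double quotes.
reader = csv.reader(fileinput.input(), delimiter=' ', quotechar='"', escapechar='\\')
for fields in reader:
    # Pair each field with its key; zip stops at the shorter of the two sequences.
    j = dict(zip(FIELD_KEYS, fields))
    print(json.dumps(j))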
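
The script above depends on csv.reader treating a double-quoted field as a single value even when it contains spaces. The following is a minimal sketch that isolates that behavior. The log line is a shortened, made-up sample, not real ALB output, and the variable names are only for illustration.

```python
import csv

# A shortened, made-up log line (not real ALB output) with two quoted fields.
line = 'h2 302 "GET https://example.com:443/action_store HTTP/2.0" "Mozilla/5.0 (Windows NT 10.0)" -'

# csv.reader accepts any iterable of strings, so a one-element list works.
fields = next(csv.reader([line], delimiter=' ', quotechar='"', escapechar='\\'))

# Print each parsed field on its own line to make the boundaries visible.
for field in fields:
    print(repr(field))
```

The request and the user agent each come back as one field with the surrounding quotes removed, while the unquoted values are split on every space.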

Example run of alb_access_log_to_json.py:

$ head -1 access_log.txt
h2 2020-03-08T23:50:58.701251Z app/xxxxxx-prod-alb/xxxxxxxxx 222.222.222.222:64202 - -1 -1 -1 302 - 1254 224 "GET https://example.com:443/action_store HTTP/2.0" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362" ECDHE-RSA-AES128-GCM-SHA256 TLSv1.2 - "Root=1-xxxxxxx-xxxxxxxxxxxx" "at.m3.com" "arn:aws:acm:ap-northeast-1:xxxxxxxxxx:certificate/xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx" 300 2020-03-08T23:50:58.701000Z "redirect" "https://example.com:443/at/action_store" "-" "-" "-"

$ cat access_log.txt | python3 alb_access_log_to_json.py | jq .
{
  "type": "h2",
  "timestamp": "2020-03-08T23:50:58.701251Z",
  "elb": "app/xxxxxx-prod-alb/xxxxxxxxx",
  "client:port": "222.222.222.222:64202",
  "target:port": "-",
  "request_processing_time": "-1",
  "target_processing_time": "-1",
  "response_processing_time": "-1",
  "elb_status_code": "302",
  "target_status_code": "-",
  "received_bytes": "1254",
  "sent_bytes": "224",
  "request": "GET https://example.com:443/action_store HTTP/2.0",
  "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.18362",
  "ssl_cipher": "ECDHE-RSA-AES128-GCM-SHA256",
  "ssl_protocol": "TLSv1.2",
  "target_group_arn": "-",
  "trace_id": "Root=1-xxxxxxx-xxxxxxxxxxxx",
  "domain_name": "at.m3.com",
  "chosen_cert_arn": "arn:aws:acm:ap-northeast-1:xxxxxxxxxx:certificate/xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxx",
  "matched_rule_priority": "300",
  "request_creation_time": "2020-03-08T23:50:58.701000Z",
  "actions_executed": "redirect",
  "redirect_url": "https://example.com:443/at/action_store",
  "error_reason": "-",
  "target:port_list": "-",
  "target_status_code_list": "-"
}
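
One caveat with dict(zip(FIELD_KEYS, fields)): zip stops at the shorter input, so if a log line carries more fields than FIELD_KEYS lists, the surplus is dropped without warning. The AWS documentation linked in the script is the authority on the current field list, and it may list fields beyond the 27 used here. The sketch below is one possible way to keep such fields visible. The helper name to_record and the "_extra" key are hypothetical, and FIELD_KEYS is shortened for illustration.

```python
import csv
import json

# Shortened key list for illustration; the real script uses the full FIELD_KEYS.
FIELD_KEYS = ["type", "timestamp", "elb"]

def to_record(fields, keys=FIELD_KEYS):
    """Map fields to keys, keeping any surplus fields under "_extra"."""
    record = dict(zip(keys, fields))
    extra = fields[len(keys):]
    if extra:
        record["_extra"] = extra  # "_extra" is a hypothetical key name
    return record

# Made-up line with two more fields than FIELD_KEYS has keys.
line = 'h2 2020-03-08T23:50:58.701251Z app/xxxxxx-prod-alb/xxxxxxxxx "-" "-"'
fields = next(csv.reader([line], delimiter=' ', quotechar='"', escapechar='\\'))
print(json.dumps(to_record(fields), indent=2))
```

When a line has exactly as many fields as keys, no "_extra" entry is added, so the output for lines in the older format is the same as with the original script.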