LoginSignup
7
5

More than 5 years have passed since last update.

CloudTrail のログを Hive で集計する

Posted at

使われてなさそうなアクセスキーを削除したいけど、ほんとに使われてないのか自信ないので、CloudTrail のログを集計しようかなと思って調べたら、Hive の神 SerDe を見つけたのでメモ。このブログに書いてあることをやってるだけ。

hive> CREATE EXTERNAL TABLE IF NOT EXISTS cloud_trail_20150120
  ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailLogDeserializer'
  STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
  LOCATION 's3://cloudtrail/AWSLogs/xxxxxxxxxxxx/CloudTrail/ap-northeast-1/2015/01/20/'
;

hive> DESC cloud_trail_20150120;
OK
eventversion            string                  from deserializer
useridentity            struct<type:string,principalid:string,arn:string,accountid:string,invokedby:string,accesskeyid:string,sessioncontext:struct<attributes:struct<mfaauthenticated:string,creationdate:string>>> from deserializer
eventtime               string                  from deserializer
eventsource             string                  from deserializer
eventname               string                  from deserializer
awsregion               string                  from deserializer
sourceipaddress         string                  from deserializer
useragent               string                  from deserializer
requestid               string                  from deserializer
eventid                 string                  from deserializer
Time taken: 0.047 seconds, Fetched: 10 row(s)

hive> SELECT useridentity.accesskeyid, count(*)
  FROM cloud_trail_20150120
  GROUP BY useridentity.accesskeyid
  ORDER BY useridentity.accesskeyid
;
XXXXXXXXXXXXXXXXXXXX    2
XXXXXXXXXXXXXXXXXXXX    1777

hive> SELECT useragent, count(*)
  FROM cloud_trail_20150120
  GROUP BY useragent
  ORDER BY useragent
;
Amazon CLI/ElasticLoadBalancing 1.0.17.0 API 2012-06-01 2
Boto/2.25.0 Python/2.6.9 Linux/3.4.76-65.111.amzn1.x86_64       1777
...
7
5
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
7
5