Create a Lambda function from the datadog-process-rds-metrics Lambda blueprint.
This walkthrough does not use KMS.
Prerequisites
Lambda permissions
You have full permissions for Lambda.
AWS CLI
Verified with the following version:
- AWS CLI 1.11.28
aws --version
Result (example):
aws-cli/1.11.28 Python/2.7.10 Darwin/15.6.0 botocore/1.4.85
If your version is older, update to the latest release.
sudo -H pip install -U awscli
IAM Role
The 'lambdaBasicExecution' role must exist.
IAM_ROLE_NAME='lambdaBasicExecution'
aws iam get-role \
--role-name ${IAM_ROLE_NAME}
Result (example):
{
"Role": {
"AssumeRolePolicyDocument": {
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Principal": {
"Service": "lambda.amazonaws.com"
},
"Effect": "Allow",
"Sid": ""
}
]
},
"RoleId": "AROAXXXXXXXXXXXXXXXXX",
"CreateDate": "2016-12-18T01:23:45Z",
"RoleName": "lambdaBasicExecution",
"Path": "/",
"Arn": "arn:aws:iam::XXXXXXXXXXXX:role/lambdaBasicExecution"
}
}
If the IAM role does not exist, create it by following the steps at
http://qiita.com/tcsh/items/6353876a5c4fef63b4d8 .
0. Preparation
==============
0.1. Choose the region
export AWS_DEFAULT_REGION='ap-northeast-1'
0.2. Check variables
Confirm that the profile is the one you expect.
aws configure list
Result (example):
      Name                    Value             Type    Location
      ----                    -----             ----    --------
   profile    lambdaFull-prjz-mbp13              env    AWS_DEFAULT_PROFILE
access_key     ****************XXXX shared-credentials-file
secret_key     ****************XXXX shared-credentials-file
    region           ap-northeast-1              env    AWS_DEFAULT_REGION
1. Preliminary work
===================
1.1. Get the IAM role ARN
IAM_ROLE_ARN=$( \
aws iam get-role \
--role-name ${IAM_ROLE_NAME} \
--query 'Role.Arn' \
--output text \
) \
&& echo ${IAM_ROLE_ARN}
Result (example):
arn:aws:iam::XXXXXXXXXXXX:role/lambdaBasicExecution
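1.2. Set the Datadog API keys
Open https://app.datadoghq.com/account/settings#api and copy the Key value under API Keys into a variable.
DD_API_KEY='<Key value from API Keys>'
Under New application key, enter 'lambda' and click the Create Application Key button.
Copy the Hash value into a variable.
DD_APP_KEY='<Hash value of the Application Key>'
1.3. Choose the Lambda function name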
LAMBDA_FUNC_NAME="datadog_process_rds_metrics-$( date '+%Y%m%d' )" \
&& echo ${LAMBDA_FUNC_NAME}
Confirm that no Lambda function with the same name exists.
aws lambda get-function \
--function-name ${LAMBDA_FUNC_NAME}
Result (example):
A client error (ResourceNotFoundException) occurred when calling the GetFunction operation: Function not found: arn:aws:lambda:ap-northeast-1:XXXXXXXXXXXX:function:datadog_process_rds_metrics-20161219
1.4. Lambda function
FILE_LAMBDA_FUNC="${LAMBDA_FUNC_NAME}.py"
PY_FUNC_NAME='lambda_handler'
cat << ETX
FILE_LAMBDA_FUNC: ${FILE_LAMBDA_FUNC}
PY_FUNC_NAME: ${PY_FUNC_NAME}
DD_API_KEY: ${DD_API_KEY}
DD_APP_KEY: ${DD_APP_KEY}
ETX
cat << EOF > ${FILE_LAMBDA_FUNC}
from __future__ import print_function

import os
import gzip
import json
import re
import time
import urllib
import urllib2
from base64 import b64decode
from StringIO import StringIO

import boto3

# KMS is not used in this walkthrough; the Datadog keys are embedded directly.
#KMS_ENCRYPTED_KEYS = os.environ['kmsEncryptedKeys']
#kms = boto3.client('kms')
datadog_keys = json.loads('{"api_key":"${DD_API_KEY}", "app_key":"${DD_APP_KEY}"}')

print('INFO Lambda function initialized, ready to send metrics')


def _process_rds_enhanced_monitoring_message(ts, message, account, region):
    instance_id = message['instanceID']
    host_id = message['instanceResourceID']
    tags = [
        'dbinstanceidentifier:%s' % instance_id,
        'aws_account:%s' % account,
        'engine:%s' % message["engine"],
    ]

    # metrics generation
    uptime = 0
    uptime_msg = re.split(' days?, ', message['uptime'])
    if len(uptime_msg) == 2:
        uptime += 24 * 3600 * int(uptime_msg[0])
    uptime_day = uptime_msg[-1].split(':')
    uptime += 3600 * int(uptime_day[0])
    uptime += 60 * int(uptime_day[1])
    uptime += int(uptime_day[2])
    stats.gauge('aws.rds.uptime', uptime, timestamp=ts, tags=tags, host=host_id)

    stats.gauge('aws.rds.virtual_cpus', message['numVCPUs'], timestamp=ts, tags=tags, host=host_id)

    stats.gauge('aws.rds.load.1', message['loadAverageMinute']['one'], timestamp=ts, tags=tags, host=host_id)
    stats.gauge('aws.rds.load.5', message['loadAverageMinute']['five'], timestamp=ts, tags=tags, host=host_id)
    stats.gauge('aws.rds.load.15', message['loadAverageMinute']['fifteen'], timestamp=ts, tags=tags, host=host_id)

    for namespace in ['cpuUtilization', 'memory', 'tasks', 'swap']:
        for key, value in message[namespace].iteritems():
            stats.gauge('aws.rds.%s.%s' % (namespace.lower(), key), value, timestamp=ts, tags=tags, host=host_id)

    for network_stats in message['network']:
        network_tag = ['interface:%s' % network_stats.pop('interface')]
        for key, value in network_stats.iteritems():
            stats.gauge('aws.rds.network.%s' % key, value, timestamp=ts, tags=tags + network_tag, host=host_id)

    disk_stats = message['diskIO'][0]  # we never expect to have more than one disk
    for key, value in disk_stats.iteritems():
        stats.gauge('aws.rds.diskio.%s' % key, value, timestamp=ts, tags=tags, host=host_id)

    for fs_stats in message['fileSys']:
        fs_tag = [
            'name:%s' % fs_stats.pop('name'),
            'mountPoint:%s' % fs_stats.pop('mountPoint')
        ]
        for key, value in fs_stats.iteritems():
            stats.gauge('aws.rds.filesystem.%s' % key, value, timestamp=ts, tags=tags + fs_tag, host=host_id)

    for process_stats in message['processList']:
        process_tag = [
            'name:%s' % process_stats.pop('name'),
            'id:%s' % process_stats.pop('id')
        ]
        for key, value in process_stats.iteritems():
            stats.gauge('aws.rds.process.%s' % key, value, timestamp=ts, tags=tags + process_tag, host=host_id)


def ${PY_FUNC_NAME}(event, context):
    ''' Process a RDS enhanced monitoring DATA_MESSAGE,
        coming from CLOUDWATCH LOGS
    '''
    # event is a dict containing a base64 string gzipped
    event = json.loads(gzip.GzipFile(fileobj=StringIO(event['awslogs']['data'].decode('base64'))).read())

    account = event['owner']
    region = context.invoked_function_arn.split(':', 4)[3]

    log_events = event['logEvents']

    for log_event in log_events:
        message = json.loads(log_event['message'])
        ts = log_event['timestamp'] / 1000
        _process_rds_enhanced_monitoring_message(ts, message, account, region)

    stats.flush()
    return {'Status': 'OK'}


# Helpers to send data to Datadog, inspired from https://github.com/DataDog/datadogpy

class Stats(object):

    def __init__(self):
        self.series = []

    def gauge(self, metric, value, timestamp=None, tags=None, host=None):
        base_dict = {
            'metric': metric,
            'points': [(int(timestamp or time.time()), value)],
            'type': 'gauge',
            'tags': tags,
        }
        if host:
            base_dict.update({'host': host})
        self.series.append(base_dict)

    def flush(self):
        metrics_dict = {
            'series': self.series,
        }
        self.series = []

        creds = urllib.urlencode(datadog_keys)
        data = json.dumps(metrics_dict)
        url = '%s?%s' % (datadog_keys.get('api_host', 'https://app.datadoghq.com/api/v1/series'), creds)
        req = urllib2.Request(url, data, {'Content-Type': 'application/json'})
        response = urllib2.urlopen(req)
        print('INFO Submitted data with status {}'.format(response.getcode()))

stats = Stats()
EOF
cat ${FILE_LAMBDA_FUNC}
zip ${LAMBDA_FUNC_NAME}.zip ${FILE_LAMBDA_FUNC}
Result (example):
adding: datadog_process_rds_metrics-20161219.py (deflated 43%)
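The function packaged above converts the 'uptime' string of an Enhanced Monitoring message into seconds before sending it as aws.rds.uptime. The following is a minimal sketch of that conversion only, written for Python 3 so it can be tried locally; the helper name uptime_to_seconds is hypothetical and does not appear in the blueprint.

```python
import re


def uptime_to_seconds(uptime):
    """Convert an uptime string such as '2 days, 0:40:25' to seconds.

    Hypothetical helper that mirrors the parsing done inside
    _process_rds_enhanced_monitoring_message.
    """
    seconds = 0
    # The "N day(s), " prefix is only present once the instance
    # has been up for at least one day.
    parts = re.split(r' days?, ', uptime)
    if len(parts) == 2:
        seconds += 24 * 3600 * int(parts[0])
    hours, minutes, secs = parts[-1].split(':')
    seconds += 3600 * int(hours) + 60 * int(minutes) + int(secs)
    return seconds


# The uptime value used in the sample data of this article.
print(uptime_to_seconds('2 days, 0:40:25'))  # prints 175225
```

A string without the day prefix, such as '0:40:25', goes through the same code path and only the hours, minutes and seconds are added.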
2. Create the Lambda function
=============================
2.1. Create the Lambda function
LAMBDA_FUNC_DESC='Pushes RDS Enhanced metrics to Datadog.'
LAMBDA_RUNTIME='python2.7'
LAMBDA_HANDLER="${LAMBDA_FUNC_NAME}.${PY_FUNC_NAME}"
FILE_LAMBDA_ZIP="${LAMBDA_FUNC_NAME}.zip"
cat << ETX
LAMBDA_FUNC_NAME: ${LAMBDA_FUNC_NAME}
LAMBDA_FUNC_DESC: "${LAMBDA_FUNC_DESC}"
LAMBDA_RUNTIME: ${LAMBDA_RUNTIME}
FILE_LAMBDA_ZIP: ${FILE_LAMBDA_ZIP}
IAM_ROLE_ARN: ${IAM_ROLE_ARN}
LAMBDA_HANDLER: ${LAMBDA_HANDLER}
ETX
aws lambda create-function \
--function-name ${LAMBDA_FUNC_NAME} \
--description "${LAMBDA_FUNC_DESC}" \
--zip-file fileb://${FILE_LAMBDA_ZIP} \
--runtime ${LAMBDA_RUNTIME} \
--role ${IAM_ROLE_ARN} \
--handler ${LAMBDA_HANDLER}
Result (example):
{
"CodeSha256": "lKbgNPMuV0D2blwwCSWwKLwlTrzoPAsFAdB6/FxJ+Q4=",
"FunctionName": "datadog_process_rds_metrics-20161219",
"CodeSize": 1962,
"MemorySize": 128,
"FunctionArn": "arn:aws:lambda:ap-northeast-1:XXXXXXXXXXXX:function:datadog_process_rds_metrics-20161219",
"Version": "$LATEST",
"Role": "arn:aws:iam::XXXXXXXXXXXX:role/lambdaBasicExecution",
"Timeout": 3,
"LastModified": "2016-12-18T01:23:45.678+0000",
"Handler": "datadog_process_rds_metrics-20161219.lambda_handler",
"Runtime": "python2.7",
"Description": "Pushes RDS Enhanced metrics to Datadog."
}
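The handler reads the region out of context.invoked_function_arn by splitting the ARN on ':'. A minimal Python 3 sketch of that step, using a FunctionArn in the same format as the result above; the helper name region_from_arn is hypothetical.

```python
def region_from_arn(arn):
    """Return the region field of an ARN.

    ARN layout: arn:partition:service:region:account-id:resource
    Splitting at most 4 times keeps any ':' inside the resource part intact.
    """
    return arn.split(':', 4)[3]


arn = ('arn:aws:lambda:ap-northeast-1:XXXXXXXXXXXX:'
       'function:datadog_process_rds_metrics-20161219')
print(region_from_arn(arn))  # prints ap-northeast-1
```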
aws lambda get-function \
--function-name ${LAMBDA_FUNC_NAME}
Result (example):
{
"Code": {
"RepositoryType": "S3",
"Location": "https://awslambda-ap-ne-1-tasks.s3-ap-northeast-1.amazonaws.com/snapshots/XXXXXXXXXXXX/HelloWorld-2979ba79-b08f-495d-9ee6-46397c95ba13?x-amz-security-token=AQoDYXdzEDoa8AMR6t8h66eOXhN3%2Fx7XpuRxvf7pVn7IuWV4cEmwx0CtZT6yxCJ1%2BWmigYXqGoyQHuBYOWnxbhmwEcTg839qMuhSu1fk0fXpXf0oJOLkhKMudNqhdElyFQpzyT6Q8GDfhAsfbX9wvwCDTty4imxz7MczF%2FQl6tgvTYdip08ap5fAyrknZGV1%2B1Ggnp5w6JOjydYxuUsWwhoxoEWzi7SoVTmpRQQA91c4VW9lNotOAHACFxo6klzDPM8mxR9RJl66WxFugL0wQJyLUpmtjS9XoArD86sEWWiIccMpV2BQipTPQlzL%2F1Hoy%2BDF6QUxyPUihlDjPBoJTISTP8W1wxmzW%2BLbilAfFQRPY7CFjzR0k%2FA%2FIX5x9iyz52Pu1Q0ASTw1l%2Fq%2Fo3pRbvzWR79QS%2BpxXrwbYzoQHKiK62DSTsQo5tqKPsiDCYzrPxbq8lm7pNBPG%2FsxjePRWBVJeRl08WxEjSjoRRwBOPX5mz1BCUoUBPGG5tEENp87A%2FCdDgibFWM5DdYhwtaYPY7FTmi8DvqjQHL9jOmP8YuVteBTBcv8nFW6UbErPjwwn79FKG1u5M9HoTWUqUMBByz6D4tTRSEw6iJU7XdCujFnhnHe5V8imZ1KGI7fDWpciJhrhml0wnKPCK%2Fe9lK1P2kO7ldSWc7zn5hcIOD2tbEF&AWSAccessKeyId=ASIAJFVALOKV5SJVYPPA&Expires=1445825978&Signature=bvwu1Ny34LgTmZeOO3q4sn7x3Fg%3D"
},
"Configuration": {
"Version": "$LATEST",
"CodeSha256": "lKbgNPMuV0D2blwwCSWwKLwlTrzoPAsFAdB6/FxJ+Q4=",
"FunctionName": "datadog_process_rds_metrics-20161219",
"MemorySize": 128,
"CodeSize": 350,
"FunctionArn": "arn:aws:lambda:ap-northeast-1:XXXXXXXXXXXX:function:datadog_process_rds_metrics-20161219",
"Handler": "datadog_process_rds_metrics-20161219.lambda_handler",
"Role": "arn:aws:iam::XXXXXXXXXXXX:role/lambdaBasicExecution",
"Timeout": 3,
"LastModified": "2016-12-18T01:23:45.678+0000",
"Runtime": "python2.7",
"Description": "Pushes RDS Enhanced metrics to Datadog."
}
}
2.2. Update the Lambda function
The default 3-second timeout is likely to be exceeded, so change it to 30 seconds here.
LAMBDA_TIMEOUT='30'
cat << ETX
LAMBDA_FUNC_NAME: ${LAMBDA_FUNC_NAME}
LAMBDA_TIMEOUT: ${LAMBDA_TIMEOUT}
ETX
aws lambda update-function-configuration \
--function-name ${LAMBDA_FUNC_NAME} \
--timeout "${LAMBDA_TIMEOUT}"
Result (example):
{
"CodeSha256": "lKbgNPMuV0D2blwwCSWwKLwlTrzoPAsFAdB6/FxJ+Q4=",
"FunctionName": "datadog_process_rds_metrics-20161219",
"VpcConfig": {
"SubnetIds": [],
"SecurityGroupIds": []
},
"CodeSize": 350,
"MemorySize": 128,
"FunctionArn": "arn:aws:lambda:ap-northeast-1:XXXXXXXXXXXX:function:datadog_process_rds_metrics-20161219",
"Version": "$LATEST",
"Role": "arn:aws:iam::XXXXXXXXXXXX:role/lambdaBasicExecution",
"Timeout": 30,
"LastModified": "2016-12-18T01:23:45.678+0000",
"Handler": "datadog_process_rds_metrics-20161219.lambda_handler",
"Runtime": "python2.7",
"Description": "Pushes RDS Enhanced metrics to Datadog."
}
3. Verify the Lambda function
=============================
3.1. Create sample data
FILE_INPUT="${LAMBDA_FUNC_NAME}-log-data.json" \
&& echo ${FILE_INPUT}
cat << EOF > ${FILE_INPUT}
{
"messageType":"DATA_MESSAGE",
"owner":"123456789123",
"logGroup":"testLogGroup",
"logStream":"testLogStream",
"subscriptionFilters":[
"testFilter"
],
"logEvents":[
{
"id":"eventId1",
"timestamp":1440442987000,
"message": "{\"engine\":\"Postgres\",\"instanceID\":\"postgresql-redmine-20161211\",\"instanceResourceID\":\"db-7ZOMGTEKHCZNLIFRXB3TOTR2XQ\",\"timestamp\":\"2016-12-13T06:11:44Z\",\"version\":1.00,\"uptime\":\"2 days, 0:40:25\",\"numVCPUs\":1,\"cpuUtilization\":{\"guest\":0.00,\"irq\":0.00,\"system\":0.27,\"wait\":0.20,\"idle\":98.80,\"user\":0.67,\"total\":1.21,\"steal\":0.00,\"nice\":0.07},\"loadAverageMinute\":{\"fifteen\":0.05,\"five\":0.01,\"one\":0.00},\"memory\":{\"writeback\":12,\"hugePagesFree\":0,\"hugePagesRsvd\":0,\"hugePagesSurp\":0,\"cached\":591812,\"hugePagesSize\":2048,\"free\":103168,\"hugePagesTotal\":0,\"inactive\":388232,\"pageTables\":4740,\"dirty\":164,\"mapped\":33312,\"active\":428844,\"total\":1020188,\"slab\":44440,\"buffers\":56164},\"tasks\":{\"sleeping\":146,\"zombie\":0,\"running\":4,\"stopped\":0,\"total\":150,\"blocked\":0},\"swap\":{\"cached\":0,\"total\":4095996,\"free\":4095928},\"network\":[{\"interface\":\"eth0\",\"rx\":451.53,\"tx\":3785.40}],\"diskIO\":[{\"writeKbPS\":16.80,\"readIOsPS\":0.00,\"await\":3.87,\"readKbPS\":0.00,\"rrqmPS\":0.00,\"util\":0.08,\"avgQueueLen\":0.24,\"tps\":4.20,\"readKb\":0,\"device\":\"rdsdev\",\"writeKb\":252,\"avgReqSz\":4.00,\"wrqmPS\":0.00,\"writeIOsPS\":4.20}],\"fileSys\":[{\"used\":625804,\"name\":\"rdsfilesys\",\"usedFiles\":1910,\"usedFilePercent\":0.58,\"maxFiles\":327040,\"mountPoint\":\"/rdsdbdata\",\"total\":5017092,\"usedPercent\":12.47}],\"processList\":[{\"vss\":407876,\"name\":\"postgres: pgadmin redmine 172.18.16.8(35898) idle\",\"tgid\":3097,\"parentID\":3320,\"memoryUsedPc\":1.44,\"cpuUsedPc\":0.00,\"id\":3097,\"rss\":14740},{\"vss\":68748,\"name\":\"postgres: logger process \",\"tgid\":3321,\"parentID\":3320,\"memoryUsedPc\":0.16,\"cpuUsedPc\":0.00,\"id\":3321,\"rss\":1660},{\"vss\":289936,\"name\":\"postgres: checkpointer process \",\"tgid\":3323,\"parentID\":3320,\"memoryUsedPc\":1.43,\"cpuUsedPc\":0.00,\"id\":3323,\"rss\":14636},{\"vss\":289936,\"name\":\"postgres: writer process \",\"tgid\":3324,\"parentID\":3320,\"memoryUsedPc\":0.51,\"cpuUsedPc\":0.00,\"id\":3324,\"rss\":5216},{\"vss\":289936,\"name\":\"postgres: wal writer process \",\"tgid\":3325,\"parentID\":3320,\"memoryUsedPc\":0.79,\"cpuUsedPc\":0.00,\"id\":3325,\"rss\":8100},{\"vss\":289936,\"name\":\"postgres: autovacuum launcher process \",\"tgid\":3326,\"parentID\":3320,\"memoryUsedPc\":0.27,\"cpuUsedPc\":0.00,\"id\":3326,\"rss\":2784},{\"vss\":68744,\"name\":\"postgres: archiver process last was 00000001000000020000002B\",\"tgid\":3327,\"parentID\":3320,\"memoryUsedPc\":0.16,\"cpuUsedPc\":0.00,\"id\":3327,\"rss\":1672},{\"vss\":68744,\"name\":\"postgres: stats collector process \",\"tgid\":3328,\"parentID\":3320,\"memoryUsedPc\":0.19,\"cpuUsedPc\":0.00,\"id\":3328,\"rss\":1968},{\"vss\":399712,\"name\":\"postgres: pgadmin redmine 172.18.16.8(36634) idle\",\"tgid\":6552,\"parentID\":3320,\"memoryUsedPc\":0.89,\"cpuUsedPc\":0.00,\"id\":6552,\"rss\":9128},{\"vss\":393516,\"name\":\"postgres: rdsadmin rdsadmin localhost(63217) idle\",\"tgid\":27304,\"parentID\":3320,\"memoryUsedPc\":0.77,\"cpuUsedPc\":0.00,\"id\":27304,\"rss\":7832},{\"vss\":289936,\"name\":\"postgres\",\"tgid\":3320,\"parentID\":1,\"memoryUsedPc\":1.78,\"cpuUsedPc\":0.00,\"id\":3320,\"rss\":18140},{\"vss\":657332,\"name\":\"OS processes\",\"tgid\":0,\"parentID\":0,\"memoryUsedPc\":2.22,\"cpuUsedPc\":0.00,\"id\":0,\"rss\":22472},{\"vss\":887200,\"name\":\"RDS processes\",\"tgid\":0,\"parentID\":0,\"memoryUsedPc\":15.71,\"cpuUsedPc\":0.07,\"id\":0,\"rss\":160176}]}"
}
]
}
EOF
cat ${FILE_INPUT}
After creating the JSON file, always check that its format is intact.
jsonlint -q ${FILE_INPUT}
If no error is printed, the file is OK.
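To see which metric names the function derives from a message like the sample, here is a minimal Python 3 sketch of the naming rule used for the 'cpuUtilization', 'memory', 'tasks' and 'swap' sections: the section name is lowercased and joined with each key under the aws.rds prefix. The helper name metric_names and the trimmed sample dictionary are assumptions made for illustration; the values are taken from the sample message.

```python
import json


def metric_names(message):
    """List the gauge names built from the four flat sections of a message.

    Hypothetical helper: the real function sends each value to Datadog
    instead of collecting the names.
    """
    names = []
    for namespace in ['cpuUtilization', 'memory', 'tasks', 'swap']:
        # Keys are sorted only to make the output order predictable.
        for key in sorted(message[namespace]):
            names.append('aws.rds.%s.%s' % (namespace.lower(), key))
    return names


# Trimmed subset of the sample Enhanced Monitoring message.
sample = json.loads(
    '{"cpuUtilization": {"idle": 98.80, "user": 0.67},'
    ' "memory": {"free": 103168},'
    ' "tasks": {"running": 4},'
    ' "swap": {"free": 4095928}}'
)
for name in metric_names(sample):
    print(name)
```

Note that 'cpuUtilization' becomes 'cpuutilization' in the metric name, while the keys themselves keep their original case.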
gzip ${FILE_INPUT}
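# tr removes the line wraps that GNU base64 adds by default; they would break the JSON string below.
STR_DATA=$( cat ${FILE_INPUT}.gz | base64 | tr -d '\n' ) \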
&& echo ${STR_DATA}
FILE_INPUT="${LAMBDA_FUNC_NAME}-data.json" \
&& echo ${FILE_INPUT}
cat << EOF > ${FILE_INPUT}
{
"awslogs": {
"data": "${STR_DATA}"
}
}
EOF
cat ${FILE_INPUT}
After creating the JSON file, always check that its format is intact.
jsonlint -q ${FILE_INPUT}
If no error is printed, the file is OK.
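The test event built above has the same shape as the events CloudWatch Logs delivers: the log data is gzip-compressed, base64-encoded and placed under awslogs.data. The following Python 3 sketch shows that encoding and the matching decoding step the handler performs. The function names encode_awslogs and decode_awslogs are hypothetical, and the log data is a shortened stand-in for the sample file.

```python
import base64
import gzip
import json


def encode_awslogs(log_data):
    """Wrap log data the way the test event file does: JSON, gzip, base64."""
    raw = json.dumps(log_data).encode('utf-8')
    data = base64.b64encode(gzip.compress(raw)).decode('ascii')
    return {'awslogs': {'data': data}}


def decode_awslogs(event):
    """Reverse the encoding, as the Lambda handler does on each invocation."""
    compressed = base64.b64decode(event['awslogs']['data'])
    return json.loads(gzip.decompress(compressed).decode('utf-8'))


# Shortened stand-in for the sample log data file.
log_data = {
    'messageType': 'DATA_MESSAGE',
    'owner': '123456789123',
    'logEvents': [
        {'id': 'eventId1', 'timestamp': 1440442987000, 'message': '{}'},
    ],
}
event = encode_awslogs(log_data)
print(decode_awslogs(event)['owner'])
```

Because base64.b64encode never inserts line breaks, the resulting string can be embedded in a JSON document as is.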
3.2. Invoke the Lambda function manually
FILE_OUTPUT_LAMBDA="${LAMBDA_FUNC_NAME}-out.txt"
FILE_LOG_LAMBDA="${LAMBDA_FUNC_NAME}-$(date +%Y%m%d%H%M%S).log"
cat << ETX
LAMBDA_FUNC_NAME: ${LAMBDA_FUNC_NAME}
FILE_INPUT: ${FILE_INPUT}
FILE_OUTPUT_LAMBDA: ${FILE_OUTPUT_LAMBDA}
FILE_LOG_LAMBDA: ${FILE_LOG_LAMBDA}
ETX
aws lambda invoke \
--function-name ${LAMBDA_FUNC_NAME} \
--log-type Tail \
--payload file://${FILE_INPUT} \
${FILE_OUTPUT_LAMBDA} \
> ${FILE_LOG_LAMBDA}
cat ${FILE_LOG_LAMBDA} \
| jp.py 'StatusCode'
Result (example):
200
3.3. Check the Lambda function result
cat ${FILE_OUTPUT_LAMBDA}
Result (example):
{"Status": "OK"}
3.4. Check the Lambda function log
cat ${FILE_LOG_LAMBDA} \
| jp.py 'LogResult' \
| sed 's/"//g' \
| base64 --decode
Result (example):
START RequestId: 4620fd3f-c0fb-11e6-be7f-5d539d6c06cd Version: $LATEST
INFO Submitted data with status 202
END RequestId: 4620fd3f-c0fb-11e6-be7f-5d539d6c06cd
REPORT RequestId: 4620fd3f-c0fb-11e6-be7f-5d539d6c06cd Duration: 1019.04 ms Billed Duration: 1100 ms Memory Size: 128 MB Max Memory Used: 31 MB
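The log check in 3.4 works because the invoke response carries the tail of the execution log as a base64 string in LogResult. As a rough Python 3 equivalent of the jp.py, sed and base64 pipeline, the sketch below decodes that field. The helper name decode_log_result is hypothetical, and sample_response is a hand-built stand-in for the contents of ${FILE_LOG_LAMBDA}, not real output.

```python
import base64
import json


def decode_log_result(invoke_response_json):
    """Return the decoded log tail from an 'aws lambda invoke' JSON response."""
    response = json.loads(invoke_response_json)
    return base64.b64decode(response['LogResult']).decode('utf-8')


# Hand-built stand-in for the saved invoke response (assumption).
sample_log = 'INFO Submitted data with status 202\n'
sample_response = json.dumps({
    'StatusCode': 200,
    'LogResult': base64.b64encode(sample_log.encode('utf-8')).decode('ascii'),
})
print(decode_log_result(sample_response), end='')
```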