
Starting and Stopping EC2 Instances with AWS Lambda

Posted at 2019-07-26

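If you run EC2 instances, where do you start and stop your staging environment?
What works best depends on your environment, but this seemed like a reasonable best practice to me, so I wrote it up.

I thought it would be convenient to start and stop EC2 instances on AWS (staging, or even production) with Lambda plus CloudWatch Events, so I built it.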

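Assumptions

This post assumes you want to start and stop the EC2 instances of a staging environment automatically.
It is aimed at people who start and stop instances at more or less fixed times.
Starting and stopping are implemented as separate Lambda functions (so we create two).

Benefits of starting and stopping the staging environment automatically

Saves money (or reduces effort)
No need to start instances from the AWS Console every day
No server-side configuration is required, which improves availability
Easier to manage (if you currently run this from cron or similar on an instance)

Are there any drawbacks?

If I had to name one: you become more dependent on AWS.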

Environment

AWS Console
Python 3.7

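Overall picture of the goal

Both the start Lambda and the stop Lambda basically follow the flow in the diagram above.

A CloudWatch Events rule triggers the start or stop at the scheduled time
Lambda runs and issues the start or stop command through boto3
Success or failure is logged to CloudWatch Logs
On failure, a notification is sent via SNS to the registered e-mail address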
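
The four steps above can be sketched in plain Python with the AWS calls replaced by injected callables. This is only an illustrative skeleton, not the script from this post: `run_scheduled_action`, `action`, `log`, and `notify` are hypothetical names chosen for the example.

```python
def run_scheduled_action(action, log, notify):
    """Run one start/stop action, log the outcome, and notify on failure.

    action: callable that performs the EC2 call (step 2)
    log:    callable that writes one line to the log (step 3)
    notify: callable that sends a failure notification (step 4)
    Returns True on success and False on failure.
    """
    log('[START] Starting Script')
    try:
        result = action()
        log('[INFO] Action succeeded: ' + str(result))
        ok = True
    except Exception as error:  # any failure is logged and reported
        log('[ERROR] ' + str(error))
        notify(str(error))
        ok = False
    log('[FINISH] Finished running script')
    return ok


if __name__ == '__main__':
    # Stand-ins for the boto3 calls, so the flow can be tried locally.
    def failing_action():
        raise RuntimeError('instance not found')

    run_scheduled_action(lambda: 'started', print, print)
    run_scheduled_action(failing_action, print, print)
```

In the real functions, step 1 (the schedule) lives outside the code, in the CloudWatch Events rule that invokes the handler.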

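Preparation

Before writing any Lambda code, create the IAM role and policy.
The CloudWatch settings and the SNS notification are configured later.

Creating the IAM role and IAM policy for Lambda

Create the following IAM policy and attach it to the IAM role you assign to the Lambda function.
Adjust the Resource and Action entries as needed for your use case.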

lambda-ec2-start-stop

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogStream",
                "logs:PutLogEvents",
                "logs:DescribeLogStreams"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "sns:Publish",
                "ec2:DescribeInstances",
                "ec2:StartInstances",
                "ec2:ModifyInstanceAttribute",
                "ec2:StopInstances",
                "ec2:DescribeInstanceStatus"
            ],
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor2",
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": "arn:aws:logs:*:*:*"
        }
    ]
}
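
As one way of adjusting Resource and Action, the sketch below builds a narrower variant of the policy that only allows starting and stopping a single instance and publishing to a single topic. The function name `build_policy` and all values in the usage section are placeholders, and `ec2:ModifyInstanceAttribute` is left out on the assumption that the scripts in this post never call it. The Describe actions keep `"Resource": "*"` because they do not support resource-level permissions.

```python
import json


def build_policy(region, account_id, instance_id, topic_arn):
    """Return an IAM policy document scoped to one instance and one topic."""
    instance_arn = 'arn:aws:ec2:{}:{}:instance/{}'.format(
        region, account_id, instance_id)
    return {
        'Version': '2012-10-17',
        'Statement': [
            {   # Start/stop only the target instance.
                'Effect': 'Allow',
                'Action': ['ec2:StartInstances', 'ec2:StopInstances'],
                'Resource': instance_arn,
            },
            {   # Describe* actions cannot be restricted to a resource.
                'Effect': 'Allow',
                'Action': ['ec2:DescribeInstances',
                           'ec2:DescribeInstanceStatus'],
                'Resource': '*',
            },
            {   # Publish only to the failure-notification topic.
                'Effect': 'Allow',
                'Action': 'sns:Publish',
                'Resource': topic_arn,
            },
            {   # Logging permissions, same as in the policy above.
                'Effect': 'Allow',
                'Action': ['logs:CreateLogGroup', 'logs:CreateLogStream',
                           'logs:PutLogEvents', 'logs:DescribeLogStreams'],
                'Resource': 'arn:aws:logs:*:*:*',
            },
        ],
    }


if __name__ == '__main__':
    # Placeholder values; replace them with your own.
    policy = build_policy(
        'ap-northeast-1', '123456789012', 'i-0123456789abcdef0',
        'arn:aws:sns:ap-northeast-1:123456789012:ec2-alert')
    print(json.dumps(policy, indent=4))
```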

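Configuring AWS SNS

Set the e-mail address that is notified when a start or stop fails.
Configure a topic and a subscription.
Reference
AWS SNS(Amazon Simple Notification Service)の通知設定をしてみる ("Trying out notification settings for AWS SNS")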
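
If you would rather script the topic and subscription than click through the console, something like the following should work. It is a minimal sketch: `setup_failure_notification` is a hypothetical helper, the topic name and address in the usage comment are placeholders, and the client is passed in so the function itself does not need AWS credentials to be imported.

```python
def setup_failure_notification(sns_client, topic_name, email):
    """Create (or reuse) an SNS topic and subscribe an e-mail address.

    sns_client is expected to be boto3.client('sns').
    Returns the topic ARN, which the Lambda later reads from TOPIC_ARN.
    """
    # create_topic is idempotent: if a topic with this name already
    # exists, its ARN is returned instead of an error.
    topic_arn = sns_client.create_topic(Name=topic_name)['TopicArn']
    # The address receives a confirmation mail and must accept it
    # before any notification is delivered.
    sns_client.subscribe(TopicArn=topic_arn, Protocol='email', Endpoint=email)
    return topic_arn


# Usage (requires AWS credentials; the names are placeholders):
#   import boto3
#   arn = setup_failure_notification(
#       boto3.client('sns'), 'ec2-start-stop-failure', 'ops@example.com')
```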

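Creating the Lambda function

Now we create the actual Lambda function and apply the required permissions and settings.

Create the function first

Go to the Lambda service in the AWS Console and click Create function; the following screen appears.
Leave Author from scratch selected and move on to the basic information.

Enter any function name and select Python 3.7 as the runtime.
Under the permissions settings, select the IAM role you created earlier.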

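Creating the CloudWatch Events rule

You can also create the rule from the Events section of the CloudWatch service, but here we create it from the Lambda screen.
Note: creating it from Lambda is easier to follow.

Click Add trigger
Select CloudWatch Events
Under Create a new rule, fill in the fields and click Add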
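
One of the fields in the new rule is the schedule expression, and CloudWatch Events evaluates cron expressions in UTC. The helper below is a small sketch for converting a local wall-clock time into a daily cron expression. `daily_cron_utc` is a hypothetical name, and the default offset of 9 hours assumes JST; change it for other time zones.

```python
def daily_cron_utc(hour, minute, utc_offset_hours=9):
    """Build a daily schedule expression from a local wall-clock time.

    utc_offset_hours=9 assumes JST (an assumption for this example).
    Only whole-hour offsets are handled.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError('hour must be 0-23 and minute 0-59')
    utc_hour = (hour - utc_offset_hours) % 24
    # Fields: minutes hours day-of-month month day-of-week year
    return 'cron({} {} * * ? *)'.format(minute, utc_hour)


if __name__ == '__main__':
    print(daily_cron_utc(9, 0))    # start at 09:00 local time
    print(daily_cron_utc(20, 30))  # stop at 20:30 local time
```

This only covers an every-day schedule. Restricting the rule to weekdays needs extra care, because the UTC day of the week can differ from the local one after the conversion.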

Once these are added, the Lambda function should look like the following.

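Creating the CloudWatch log group

Go to the CloudWatch service in the AWS Console and open the Logs screen.
Create a log group with any name, then create the log stream you want to write logs to.
Reference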
https://docs.aws.amazon.com/ja_jp/AmazonCloudWatch/latest/logs/Working-with-log-groups-and-streams.html
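
The same log group and log stream can also be created from code. The following is a rough sketch rather than part of the original scripts: `ensure_log_stream` is a hypothetical helper, and it assumes the caller has the `logs:CreateLogGroup` and `logs:CreateLogStream` permissions.

```python
def ensure_log_stream(logs_client, group_name, stream_name):
    """Create the log group and log stream if they do not exist yet.

    logs_client is expected to be boto3.client('logs').
    """
    already_exists = logs_client.exceptions.ResourceAlreadyExistsException
    try:
        logs_client.create_log_group(logGroupName=group_name)
    except already_exists:
        pass  # the group is already there
    try:
        logs_client.create_log_stream(
            logGroupName=group_name, logStreamName=stream_name)
    except already_exists:
        pass  # the stream is already there


# Usage (requires AWS credentials; the names are placeholders):
#   import boto3
#   ensure_log_stream(boto3.client('logs'), 'ec2-start-stop', 'staging')
```

The two names passed here are the values that later go into the CUSTOM_LOG_GROUP and CUSTOM_LOG_STREAM environment variables.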

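Writing the function code

Here is the code that starts and stops the EC2 instance.
It is also uploaded to git, so you can copy it from there as well.
Instance start script
Instance stop script

The code is also included below for reference.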

ec2_start.py

import json
import os
import time

import boto3

def lambda_handler(event, context):
    """
    lambda main
    """
    custom_print('[START] Starting Script')

    instance_id = os.environ['INSTANCE_ID']

    # Start the instance
    start_ec2_instances(instance_id)
    custom_print('[FINISH] Finished running script')

    return 0

def start_ec2_instances(instance_id):
    """
    Start all instances and wait until they are started.
    NOTE: the wait method can only wait for one instance at a time
    This script is not expected to start multiple instances at once
    therefore will not loop all instances to wait.
    """
    try:
        custom_print('[INFO] Starting Instance: ' + str(instance_id))
        region = os.environ['AWS_REGION']
        ec2_client = boto3.client('ec2', region_name=region)
        ec2_resource = boto3.resource('ec2').Instance(instance_id)

        # boto3 client parameters are CamelCase: InstanceIds, not instance_ids
        status_response = ec2_client.describe_instances(InstanceIds=[instance_id])

        if status_response['Reservations'][0]['Instances'][0]['State']['Name'] == "running":
            custom_print('[INFO] Instance is already running: ' + str(instance_id))
        else:
            custom_print('[INFO] Instance was not running so called to start: ' + str(instance_id))
            response = ec2_client.start_instances(InstanceIds=[instance_id])
            custom_print(response)
            ec2_resource.wait_until_running()
            custom_print('[INFO] Waiting for Instance to be ready: ' + str(instance_id))
            total = 0

            # Poll every 10 seconds until both status checks report "ok".
            while True:
                status_response = ec2_client.describe_instance_status(InstanceIds=[instance_id])
                status = status_response['InstanceStatuses'][0]
                if status['InstanceStatus']['Status'] == "ok" and status['SystemStatus']['Status'] == "ok":
                    break
                time.sleep(10)
                total += 10
            custom_print('[INFO] Successfully Started Instance: ' + str(instance_id) + ' wait time was roughly: ' + str(total) + 'seconds.')

    except Exception as error:
        custom_print('[ERROR] ' + str(error))
        call_sns(str(error))
        return error


def call_sns(msg):
    """
    Notify via e-mail if an exception was raised.
    """
    topic_arn = os.environ["TOPIC_ARN"]
    subject = os.environ["SUBJECT"]
    client = boto3.client("sns")
    request = {
        'TopicArn': topic_arn,
        'Message': msg,
        'Subject': subject
        }

    response = client.publish(**request)

def custom_print(msg):
    """
    AWS Lambda does not write logs in a continuous manner.
    If you want a continuous log, you need to create
    your own log stream and write to it.
    This function also checks whether the message is a plain string
    and otherwise prints it as JSON so it is easier to read.

    Parameters
    msg: str
    """
    # Strings are printed as-is; anything else (such as a boto3 response)
    # is serialized to JSON to make it easier to read.
    if isinstance(msg, str):
        print(msg)
    else:
        msgjson = json.dumps(msg, sort_keys=True, default=str)
        msg = '[RESPONSE]\n' + msgjson
        print('[RESPONSE] ' + msgjson)
    # Time since EPOCH
    time_stamp_milli = int(round(time.time() * 1000))

    # Initialize
    log_group_name = os.environ['CUSTOM_LOG_GROUP']
    log_stream_name = os.environ['CUSTOM_LOG_STREAM']
    region = os.environ['AWS_REGION']
    log_client = boto3.client('logs', region_name=region)

    # Look up the stream once; it carries a sequence token once it has entries
    log_stream = log_client.describe_log_streams(
        logGroupName=log_group_name,
        logStreamNamePrefix=log_stream_name)['logStreams'][0]

    put_args = {
        'logGroupName': log_group_name,
        'logStreamName': log_stream_name,
        'logEvents': [
            {
                'timestamp': time_stamp_milli,
                'message': msg
            }
        ]
    }
    # Pass the token only when the stream has one; a brand-new stream has none
    if 'uploadSequenceToken' in log_stream:
        put_args['sequenceToken'] = log_stream['uploadSequenceToken']
    log_client.put_log_events(**put_args)
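
As an alternative to the hand-written polling loop in start_ec2_instances, boto3 ships EC2 waiters that poll with a bounded number of attempts. The sketch below is one possible replacement, not what the original script does: `wait_for_status_ok` is a hypothetical helper, and the default delay and attempt count are arbitrary values picked for the example.

```python
def wait_for_status_ok(ec2_client, instance_id, delay=15, max_attempts=16):
    """Block until the instance and system status checks both pass.

    ec2_client is expected to be boto3.client('ec2').
    Each waiter polls at most max_attempts times, delay seconds apart,
    so keep the total comfortably under the Lambda timeout.
    """
    for waiter_name in ('instance_status_ok', 'system_status_ok'):
        waiter = ec2_client.get_waiter(waiter_name)
        # If the attempts run out, botocore raises WaiterError, which the
        # caller's "except Exception" block would log and report via SNS.
        waiter.wait(
            InstanceIds=[instance_id],
            WaiterConfig={'Delay': delay, 'MaxAttempts': max_attempts},
        )


# Usage inside start_ec2_instances, after start_instances():
#   wait_for_status_ok(ec2_client, instance_id)
```

Unlike the `while` loop, this cannot spin forever: a stuck status check ends in an exception instead of running until Lambda kills the function.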

ec2_stop.py

import json
import os
import time

import boto3

def lambda_handler(event, context):
    """
    lambda main
    """
    custom_print('[START] Starting Script')

    instance_id = os.environ['INSTANCE_ID']

    # Stop the instance
    stop_ec2_instances(instance_id)

    custom_print('[FINISH] Finished running script')

    return 0

def stop_ec2_instances(instance_id):
    """
    Stop all instances and wait until they are stopped.
    NOTE: the wait method can only wait for one instance at a time
    This script is not expected to stop multiple instances at once
    therefore will not loop all instances to wait.
    """
    try:
        region = os.environ['AWS_REGION']
        custom_print('[INFO] Stopping Instance: ' + str(instance_id))
        ec2_client = boto3.client('ec2', region_name=region)
        ec2_resource = boto3.resource('ec2').Instance(instance_id)
        # boto3 client parameters are CamelCase: InstanceIds, not instance_ids
        response = ec2_client.stop_instances(InstanceIds=[instance_id])
        custom_print(response)
        ec2_resource.wait_until_stopped()
        custom_print('[INFO] Successfully Called to Stop Instance: ' + str(instance_id))

    except Exception as error:
        custom_print('[ERROR] ' + str(error))
        call_sns(str(error))
        return 2

def call_sns(msg):
    """
    Notify via e-mail if an exception was raised.
    """
    topic_arn = os.environ["TOPIC_ARN"]
    subject = os.environ["SUBJECT"]
    client = boto3.client("sns")
    request = {
        'TopicArn': topic_arn,
        'Message': msg,
        'Subject': subject
        }

    response = client.publish(**request)

def custom_print(msg):
    """
    AWS Lambda does not write logs in a continuous manner.
    If you want a continuous log, you need to create
    your own log stream and write to it.
    This function also checks whether the message is a plain string
    and otherwise prints it as JSON so it is easier to read.

    Parameters
    msg: str
    """
    # Strings are printed as-is; anything else (such as a boto3 response)
    # is serialized to JSON to make it easier to read.
    if isinstance(msg, str):
        print(msg)
    else:
        msgjson = json.dumps(msg, sort_keys=True, default=str)
        msg = '[RESPONSE]\n' + msgjson
        print('[RESPONSE] ' + msgjson)
    # Time since EPOCH
    time_stamp_milli = int(round(time.time() * 1000))

    # Initialize
    log_group_name = os.environ['CUSTOM_LOG_GROUP']
    log_stream_name = os.environ['CUSTOM_LOG_STREAM']
    region = os.environ['AWS_REGION']
    log_client = boto3.client('logs', region_name=region)

    # Look up the stream once; it carries a sequence token once it has entries
    log_stream = log_client.describe_log_streams(
        logGroupName=log_group_name,
        logStreamNamePrefix=log_stream_name)['logStreams'][0]

    put_args = {
        'logGroupName': log_group_name,
        'logStreamName': log_stream_name,
        'logEvents': [
            {
                'timestamp': time_stamp_milli,
                'message': msg
            }
        ]
    }
    # Pass the token only when the stream has one; a brand-new stream has none
    if 'uploadSequenceToken' in log_stream:
        put_args['sequenceToken'] = log_stream['uploadSequenceToken']
    log_client.put_log_events(**put_args)

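Set the following environment variables on the Lambda function. AWS_REGION, which the code also reads, is set automatically by the Lambda runtime.

Environment variables

CUSTOM_LOG_GROUP: name of the CloudWatch Logs log group
CUSTOM_LOG_STREAM: name of the CloudWatch Logs log stream
INSTANCE_ID: ID of the EC2 instance
SUBJECT: subject line of the e-mail sent via SNS
TOPIC_ARN: ARN of the SNS topic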
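
Since a missing variable only shows up as a KeyError somewhere in the middle of a run, it can help to check all of them up front. The following is a small sketch of that idea. `REQUIRED_VARS` and `load_settings` are hypothetical names, and the scripts above read `os.environ` directly instead.

```python
import os

# The variables listed above, which the handler needs.
REQUIRED_VARS = ('CUSTOM_LOG_GROUP', 'CUSTOM_LOG_STREAM', 'INSTANCE_ID',
                 'SUBJECT', 'TOPIC_ARN')


def load_settings(environ=None):
    """Return the required settings as a dict, or raise if any is missing.

    environ defaults to os.environ; a plain dict can be passed for testing.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError(
            'Missing environment variables: ' + ', '.join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}


# Usage at the top of lambda_handler:
#   settings = load_settings()
#   instance_id = settings['INSTANCE_ID']
```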

Other settings

Starting and stopping an EC2 instance takes a while, so it is wise to extend the Lambda timeout.
Note: I set mine to about 4 minutes.
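
Even with a longer timeout, a polling loop that never succeeds will run until Lambda kills the function, and then no SNS notification is sent. One way around this is to stop polling while there is still time left to report the failure. The sketch below illustrates the idea. `poll_until` and its parameters are hypothetical, and the 15-second margin is an arbitrary choice for the example.

```python
import time


def poll_until(check, remaining_ms, interval=10, margin_ms=15000,
               sleep=time.sleep):
    """Call check() until it returns True or the time budget is nearly spent.

    remaining_ms: callable returning the milliseconds left, for example
                  context.get_remaining_time_in_millis in a Lambda handler.
    margin_ms:    time kept in reserve for logging and the SNS notification.
    Returns True if check() succeeded, False if the budget ran out first.
    """
    while True:
        if check():
            return True
        # Give up if another sleep would eat into the reserved margin.
        if remaining_ms() - interval * 1000 < margin_ms:
            return False
        sleep(interval)


# Usage inside a handler (is_instance_ready is a placeholder):
#   if not poll_until(is_instance_ready, context.get_remaining_time_in_millis):
#       call_sns('Instance did not become ready in time')
```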

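Summary

If you have questions or spot something wrong, please leave a comment.
I hope this reduces your workload and makes things a little easier.

The script itself is quick to write, but writing it up as an explanation takes time.
I plan to post other handy Lambda functions too, so feel free to tell me what you would like to see.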
