A Summary of CloudTrail and Related AWS Services

Posted at 2025-02-22

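1. Designing and Building Operations and Monitoring

Operations monitoring on AWS uses CloudTrail, CloudWatch, and Trusted Advisor to handle resource monitoring, log collection, security, and cost management.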

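1.1 CloudTrail – Audit Logs of API Calls

AWS CloudTrail records API activity in an AWS account and is used for security auditing and compliance. By default a trail logs management events; data events such as S3 object-level operations have to be enabled separately.

Example – Enabling CloudTrail and storing logs in S3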

import boto3

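# Create the CloudTrail client
client = boto3.client('cloudtrail')

# Create the trail. The S3 bucket must already exist and carry a bucket
# policy that allows CloudTrail to write to it.
response = client.create_trail(
    Name='MyTrail',
    S3BucketName='my-cloudtrail-logs-bucket',
    IncludeGlobalServiceEvents=True,
    IsMultiRegionTrail=True
)

# Start logging (a trail created through the API does not log until started)
client.start_logging(Name='MyTrail')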
print("CloudTrail has been enabled and started logging.")
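
The trail above can only deliver logs if the destination bucket's policy lets the CloudTrail service write to it. A minimal sketch of such a policy follows, built as a plain dictionary. The helper name `build_cloudtrail_bucket_policy` and the account ID are illustrative assumptions, not part of the original example, and a production policy would typically also narrow access with an `aws:SourceArn` condition.

```python
import json


def build_cloudtrail_bucket_policy(bucket: str, account_id: str) -> dict:
    """Return a bucket policy letting CloudTrail write logs for one account.

    Hypothetical helper for illustration only.
    """
    service = {"Service": "cloudtrail.amazonaws.com"}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # CloudTrail checks the bucket ACL before delivering logs
                "Sid": "AWSCloudTrailAclCheck",
                "Effect": "Allow",
                "Principal": service,
                "Action": "s3:GetBucketAcl",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Log files are written under AWSLogs/<account-id>/
                "Sid": "AWSCloudTrailWrite",
                "Effect": "Allow",
                "Principal": service,
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket}/AWSLogs/{account_id}/*",
                "Condition": {
                    "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
                },
            },
        ],
    }


policy = build_cloudtrail_bucket_policy("my-cloudtrail-logs-bucket", "123456789012")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

The resulting JSON string could be passed as the `Policy` argument of the S3 client's `put_bucket_policy` call before `create_trail` is invoked.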

1.2 CloudWatch – Monitoring Metrics and Logs

Amazon CloudWatch is a service for managing metrics, logs, and alarms.

Example – Creating a CloudWatch alarm

import boto3

# Create the CloudWatch client
client = boto3.client('cloudwatch')

# Create an alarm (example: average CPU utilization above 80% for two consecutive 5-minute periods)
response = client.put_metric_alarm(
    AlarmName='HighCPUUtilization',
    MetricName='CPUUtilization',
    Namespace='AWS/EC2',
    Statistic='Average',
    Period=300,
    EvaluationPeriods=2,
    Threshold=80.0,
    ComparisonOperator='GreaterThanThreshold',
    Dimensions=[{'Name': 'InstanceId', 'Value': 'i-0abcd1234efgh5678'}],
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:MySNSTopic'],
    TreatMissingData='notBreaching'
)

print("CloudWatch Alarm for High CPU Utilization created.")
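
With `Period=300` and `EvaluationPeriods=2`, the alarm fires only after the 5-minute average stays above the threshold for two consecutive periods, that is, for 10 minutes. The following is a simplified model of that rule in plain Python, not CloudWatch's actual evaluation logic. The function name `evaluate_alarm` is hypothetical, and `None` stands for a missing datapoint, counted as not breaching to mirror `TreatMissingData='notBreaching'`.

```python
from typing import Optional, Sequence


def evaluate_alarm(
    datapoints: Sequence[Optional[float]],
    threshold: float = 80.0,
    evaluation_periods: int = 2,
) -> str:
    """Simplified model: ALARM when the most recent `evaluation_periods`
    datapoints are all strictly above `threshold`, otherwise OK.

    Each datapoint represents one period's average (for example 300 seconds).
    None means the datapoint is missing and is counted as not breaching.
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    recent = list(datapoints)[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "OK"  # not enough periods observed yet in this model
    breaching = all(value is not None and value > threshold for value in recent)
    return "ALARM" if breaching else "OK"


print(evaluate_alarm([75.0, 85.0]))        # only one breaching period
print(evaluate_alarm([75.0, 85.0, 91.5]))  # two consecutive breaching periods
print(evaluate_alarm([85.0, None]))        # missing datapoint counts as not breaching
```

A single spike above 80% therefore does not trigger the action; raising `EvaluationPeriods` trades faster notification for fewer false alarms.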

1.3 Trusted Advisor – Best-Practice Checks

AWS Trusted Advisor evaluates an AWS environment from the perspectives of cost optimization, security, fault tolerance, and more.

Example – Retrieving Trusted Advisor checks

import boto3

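# Create the AWS Support client, which exposes the Trusted Advisor API.
# The Support API requires a Business, Enterprise On-Ramp, or Enterprise support plan.
client = boto3.client('support', region_name='us-east-1')

# Retrieve the list of Trusted Advisor checks
response = client.describe_trusted_advisor_checks(language='en')

# As an example, select the security-related checks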
security_checks = [check for check in response['checks'] if check['category'] == 'security']

for check in security_checks:
    print(f"Check Name: {check['name']} - ID: {check['id']}")
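
The same response can also be summarized across all categories rather than filtered to one. A short sketch follows, assuming only the `category`, `name`, and `id` keys that the example above already reads. The sample data and the helper name `group_checks_by_category` are made up for illustration.

```python
from collections import defaultdict


def group_checks_by_category(checks):
    """Group Trusted Advisor check names by their category field."""
    grouped = defaultdict(list)
    for check in checks:
        grouped[check["category"]].append(check["name"])
    # Return a plain dict with names sorted for stable output
    return {category: sorted(names) for category, names in grouped.items()}


# Hypothetical sample shaped like response['checks']; IDs and names are invented
sample_checks = [
    {"id": "check-1", "name": "Example security check A", "category": "security"},
    {"id": "check-2", "name": "Example cost check", "category": "cost_optimizing"},
    {"id": "check-3", "name": "Example security check B", "category": "security"},
]

for category, names in group_checks_by_category(sample_checks).items():
    print(f"{category}: {len(names)} check(s)")
```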

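2. Designing and Building a Data Platform

A data platform built on AWS Glue, Redshift, and Athena covers data collection, ETL processing, and analysis.

2.1 AWS Glue – Managing ETL Jobs

AWS Glue is a service that automates the extract, transform, and load (ETL) of data.

Example – Creating a Glue job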

import boto3

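# Create the Glue client
client = boto3.client('glue')

# Create the Glue job.
# GlueVersion is pinned explicitly so the runtime does not depend on the service default.
# For Glue 2.0 and later, capacity is given as WorkerType/NumberOfWorkers rather than MaxCapacity.
response = client.create_job(
    Name='my-glue-job',
    Role='AWSGlueServiceRole',
    GlueVersion='4.0',
    Command={
        'Name': 'glueetl',
        'ScriptLocation': 's3://my-bucket/scripts/glue_script.py',
        'PythonVersion': '3'
    },
    WorkerType='G.1X',
    NumberOfWorkers=2,
    Timeout=30,  # minutes
)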

print(f"Glue Job Created: {response['Name']}")

Example Glue ETL script (glue_script.py):

import sys
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session

# Initialize the job so that job.commit() can mark the run as complete
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# Read CSV data (with a header row) from S3
datasource = glueContext.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://my-bucket/input/"]},
    format="csv",
    format_options={"withHeader": True}
)

# Transform: keep id and name as strings, cast age from string to int
transformed_data = ApplyMapping.apply(
    frame=datasource,
    mappings=[("id", "string", "id", "string"),
              ("name", "string", "name", "string"),
              ("age", "string", "age", "int")]
)

# Write the transformed data to S3 as Parquet
glueContext.write_dynamic_frame.from_options(
    frame=transformed_data,
    connection_type="s3",
    connection_options={"path": "s3://my-bucket/output/"},
    format="parquet"
)

job.commit()
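
Each `ApplyMapping` tuple reads as (source field, source type, target field, target type). The plain-Python sketch below illustrates that idea on a single record so it can be tried without a Glue environment. It is not Glue's implementation: the `apply_mapping` helper and the `CASTS` table are hypothetical, and the sketch makes no claim about how Glue itself handles values that fail to cast.

```python
# Casting functions for the target types used in this sketch (assumed subset)
CASTS = {"string": str, "int": int}

MAPPINGS = [
    ("id", "string", "id", "string"),
    ("name", "string", "name", "string"),
    ("age", "string", "age", "int"),
]


def apply_mapping(record: dict, mappings) -> dict:
    """Rename and cast fields of one record; unmapped fields are dropped."""
    result = {}
    for source, _source_type, target, target_type in mappings:
        if source in record:
            result[target] = CASTS[target_type](record[source])
    return result


row = {"id": "1", "name": "Alice", "age": "30", "unused": "x"}
print(apply_mapping(row, MAPPINGS))
```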

2.2 Amazon Redshift – Data Warehouse

Amazon Redshift is a fast data warehouse service that can process petabyte-scale data.

Example – Copying data from S3 into Redshift

import psycopg2

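# Redshift connection settings (placeholders; in practice, load credentials
# from a secrets store rather than hard-coding them)
conn = psycopg2.connect(
    dbname='mydb',
    user='myuser',
    password='mypassword',
    host='my-redshift-cluster-endpoint',
    port=5439
)

# COPY the Parquet files from S3 into Redshift
copy_query = """
COPY my_table
FROM 's3://my-bucket/output/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
FORMAT AS PARQUET;
"""

try:
    # The cursor context manager closes the cursor when the block exits
    with conn.cursor() as cursor:
        cursor.execute(copy_query)
    conn.commit()
    print("Data copied from S3 to Redshift successfully.")
finally:
    # Always release the connection, even if COPY fails
    conn.close()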

2.3 Amazon Athena – Querying Data in S3

Amazon Athena is a serverless service that queries data stored in S3 directly.

Example – Querying S3 data with Athena

import boto3

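# Create the Athena client
client = boto3.client('athena')

# Start the query. The call returns immediately; results are written to OutputLocation once the query finishes.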
response = client.start_query_execution(
    QueryString="SELECT * FROM my_database.my_table LIMIT 10;",
    QueryExecutionContext={'Database': 'my_database'},
    ResultConfiguration={'OutputLocation': 's3://my-bucket/athena-results/'}
)

query_execution_id = response['QueryExecutionId']
print(f"Query started with execution ID: {query_execution_id}")
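
`start_query_execution` only submits the query, so the caller has to poll for completion before reading results. A minimal polling sketch follows. `wait_for_query` and its parameters are hypothetical names, the loop relies on the Athena `get_query_execution` call, and the small `FakeAthena` class exists only so the loop can be exercised without AWS credentials.

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_query(client, query_execution_id, poll_seconds=1.0, max_attempts=60):
    """Poll until the query reaches a terminal state and return that state.

    `client` is any object exposing get_query_execution(QueryExecutionId=...),
    such as boto3.client('athena').
    """
    for _ in range(max_attempts):
        response = client.get_query_execution(QueryExecutionId=query_execution_id)
        state = response["QueryExecution"]["Status"]["State"]
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"Query {query_execution_id} did not finish in time")


class FakeAthena:
    """Stand-in client that returns a scripted sequence of states."""

    def __init__(self, states):
        self._states = iter(states)

    def get_query_execution(self, QueryExecutionId):
        return {"QueryExecution": {"Status": {"State": next(self._states)}}}


final_state = wait_for_query(
    FakeAthena(["QUEUED", "RUNNING", "SUCCEEDED"]), "example-id", poll_seconds=0
)
print(final_state)
```

With a real client, the same function would be called as `wait_for_query(client, query_execution_id)`, after which `get_query_results` can fetch the rows when the state is `SUCCEEDED`.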

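Summary

Service	Role	Code example
CloudTrail	Auditing API calls	Creating a trail, storing logs in S3
CloudWatch	Monitoring metrics and logs	Creating a CPU utilization alarm
Trusted Advisor	Best-practice evaluation	Listing security-related checks
Glue	ETL jobs	Transforming and processing data
Redshift	Data warehouse	Loading data from S3 with COPY
Athena	Serverless queries	Querying data in S3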

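Combining these services gives comprehensive coverage of both monitoring and the data platform. For example, you can build a system that loads Glue-processed data into Redshift, queries the same Parquet output in S3 with Athena, and monitors performance with CloudWatch.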
