
AWS Developer Associate Exam Tips


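Overview

I started studying with almost no AWS experience, took the exam four weeks later, and passed. I took the exam in English, so the exam tips below are all in English.

Everyone understands English anyway, right?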


How I studied

Online course (https://www.udemy.com/aws-certified-developer-associate/)

Reading the documentation (https://docs.aws.amazon.com/)

Practice tests (https://www.udemy.com/aws-certified-developer-associate-practice-tests-dva-c01/)

Udemy has frequent sales, so I bought the practice tests and the course for 1,000 yen each.


Side note

The exam tips in this article are the notes I wrote down while studying.


Main text

Amazon EBS

https://docs.aws.amazon.com/en_us/AWSEC2/latest/UserGuide/AmazonEBS.html


  • On Demand - lets you pay a fixed rate by the hour (or by the second) with no commitment.

  • Reserved - provides a capacity reservation and offers a significant discount on the hourly charge for an instance. 1 or 3 year terms.

  • Spot - lets you bid whatever price you want for instance capacity, providing even greater savings if your application has flexible start and end times.

  • Dedicated Hosts - Physical EC2 server dedicated for your use. Dedicated Hosts can help reduce costs by allowing you to use your existing server-bound software licenses.


  • If a spot instance is terminated by Amazon EC2 you will not be charged for a partial hour of usage. However, if you terminate the instance yourself, you will be charged for the complete hour in which the instance ran.


  • 3 types of load balancers: Application, Network, Classic.


  • Application Load Balancers are best suited for load balancing of HTTP and HTTPS traffic. They operate at Layer 7 and are application-aware. They are intelligent, and you can create advanced request routing, sending specified requests to specific web servers.


  • Network Load Balancers are best suited for load balancing of TCP traffic where extreme performance is required. They operate at the connection level (Layer 4).


  • Classic Load Balancers are the legacy load balancers. They support specific features such as X-Forwarded-For and sticky sessions.


Sticky session refers to the feature of many commercial load balancing solutions for web-farms to route the requests for a particular session to the same physical machine that serviced the first request for that session.


  • A 504 error means the gateway has timed out. This means that the application is not responding within the idle timeout period.

  • Troubleshoot the application. Is it the Web Server or Database server?

  • If you need the IPv4 address of your end user, look for the X-Forwarded-For header.


  • Route53 is Amazon’s DNS service


  • Allows you to map your domain names to: EC2 Instances, Load Balancers, S3 Buckets.


CLI:

- Always give your users the minimum amount of access required

- Assign your users to groups. Your users will automatically inherit the permissions of the group. The group's permissions are assigned using policy documents.

- You will see the secret access key only once. If you do not save it, you can delete the access key (Access Key ID and Secret Access Key) and generate a new one. You will need to run aws configure again.

- Do not create just one access key and share it with all your developers. If someone leaves the company on bad terms, you will need to delete the key and create a new one, and every developer will then need to update their keys. Instead, create one access key per developer.

- Do not store secret access keys in GitHub; they will be compromised, since there are scripts that scan code for access keys.

- You can install the CLI on a Mac, Linux or Windows PC and use it, for example, to store files in the cloud.


  • Roles allow you to avoid using Access Key IDs and Secret Access Keys

  • Roles are preferred from a security perspective

  • Roles are controlled by policies

  • You can change a policy on a role and it will take immediate effect

  • You can attach and detach roles to running EC2 instances without having to stop or terminate these instances


  • You can encrypt the root device volume (the volume the OS is installed on) using Operating System level encryption.


  • You can encrypt the root device volume by first taking a snapshot of that volume, and then creating a copy of that snapshot with encryption. You can then make an AMI from this snapshot and deploy the encrypted root device volume.


  • You can encrypt additional attached volumes using the console, CLI or API


AWS Database Types:

・RDS - OLTP

- SQL Server

- MySQL

- PostgreSQL

- Oracle

- Aurora

- MariaDB


・DynamoDB - NoSQL

・Redshift - OLAP

・ElastiCache - In Memory Caching

- Memcached

- Redis


  • To troubleshoot connection issues between RDS and EC2, open port 3306 in the inbound rules of the RDS security group and allow connections from the security group that contains the EC2 instances. Type: MySQL/Aurora.


  • It is possible to take a manual snapshot and enable encryption on it.


  • Multi-AZ allows you to have an exact copy of your production database in another Availability Zone. AWS handles the replication for you, so when your production database is written to, this write will automatically be synchronized to the standby database. In the event of planned database maintenance, DB instance failure, or an Availability Zone failure, Amazon RDS will automatically fail over to the standby so that database operations can resume quickly without administrative intervention.


  • Multi-AZ is for Disaster Recovery only. Not for improving performance.


  • Multi-AZ databases: SQL Server, Oracle, MySQL, PostgreSQL, MariaDB, Aurora.


  • Read replicas allow you to have a read-only copy of your production database. This is achieved by using Asynchronous replication from the primary RDS instance to the read replica. You use read replicas primarily for very read-heavy database workloads.


  • Read Replica databases: MySQL, Oracle, PostgreSQL, MariaDB, Aurora


Read Replica Databases:

- Used for scaling, not for DR

- Must have automatic backups turned on in order to deploy a read replica

- Can have up to 5 read replica copies of any database.

- Can have read replicas of read replicas (increases latency)

- Each read replica will have its own DNS endpoint

- Can have read replicas that have Multi-AZ

- Can create read replicas of Multi-AZ source databases

- Read replicas can be promoted to be their own databases. This breaks the replication.

- Can have a read replica in a second region.


  • ElastiCache is a web service that improves the performance of web applications by allowing you to retrieve information from fast, managed, in-memory caches, instead of relying entirely on slower disk-based databases.

ElastiCache:

- Memcached

- Redis (Master/Slave replication, Multi-AZ)

Memcached - Use Cases:

- Object caching to, for example, offload database

- Simple caching model

- Running large cache nodes that require multithreaded performance with utilization of multiple cores.

- Ability to scale cache horizontally as it grows.

Redis - Use Cases:

- Usage of advanced data types: lists, hashes, sets.

- Sorting and ranking datasets in memory. Like leaderboards.

- Persistence of key store.

- Usage of AWS Multi-AZ with failover.

- Pub/Sub capabilities


  • Elasticache is a good choice if your database is particularly read-heavy and not prone to frequent changing.


  • Redshift is a good answer if your database is under stress because management keeps running OLAP transactions on it.



  • S3 is object-based. It allows you to upload files.


  • Not suitable for installing an operating system or running a database.


  • Files can be from 0 bytes to 5 TB (5GB with a single PUT operation)


  • There is unlimited storage


  • Files are stored in buckets


  • S3 is a universal namespace. Names must be unique globally.


  • Read after Write consistency for PUTS of new Objects


  • Eventual Consistency for overwrite PUTS and DELETES (can take some time to propagate)


S3 Storage Tiers:

- S3 (durable, immediately available, frequently accessed)

- S3 IA (durable, immediately available, infrequently accessed)

- S3 One Zone IA (same as above, data is stored in a single Availability Zone only)

- S3 Reduced Redundancy Storage (data that is easily reproducible, such as thumbnails)

- Glacier (archived data, where you can wait 3-5 hours before accessing)

Core fundamentals of an S3 Object:

- Key (name)

- Value (data)

- Version ID

- Metadata

- Subresources - bucket-specific configuration:

- Bucket policies, Access Control Lists

- Cross Origin Resource Sharing

- Transfer Acceleration

Successful uploads will generate an HTTP 200 status code.

https://aws.amazon.com/s3/faqs/


  • By default, all newly created buckets are PRIVATE

  • You can set up access control to your buckets using:


    • Bucket Policies - Applied at a bucket level.

    • Access Control List - Applied at an object level.



  • S3 buckets can be configured to create access logs, which log all requests made to the S3 bucket. These logs can be written to another bucket.

Encryption

- In-Transit

- SSL/TLS (HTTPS)

At Rest

- Server Side Encryption

SSE-S3 (AES)

SSE-KMS

SSE-C (Customer provided keys)

- Client Side Encryption

If you want to enforce the use of encryption for your files stored in S3, use an S3 Bucket Policy to deny all PUT requests that don’t include the x-amz-server-side-encryption parameter in the request header.
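As a rough illustration of the bucket policy described above, the following sketch builds such a policy as a Python dictionary and prints it as JSON. The bucket name `example-bucket` and the helper name `require_sse_policy` are made up for this example; the policy denies `s3:PutObject` whenever the `x-amz-server-side-encryption` header is absent.

```python
import json


def require_sse_policy(bucket_name: str) -> dict:
    """Build a bucket policy that denies PUT requests lacking the SSE header.

    bucket_name is a hypothetical placeholder; substitute your own bucket.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedObjectUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # The Null condition is true when the header is missing.
                "Condition": {
                    "Null": {"s3:x-amz-server-side-encryption": "true"}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(require_sse_policy("example-bucket"), indent=2))
```

The resulting JSON could be pasted into the bucket policy editor; a stricter variant might also check that the header value is one of the allowed algorithms.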

Cloudfront:

Edge Location - This is the location where content will be cached. This is separate to an AWS Region/AZ

Origin - This is the origin of all the files that the CDN will distribute. Origins can be an S3 bucket, an EC2 instance, an Elastic load balancer or Route53

Distribution - This is the name given to the CDN, which consists of a collection of Edge Locations

- Web Distribution: typically used for websites

- RTMP: used for media streaming


  • Edge locations are not just READ only - you can WRITE to them too (PUT an object on to them)

  • CloudFront Edge Locations are utilised by S3 Transfer Acceleration to reduce latency for S3 uploads.

  • Objects are cached for the life of the TTL (Time To Live)

  • You can clear cached objects, but you will be charged.

To restrict viewer access (e.g. access to paid content), use signed URLs or signed cookies

You need to set a default root object to be able to see files from S3 via CloudFront

2 main approaches to Performance Optimization for S3:

- GET-Intensive Workloads - Use CloudFront

- Mixed-Workloads - Avoid sequential key names for your S3 objects. Instead, add a random prefix like a hex hash to the key name to prevent multiple objects from being stored on the same partition. (No longer applicable)

Serverless Services:

Lambda

S3

DynamoDB

Aurora

API Gateway

SNS

SQS

Step Functions

Kinesis

Lambda

AWS Lambda is a compute service where you can upload your code and create a Lambda function. AWS Lambda takes care of provisioning and managing the servers that you use to run the code. You don’t have to worry about operating systems, patching, scaling, etc. You can use Lambda in the following ways.

- As an event-driven compute service where AWS Lambda runs your code in response to events. These events could be changes to data in an Amazon S3 bucket or an Amazon DynamoDB table.

- As a compute service to run your code in response to HTTP requests using Amazon API Gateway or API calls made using AWS SDKs.

Supported languages for Lambda:

Node.js

Java

C#

Python

Ruby

Go

PowerShell


  • Lambda scales out (up) automatically

  • Lambda functions are independent, 1 event = 1 function

  • Lambda is serverless

  • Lambda is a compute service

  • Know what services are serverless

  • Lambda functions can trigger other Lambda functions, so 1 event can = x functions if functions trigger other functions
    - Architectures can get extremely complicated; AWS X-Ray allows you to debug what is happening

  • Lambda can do things globally, you can use it to back up S3 buckets to other S3 buckets, etc

  • Know your lambda triggers


  • REST APIs (REpresentational State Transfer) - uses JSON


  • SOAP APIs (Simple Object Access Protocol) - uses XML


API Gateway:

API Gateway is a fully managed service that makes it easy for developers to publish, maintain, monitor and secure APIs at any scale.


  • Remember what API Gateway is at a high level

  • API Gateway has caching capabilities to increase performance

  • API Gateway is low cost and scales automatically

  • You can throttle API Gateway to prevent attacks

  • You can log results to CloudWatch

  • If you are using JavaScript/AJAX that uses multiple domains with API Gateway, ensure that you have enabled CORS on API Gateway

  • CORS is enforced by the client

If you see the error "Origin policy cannot be read at the remote resource", you need to enable CORS on API Gateway

Lambda triggers:

API Gateway

Application Load Balancer

CloudWatch Events

CloudWatch Logs

CodeCommit

Cognito Sync Trigger

DynamoDB

Kinesis

S3

SNS

SQS

SES

CloudFormation

When you use versioning in AWS Lambda, you can publish one or more versions of your Lambda function. As a result, you can work with different variations of your Lambda function in your development workflow, such as development, beta and production.

Each Lambda function version has a unique Amazon Resource Name (ARN). After you publish a version, it is immutable (can't be changed)

AWS Lambda maintains your latest function code in the $LATEST version.

You can refer to this function using its Amazon Resource Name (ARN)

* Qualified ARN - the function ARN with the version suffix, e.g. :$LATEST

* Unqualified ARN - the function ARN without the version suffix.

It is possible to split traffic between two versions.

$LATEST cannot be one of the versions in a traffic split.

Step Functions

Uses JSON-based Amazon States Language


  • Great way to visualize your serverless application

  • Step Functions automatically triggers and tracks each step

  • Step Functions logs the state of each step so if something goes wrong you can track what went wrong and where

AWS X-Ray is a service that collects data about requests that your application serves, and provides tools you can use to view, filter and gain insights into that data to identify issues and opportunities for optimization.

The X-Ray SDK provides:

* Interceptors to add to your code to trace incoming HTTP requests

* Client handlers to instrument AWS SDK clients that your application uses to call other AWS services

* An HTTP client to use to instrument calls to other internal and external HTTP web services

X-Ray integrates with the following AWS services:

* Elastic Load Balancing

* AWS Lambda

* Amazon API Gateway

* Amazon Elastic Compute Cloud

* AWS Elastic Beanstalk

X-Ray supports the following languages:

* Java

* Go

* Node.js

* Python

* Ruby

* .NET


  • It is possible to import APIs using Swagger 2.0 definition files

  • API Gateway can be throttled

  • Default limits are 10000 Requests Per Second or 5000 concurrently

  • You can configure API Gateway as a SOAP Webservice passthrough

Amazon DynamoDB is a low latency NoSQL database

Consists of Tables, Items and Attributes

Supports both document and key-value data models

Supported document formats are JSON, HTML, XML

2 types of Primary Key - Partition Key and combination of Partition Key + Sort Key (Composite key)

2 data consistency models: Strongly Consistent / Eventually Consistent

Access is controlled using IAM policies

Fine grained access control using IAM Condition parameter: dynamodb:LeadingKeys to allow users to access only the items where the partition key value matches their user ID.
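To make the `dynamodb:LeadingKeys` condition more concrete, here is a minimal sketch of such an IAM policy built as a Python dictionary. The table ARN and the helper name are hypothetical, and the policy variable `${www.amazon.com:user_id}` assumes users sign in through Login with Amazon; other identity providers use different variables.

```python
import json


def leading_keys_policy(table_arn: str) -> dict:
    """Allow item access only where the partition key equals the caller's user ID.

    table_arn is a hypothetical placeholder for your table's ARN.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:Query",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                ],
                "Resource": table_arn,
                "Condition": {
                    # Every partition key in the request must match the user ID.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": ["${www.amazon.com:user_id}"]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    arn = "arn:aws:dynamodb:us-east-1:123456789012:table/ExampleTable"
    print(json.dumps(leading_keys_policy(arn), indent=2))
```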

Indexes enable fast queries on specific data columns

Give you a different view of your data, based on alternative Partition/Sort Keys

Important to understand the differences

Local Secondary Index

Must be created when you create your table (can’t be modified or removed afterwards)

Same Partition Key as your table

Different Sort Key

Global Secondary Index

Can create any time - at table creation or after

Different Partition Key

Different Sort Key

ScanIndexForward parameter is related only to QUERY not SCAN

A Query operation finds items in a table using only the Primary Key attribute.

You provide the Primary Key name and a distinct value to search for.

A Scan operation examines every item in the table.

By default returns all data attributes.

Use the ProjectionExpression parameter to refine the results.

Query results are always sorted by the Sort Key if there is one.

Sorted in ascending order.

Set the ScanIndexForward parameter to false to reverse the order (queries only).

Query operation is generally more efficient than a Scan.

Reduce the impact of a query or scan by setting a smaller page size which uses fewer read operations.

Isolate scan operations to specific tables and segregate them from your mission-critical traffic.

Try Parallel scans, rather than the default sequential scan.

Avoid using scan operations if you can: design tables in a way that you can use the Query, Get or BatchGetItem APIs.
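The Query parameters mentioned above can be sketched as a plain dictionary in the shape that a low-level DynamoDB Query call expects. This example does not call AWS; the table name, key name and attribute names are invented, and the helper `build_query_params` is only an illustration. It assumes a string partition key and attribute names that are not DynamoDB reserved words.

```python
def build_query_params(table, key_name, key_value, descending=False, attributes=None):
    """Build request parameters for a DynamoDB Query on the partition key."""
    params = {
        "TableName": table,
        # Query needs the partition key name and a single value to match.
        "KeyConditionExpression": "#pk = :pk",
        "ExpressionAttributeNames": {"#pk": key_name},
        "ExpressionAttributeValues": {":pk": {"S": key_value}},
        # True (the default) sorts ascending by sort key; False reverses it.
        "ScanIndexForward": not descending,
    }
    if attributes:
        # Return only the listed attributes instead of every attribute.
        params["ProjectionExpression"] = ", ".join(attributes)
    return params


if __name__ == "__main__":
    print(build_query_params("Orders", "CustomerId", "c-42",
                             descending=True, attributes=["OrderId", "Total"]))
```

With boto3 installed, a dictionary like this could be passed as keyword arguments to the client's `query` method.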

Provisioned Throughput is measured in Capacity Units.

1 x Write Capacity Unit = 1 x 1 KB Write per second

1 x Read Capacity Unit = 1 x 4 KB Strongly Consistent Read OR 2 x 4 KB Eventually Consistent Reads per second
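Exam questions often ask you to work out the required capacity from an item size and a request rate. The sketch below applies the two rules above; the function names are my own, and the item size is rounded up to the next 1 KB (writes) or 4 KB (reads) block.

```python
import math


def write_capacity_units(item_size_kb: float, writes_per_second: int) -> int:
    """1 WCU = one write per second of up to 1 KB."""
    return math.ceil(item_size_kb / 1) * writes_per_second


def read_capacity_units(item_size_kb: float, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """1 RCU = one strongly consistent (or two eventually consistent) 4 KB reads per second."""
    units = math.ceil(item_size_kb / 4) * reads_per_second
    if not strongly_consistent:
        # Eventually consistent reads cost half as much.
        units = math.ceil(units / 2)
    return units


if __name__ == "__main__":
    # 10 reads per second of 6 KB items: 6 KB rounds up to two 4 KB blocks.
    print(read_capacity_units(6, 10))                             # 20
    print(read_capacity_units(6, 10, strongly_consistent=False))  # 10
    # 5 writes per second of 1.5 KB items: 1.5 KB rounds up to 2 KB.
    print(write_capacity_units(1.5, 5))                           # 10
```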

On-Demand Capacity:

- Unknown workloads

- Unpredictable application traffic

- You want a Pay-per-use model

- Spiky, short lived peaks

DAX:

- Provides in-memory caching for DynamoDB tables

- Improves response times for Eventually Consistent reads only

- You point your API calls at the DAX cluster, instead of your table.

- If the item you are querying is in the cache, DAX will return it; otherwise it will perform an Eventually Consistent GetItem operation on your DynamoDB table.

- Not suitable for write-intensive applications or applications that require Strongly Consistent reads.

DAX supports only write through

Elasticache:

- In-memory cache sits between your application and database

- 2 different caching strategies: Lazy loading and Write Through


  • Lazy Loading only caches the data when it is requested

  • ElastiCache node failures are not fatal, they just cause lots of cache misses

  • Cache miss penalty: Initial request, query database, writing to cache

  • Avoid stale data by implementing a TTL (time to live)


  • Write Through strategy writes data into the cache whenever there is a change to the database


  • Data is never stale


  • Write penalty: Each write involves a write to the cache


  • Elasticache node failure means that data is missing until added or updated in the database


  • Wasted resources if most of the data is never used
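The two caching strategies above can be sketched with plain dictionaries standing in for the database and the cache node. This is only a toy model, not ElastiCache client code; the class names and the 60 second default TTL are assumptions made for illustration.

```python
import time


class LazyLoadingCache:
    """Lazy loading: data is cached only when it is first requested."""

    def __init__(self, database, ttl_seconds=60.0, clock=time.monotonic):
        self._database = database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expiry time)
        self.misses = 0

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]  # cache hit, still within its TTL
        # Cache miss penalty: query the database, then write to the cache.
        self.misses += 1
        value = self._database[key]
        self._entries[key] = (value, self._clock() + self._ttl)
        return value


class WriteThroughCache:
    """Write through: every database write also updates the cache."""

    def __init__(self, database):
        self._database = database
        self._entries = {}

    def put(self, key, value):
        # Write penalty: each write goes to the database and to the cache.
        self._database[key] = value
        self._entries[key] = value

    def get(self, key):
        # Data written before the cache existed is missing until rewritten.
        return self._entries.get(key)


if __name__ == "__main__":
    db = {"user:1": "Alice"}
    lazy = LazyLoadingCache(db)
    lazy.get("user:1")
    lazy.get("user:1")
    print(lazy.misses)  # only the first read misses
```

The TTL in the lazy loading class is what keeps stale data from living forever, while the write-through class never serves stale data but caches everything that is written, used or not.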


DynamoDB Transactions:

- ACID Transactions (Atomic (can’t be executed partially), Consistent (doesn’t break database), Isolated (all transactions can be executed in parallel, no dependencies), Durable (after transaction is committed, it is written to the disk, not in memory))

- Read or write multiple items across multiple tables as an all or nothing operation

- Check for a pre-requisite condition before writing to a table


  • Can set TTL from the DynamoDB interface

DynamoDB Streams:

- Time-ordered sequence of item level modifications in your DynamoDB Tables (insert, update, delete)

- Data is stored for 24 hours only, encrypted at rest

- Accessed using a dedicated endpoint

- By default the Primary Key is recorded

- Before and After images can be captured

- Can be used as an event source for Lambda so you can create applications which take actions based on events in your DynamoDB table

Provisioned Throughput & exponential backoff:

- If you see a ProvisionedThroughputExceeded error, this means the number of requests is too high

- Exponential Backoff improves flow by retrying requests using progressively longer waits (50 ms…100 ms…200 ms)

- This is not just true for DynamoDB, Exponential Backoff is a feature of every AWS SDK and applies to many services within AWS, e.g. S3 buckets, CloudFormation, SES
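The AWS SDKs do this for you, but the idea is easy to sketch by hand. In the example below, `ThrottledError` and `call_with_backoff` are made-up names, and the 50 ms base delay and five attempts are assumptions; each retry simply waits twice as long as the previous one.

```python
import time


class ThrottledError(Exception):
    """Stand-in for a throttling error such as ProvisionedThroughputExceeded."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05, sleep=time.sleep):
    """Retry operation with exponentially growing waits between attempts."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            # Waits grow as base_delay * 1, 2, 4, ... seconds.
            sleep(base_delay * (2 ** attempt))


if __name__ == "__main__":
    calls = {"count": 0}

    def flaky():
        calls["count"] += 1
        if calls["count"] < 3:
            raise ThrottledError()
        return "ok"

    print(call_with_backoff(flaky))  # succeeds on the third attempt
```

Real implementations usually add random jitter to the delay so that many clients do not retry at the same moment.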

KMS:

AWS Key Management Service is a managed service that makes it easy for you to create and control the encryption keys used to encrypt your data. AWS KMS is integrated with other AWS services including EBS, S3, Redshift, Elastic Transcoder, WorkMail, RDS and others to make it simple to encrypt your data with encryption keys that you manage.

Keys are region-based, different keys for different regions.

The Customer Master Key (CMK):

- alias

- creation date

- description

- key state

- key material (either customer provided or AWS provided) (KMS or external)

- CAN NEVER BE EXPORTED (Cloud HSM can)


  • Define key Administrative Permissions (who can administrate, not use)

  • Define key Usage Permissions (who can use)

You need to know what these commands do:

- aws kms encrypt - Encrypts plaintext into ciphertext by using a customer master key (CMK)


  • aws kms decrypt - Decrypts ciphertext. Ciphertext is plaintext that has been previously encrypted by using any of the following operations: Encrypt. (Whenever possible, use key policies to give users permission to call the Decrypt operation on the CMK, instead of IAM policies. Otherwise, you might create an IAM user policy that gives the user Decrypt permission on all CMKs.)


  • aws kms re-encrypt - Encrypts data on the server side with a new customer master key (CMK) without exposing the plaintext of the data on the client side. The data is first decrypted and then reencrypted. You can also use this operation to change the encryption context of a ciphertext.


  • aws kms enable-key-rotation - Enables automatic rotation of the key material for the specified customer master key (CMK). You cannot perform this operation on a CMK in a different AWS account.


Envelope Encryption:

- Customer Master Key used to decrypt the data key (envelope key)

- Envelope Key is used to decrypt the data

SQS (Simple Queue Service):

- Is a distributed message queueing system

- Allows you to decouple the components of an application so that they are independent

- Pull-based, not push-based

- Standard Queues (default) - best-effort ordering; message delivered at least once

- FIFO Queues (First-in-first-out) - ordering strictly preserved, message delivered once, no duplicates. e.g. good for banking transactions which need to happen in strict order.

- Visibility Timeout

- Default is 30 seconds - increase if your task takes >30 seconds to complete

- Max 12 hours

- Short Polling - returned immediately even if no messages are in the queue

- Long Polling - polls the queue periodically and only returns a response when a message arrives in the queue or the timeout is reached (can save money)

SNS (Simple Notification Service):

- SNS is a scalable and highly available notification service which allows you to send push notifications from the cloud

- Variety of message formats supported: SMS text message, email, SQS queues, any HTTP endpoint

- Pub-sub model whereby users subscribe to topics

- It is a push mechanism not pull (poll) mechanism


  • SES is for email only

  • It can be used for incoming and outgoing mail

  • It is not subscription based, you only need to know the email address

ElasticBeanstalk:

- Deploys and scales your web applications including the web application server platform where required

- Supports widely used programming technologies - Java, PHP, Python, Ruby, Go, Docker, .NET, Node.js

- And application server platforms like Tomcat, Passenger, Puma and IIS

- Provisions the underlying resources for you

- Can fully manage the EC2 instances for you or you can take full administrative control

- Updates, monitoring, metrics and health checks all included

4 types of update on Elastic Beanstalk

All at Once:

- Deploys the new version to all instances simultaneously

- All of your instances are out of service while the deployment takes place

- You will experience an outage while the deployment is taking place - not ideal for mission-critical production systems

- If the update fails, you need to roll back the changes by re-deploying the original version to all your instances

Rolling Deployment:

- Deploys the new version in batches

- Each batch of instances is taken out of service while the deployment takes place

- Your environment capacity will be reduced by the number of instances in a batch while the deployment takes place

- Not ideal for performance sensitive systems

- If the update fails, you need to perform an additional rolling update to roll back the changes

Rolling With Additional Batch:

- Launches an additional batch of instances

- Deploys the new version in batches

- Maintains full capacity during the deployment process

- If the update fails, you need to perform an additional rolling update to roll back the changes

Immutable deployment:

- Deploys the new version to a fresh group of instances in their own new autoscaling group

- When the new instances pass their health checks, they are moved to your existing auto scaling group; and finally, the old instances are terminated

- Maintains full capacity during the deployment process

- The impact of a failed update is far less, and the rollback process requires only terminating the new auto scaling group

- Preferred option for Mission Critical production systems


  • You can customize your Elastic Beanstalk environment by adding configuration files

  • The files are written in YAML or JSON

  • Files have a .config extension (the file name can be anything you choose)

  • The .config files are saved to the .ebextensions folder

  • Your .ebextensions folder must be located in the top level directory of your application source code bundle

RDS & EB:

Launch with Elastic Beanstalk:

- When you terminate the Elastic Beanstalk environment, the database will also be terminated

- Quick and easy to add your database and get started

- Suitable for Dev and Test environments only

Launch outside of Elastic Beanstalk:

- Additional configuration steps required - Security Group and Connection information

- Suitable for Production environments, more flexibility

- Allows connection from multiple environments, you can tear down the application stack without impacting the database

Streaming data is data that is generated continuously by thousands of data sources, which typically send in the data records simultaneously and in small sizes (on the order of kilobytes), e.g. geodata or game data.

Amazon Kinesis is a platform to send your streaming data to. Kinesis makes it easy to load and analyze streaming data, and also lets you build your own custom applications for your business needs.


  • Know the difference between Kinesis Streams and Kinesis Firehose. You will be given scenario questions and you must choose the most relevant service

  • Understand what Kinesis analytics is.

Kinesis Streams consist of shards. 5 transactions per second for reads, up to a maximum total data read rate of 2MB per second and up to 1000 records per second for writes, up to a maximum total data write rate of 1 MB per second (including partition keys). The total capacity of the stream is the sum of the capacities of its shards.
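Since the capacity of a stream is the sum of its shards, the per-shard limits above give a quick way to estimate how many shards a workload needs. The helper below is a back-of-the-envelope sketch with an invented name; it takes the largest of the three requirements (write volume, write record rate, read volume) and ignores the 5 reads per second limit.

```python
import math


def shards_needed(write_mb_per_s: float, records_per_s: int, read_mb_per_s: float) -> int:
    """Estimate shard count from the per-shard limits: 1 MB/s and 1000 records/s in, 2 MB/s out."""
    return max(
        1,
        math.ceil(write_mb_per_s / 1.0),   # 1 MB per second written per shard
        math.ceil(records_per_s / 1000),   # 1000 records per second written per shard
        math.ceil(read_mb_per_s / 2.0),    # 2 MB per second read per shard
    )


if __name__ == "__main__":
    # 3.5 MB/s in, 2000 records/s, 6 MB/s out: write volume dominates.
    print(shards_needed(3.5, 2000, 6))  # 4
```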

With Kinesis Firehose, data is either analyzed by Lambda or stored in S3 or an Elasticsearch cluster. Basically, it stores the data. There are no shards and everything is automated, so there is no need to worry about data consumers.

Kinesis Analytics is used to analyze data in Kinesis Streams or Kinesis Firehose with SQL queries.

CI/CD:

- Continuous delivery means manual release

- Continuous deployment means automated release

https://d1.awsstatic.com/whitepapers/DevOps/practicing-continuous-integration-continuous-delivery-on-AWS.pdf

- Continuous Integration is about integrating or merging code changes frequently, at least once per day, which enables multiple devs to work on the same application

- Continuous Delivery is all about automating the build, test and deployment functions.

- Continuous Deployment fully automates the entire release process; code is deployed into Production as soon as it has successfully passed through the release pipeline.

- AWS CodeCommit - Source Control service (git)

- AWS CodeBuild - compile source code, run tests and package code

- AWS CodeDeploy - Automated Deployment to EC2, on-premises systems and Lambda

- AWS CodePipeline - CI/CD workflow tool, fully automates the entire release process (build, test, deployment)

AWS CodeCommit:

- Based on Git

- Centralized repository for all your code, binaries, images and libraries

- Tracks and manages code changes

- Maintains version history

- Manages updates from multiple sources and enables collaboration

AWS CodeDeploy:

- AWS CodeDeploy is a fully managed automated deployment service and can be used as part of a Continuous Delivery or Continuous Deployment process.

- In-Place or Rolling update - you stop the application on each host and deploy the latest code. EC2 and on premise systems only (not Lambda). To roll back you must re-deploy the previous version of the application.

- Blue/Green - New instances are provisioned and the new application is deployed to these new instances. Traffic is routed to the new instances according to your own schedule. Supported for EC2, on-premise systems and Lambda functions. Roll back is easy, just route the traffic back to the original instances. Blue is the active deployment, green is the new release.

AWS CodePipeline:

- Continuous Integration / Continuous Delivery service

- Automates your end-to-end software release process based on a user defined workflow

- Can be configured to automatically trigger your pipeline as soon as a change is detected in your source code repository

- Integrated with other services from AWS like CodeBuild and CodeDeploy, as well as third party and custom plug-ins.


  • The AppSpec file defines all the parameters needed for the deployment e.g. location of application files and pre/post deployment validation tests to run

  • For EC2/On-Premises systems, the appspec.yml file must be placed in the root directory of your revision (the same folder that contains your application code). Written in YAML.

  • Lambda supports YAML or JSON

Docker and CodeBuild:

- Docker Commands to build, tag (apply an alias) and push your Docker image to the ECR repository

docker build -t {repository-name} .

docker tag {tag-name} {repository-link}

docker push {repository-link}

- Use buildspec.yml to define the build commands and settings used by CodeBuild to run your build

- You can override the settings in buildspec.yml by adding your own commands in the console when you launch the build

- If your build fails, check the build logs in the CodeBuild console and you can also view the full CodeBuild log in CloudWatch

CloudFormation:

- Allows you to manage, configure and provision AWS infrastructure as code. YAML or JSON.

- Remember the main sections in the Cloud Formation Template:

- Parameters - input custom values

- Conditions - e.g. provision resources based on environment

- Resources - mandatory - the AWS resources to create

- Mappings - create custom mappings like Region : AMI

- Transforms - reference code located in S3 e.g. Lambda code or reusable snippets of CloudFormation code.
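To see how these sections fit together, here is a minimal sketch of a template expressed as a Python dictionary and dumped as JSON. All logical names (`EnvType`, `RegionToAmi`, `IsProd`, `WebServer`), the instance types and the AMI ID are placeholders invented for this example, not values from any real account.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    # Parameters: custom values supplied when the stack is created.
    "Parameters": {
        "EnvType": {"Type": "String", "AllowedValues": ["dev", "prod"], "Default": "dev"}
    },
    # Mappings: lookup tables such as Region : AMI (placeholder AMI ID).
    "Mappings": {
        "RegionToAmi": {"us-east-1": {"Ami": "ami-00000000000000000"}}
    },
    # Conditions: decide what to provision based on the environment.
    "Conditions": {
        "IsProd": {"Fn::Equals": [{"Ref": "EnvType"}, "prod"]}
    },
    # Resources: the only mandatory section.
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "ImageId": {"Fn::FindInMap": ["RegionToAmi", {"Ref": "AWS::Region"}, "Ami"]},
                "InstanceType": {"Fn::If": ["IsProd", "t3.large", "t3.micro"]},
            },
        }
    },
}

if __name__ == "__main__":
    print(json.dumps(template, indent=2))
```

The same structure can be written directly in YAML or JSON; the dictionary form is used here only to keep the example in one language.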

SAM (Serverless Application Model):


  • Allows you to define and provision serverless applications using CloudFormation

  • Uses the SAM CLI commands to package and deploy:
    sam package - packages your application and uploads to S3
    sam deploy - deploys your serverless app using CloudFormation

CloudFormation Nested Stacks

- Nested Stacks allow you to re-use your CloudFormation code so you don’t need to copy/paste every time

- Really useful for frequently used configurations, e.g. for load balancers, web or application servers

- Simply create a Cloud Formation template, store it in S3 and you can reference it in the Resources section of any CloudFormation template using the Stack resource type

Web Identity Federation:

- Federation allows users to authenticate with a Web Identity Provider (Google, Facebook, Amazon)

- The user authenticates first with the Web ID Provider and receives an authentication token, which is exchanged for temporary AWS credentials allowing them to assume an IAM role.

- Cognito is an Identity Broker which handles interaction between your applications and the Web ID provider (no code required)

- Provides sign-up, sign-in and guest user access

- Syncs user data for a seamless experience across your devices

- Cognito is the AWS recommended approach for Web ID Federation particularly for mobile apps


  • Cognito uses User Pools to manage user sign-up and sign-in directly or via Web Identity Providers

  • Cognito acts as an Identity broker, handling all interaction with Web Identity Providers

  • Cognito uses Push Synchronization to send a silent push notification of user data updates to multiple device types associated with a user ID

Advanced Policies:

- An AWS Managed Policy is an IAM policy which is created and administered by AWS.

- A Customer Managed Policy is a standalone policy that you create and administer inside your own AWS account. You can attach this policy to multiple users, groups, and roles - but only within your own account.

- An Inline Policy is an IAM policy which is actually embedded within the user, group, or role to which it applies. There is a strict 1:1 relationship between the entity and the policy.


  • In most cases, AWS recommends using Managed Policies over Inline Policies.

CloudWatch:

- CloudWatch is a monitoring service to monitor your AWS resources, as well as the applications that you run on AWS.

- RAM Utilization is a custom metric! By default EC2 monitoring is 5 minute intervals, unless you enable detailed monitoring which will then make it 1 minute intervals.

- You can retrieve data from any terminated EC2 or ELB instance after its termination.

- For custom metrics the minimum granularity that you can have is 1 minute.

- Host level metrics consist of:

CPU

Network

Disk

Status Check

- CloudWatch Logs by default are stored indefinitely

- Metric Granularity

1 minute for detailed monitoring

5 minutes for standard monitoring

- CloudWatch can be used on premises. (You need to download and install the SSM agent and the CloudWatch agent)

CloudWatch vs CloudTrail:

- CloudWatch monitors performance

- CloudTrail monitors API calls in the AWS platform

- AWS Config records the state of your AWS environment and can notify you of changes

Amazon CloudWatch Events is a web service that monitors your AWS resources and the applications you run on AWS. You can use Amazon CloudWatch Events to detect and react to changes in the state of a pipeline, stage, or action. Then, based on rules you create, CloudWatch Events invokes one or more target actions when a pipeline, stage, or action enters the state you specify in a rule.