EKS 1.13+CNI 1.5 stuck with ContainerCreating

Posted at 2019-08-08

#ContainerCreating - What happened after upgrading EKS and CNI

Last week, as a rehearsal, we created an EKS 1.12 cluster and tested upgrading the master and the worker nodes.

The Dev cluster used for that test had:

  • 5 namespaces
  • 3 pods
  • r5.xlarge x 2
  • EKS 1.12 -> 1.13
  • CNI unchanged at 1.5

That went fairly smoothly, so on Friday we upgraded Staging. The result: Pods got stuck in ContainerCreating and never started running.

Staging cluster

  • 9 namespaces
  • 28 pods in each of 2 AZs
  • r5.xlarge x 2
  • EKS 1.12 -> 1.13
  • CNI 1.3.2 -> 1.5.0

I don't know how other companies operate, so I can't say whether this is typical, but we run things with CloudFormation and EKS using the following flow:
jsonnet > compile > render to manifest YAML > deploy

This incident prompted me to read the documentation on EKS, ENI, CNI, primary and secondary IPs, IPAMD, L-IPAM, and more.

##Official Document - Upgrade guide
https://github.com/awslabs/amazon-eks-ami
https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html
https://docs.aws.amazon.com/eks/latest/userguide/update-stack.html

##Proposal: CNI plugin for Kubernetes networking over AWS VPC
https://github.com/aws/amazon-vpc-cni-k8s/blob/master/docs/cni-proposal.md

##amazon-vpc-cni-k8s
https://github.com/aws/amazon-vpc-cni-k8s

2 components:

  • CNI Plugin
  • L-IPAMD

##Possible Issues
Pods stuck in ContainerCreating due to CNI Failing to Assing IP to Container Until aws-node is deleted #59
https://github.com/aws/amazon-vpc-cni-k8s/issues/59

Leaking Network Interfaces (ENI) #69
https://github.com/aws/amazon-vpc-cni-k8s/issues/69

##ENI and VPC

  • Each ENI has a description set as "aws-K8S-'instance-id'"
  • Can be attached to an instance in a VPC
  • The primary ENI IP address is automatically assigned
  • All secondary addresses remain unassigned and it's up to the host owner as to how to configure them.
  • Each instance can have multiple ENIs, and each ENI can have multiple IPv4 or IPv6 addresses.

##L-IPAM (node-Local IP Address Management)

  • a daemon which is responsible for:

    • maintaining a warm-pool of available IP addresses
    • assigning an IP address to a Pod
  • scenario 1 : available IP addresses < min threshold

    • create a new ENI and attach it to instance
    • allocate all available IP addresses on this new ENI
    • once these IP addresses become available -> add these IP addresses to warm-pool (instance's metadata service is used)
  • scenario 2 : available IP addresses > max threshold

    • pick an ENI where all of its secondary IP addresses are in the warm-pool
    • detach the ENI interface and free it to EC2-VPC ENI pool
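The two scenarios above can be sketched as a small decision function. This is only an illustration of the logic as I understood it from the proposal, not the actual ipamD code: the `Eni` class, the `decide` function, and the `min_free` / `max_free` thresholds are all hypothetical names and values, and special handling of the primary ENI is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Eni:
    eni_id: str
    secondary_ips: int  # secondary IPs allocated on this ENI
    assigned_ips: int   # how many of those are currently assigned to Pods

    @property
    def free_ips(self) -> int:
        # IPs on this ENI that are sitting in the warm-pool
        return self.secondary_ips - self.assigned_ips


def decide(enis: List[Eni], min_free: int, max_free: int) -> str:
    """Return the action an L-IPAM-like daemon would take (hypothetical thresholds)."""
    available = sum(e.free_ips for e in enis)
    if available < min_free:
        # Scenario 1: warm-pool is below the min threshold, so add an ENI.
        return "attach-eni"
    if available > max_free:
        # Scenario 2: warm-pool is above the max threshold. Only an ENI whose
        # secondary IPs are all unassigned can be detached.
        for e in enis:
            if e.assigned_ips == 0:
                return f"detach-eni:{e.eni_id}"
    return "none"


print(decide([Eni("eni-a", 14, 13)], min_free=2, max_free=20))
print(decide([Eni("eni-a", 14, 3), Eni("eni-b", 14, 0)], min_free=2, max_free=20))
```

With these made-up numbers, the first call has only one free IP and asks for a new ENI, while the second has 25 free IPs and picks `eni-b`, the only ENI with no Pod IPs, for detachment.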

##Pod IP address cooling period

  • Prevents the CNI plugin from recycling a deleted Pod's IP address and assigning it to a new Pod before the controller has finished updating all nodes in the cluster about the deletion.
  • scenario : When a Pod is deleted
    • The Pod IP address -> "cooling mode" for 30 seconds
    • When the cooling period expires, this Pod IP -> warm-pool (recycle)
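To make the cooling behaviour concrete, here is a minimal sketch of a pool that holds a released IP back for 30 seconds before it can be handed out again. The `IpPool` class and its method names are hypothetical and only model the idea described above; the real plugin's data structures may look quite different.

```python
import time
from typing import Dict, List, Optional

COOLING_SECONDS = 30.0  # cooling period from the notes above


class IpPool:
    """Toy warm-pool with a cooling period (hypothetical, for illustration only)."""

    def __init__(self, ips: List[str]) -> None:
        self.warm: List[str] = list(ips)
        self.cooling: Dict[str, float] = {}  # ip -> time at which it was released

    def release(self, ip: str, now: Optional[float] = None) -> None:
        # A deleted Pod's IP goes into cooling mode, not straight back to the warm-pool.
        self.cooling[ip] = time.monotonic() if now is None else now

    def assign(self, now: Optional[float] = None) -> Optional[str]:
        now = time.monotonic() if now is None else now
        # Move IPs whose cooling period has expired back into the warm-pool.
        for ip, released_at in list(self.cooling.items()):
            if now - released_at >= COOLING_SECONDS:
                del self.cooling[ip]
                self.warm.append(ip)
        return self.warm.pop(0) if self.warm else None


pool = IpPool(["10.0.0.10"])     # example address, not from a real cluster
ip = pool.assign(now=0.0)
pool.release(ip, now=1.0)
print(pool.assign(now=10.0))     # still cooling, nothing to hand out
print(pool.assign(now=31.0))     # cooling expired, the IP is reused
```

Passing `now` explicitly keeps the example deterministic; without it the pool falls back to `time.monotonic()`.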

##IPAMD (IP Address Management Daemon)

  • Allocates ENIs and secondary IP addresses from the instance subnet.
  • If a subnet runs out of IP addresses
    • ipamD will not be able to get secondary IP addresses -> Pods may get stuck in "ContainerCreating"
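A rough way to reason about subnet exhaustion is to compare the subnet size with the number of addresses the nodes could claim. The sketch below assumes every node eventually takes its full per-instance IP limit from the same subnet, which is a worst case, and the CIDR and node counts are made-up examples. The one hard fact used is that AWS reserves five addresses in every subnet.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_ips(cidr: str) -> int:
    """Number of addresses in the subnet that can actually be assigned."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def subnet_can_hold(cidr: str, nodes: int, ips_per_node: int) -> bool:
    """Worst case: every node claims ips_per_node addresses from this subnet."""
    return nodes * ips_per_node <= usable_ips(cidr)


print(usable_ips("10.0.0.0/24"))              # 251
print(subnet_can_hold("10.0.0.0/24", 4, 60))  # True  (240 <= 251)
print(subnet_can_hold("10.0.0.0/24", 5, 60))  # False (300 > 251)
```

Other resources in the same subnet (load balancers, other instances) also consume addresses, so the real headroom is smaller than this estimate.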

##ENI Allocation
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html#AvailableIpPerENI

##Log Location
/var/log/aws-routed-eni

##Troubleshooting - Handy commands

###ipamD debugging commands

collecting node level tech-support bundle for offline troubleshooting
/opt/cni/bin/aws-cni-support.sh

get enis info
curl http://localhost:61679/v1/enis | python -m json.tool

get IP assignment info
curl http://localhost:61679/v1/pods | python -m json.tool

get ipamD metrics
curl http://localhost:61678/metrics

###L-IPAM (Local IP Address Manager)

retrieve all attached ENIs
curl http://169.254.169.254/latest/meta-data/network/interfaces/macs/

retrieve all IPv4 addresses on an ENI
curl http://169.254.169.254/latest/meta-data/network/interfaces/macs/<MAC address>/local-ipv4s
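If you want to walk every ENI instead of pasting MAC addresses by hand, the two metadata calls above can be chained. The sketch below only covers the text handling: it assumes the `macs/` listing returns one MAC per line, typically with a trailing `/`, and that `local-ipv4s` returns one address per line. The helper names and the sample MAC and IP values are invented for illustration, and actually fetching the URLs (plus any metadata token handling your instances require) is left out.

```python
from typing import List

METADATA_BASE = "http://169.254.169.254/latest/meta-data/network/interfaces/macs/"


def parse_macs(listing: str) -> List[str]:
    """Turn the macs/ listing into bare MAC addresses (drops the trailing '/')."""
    return [line.strip().rstrip("/") for line in listing.splitlines() if line.strip()]


def local_ipv4s_url(mac: str) -> str:
    """Build the per-ENI URL used in the second command above."""
    return f"{METADATA_BASE}{mac}/local-ipv4s"


def parse_ipv4s(body: str) -> List[str]:
    """Split the local-ipv4s response into a list of addresses."""
    return body.split()


# Sample responses, made up for this example.
sample_listing = "0a:11:22:33:44:55/\n0a:66:77:88:99:aa/\n"
for mac in parse_macs(sample_listing):
    print(local_ipv4s_url(mac))
print(parse_ipv4s("10.0.1.10\n10.0.1.11\n10.0.1.12"))
```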

###Inside a Pod
IP address
ip addr show

routes
ip route show

###On Host side
to Pod traffic
ip route show

pod is allocated with one of the ENI's secondary IP address
ip route show table eni-1

to and from Pods
ip rule list

##Looks useful
cni-metrics-helper
https://github.com/aws/amazon-vpc-cni-k8s/blob/master/cni-metrics-helper/README.md

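##What I learned

  • Handy commands for troubleshooting on a node
  • For r5.2xlarge the limits are as below, so we should still have had enough IPs. We agreed to rebuild the cluster in Dev and investigate again, this time following the troubleshooting steps above.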

| API Name | Memory | vCPUs | Max IPs | Max ENIs |
|-----|-----|---|---|---|
| r5.2xlarge | 64.0 GiB | 8 vCPUs | 60 | 4 |
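The "IPs should have been enough" reasoning can be checked with a little arithmetic. This sketch assumes the Max IPs figure is spread evenly across the ENIs and that each ENI's primary address is not handed to Pods (Pods get secondary addresses, as noted earlier). The function name is hypothetical.

```python
def ip_capacity(max_enis: int, max_ips: int) -> dict:
    """Estimate per-ENI IPs and Pod-assignable (secondary) IPs for one instance."""
    ips_per_eni = max_ips // max_enis
    # One address per ENI is the primary IP, so only the rest can go to Pods.
    pod_ips = max_enis * (ips_per_eni - 1)
    return {"ips_per_eni": ips_per_eni, "pod_ips": pod_ips}


cap = ip_capacity(max_enis=4, max_ips=60)
print(cap)  # {'ips_per_eni': 15, 'pod_ips': 56}
```

If each AZ had a single node (an assumption on my part), 28 Pods per AZ would fit well within 56 Pod-assignable addresses, which is why plain IP exhaustion on the instance looked unlikely.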
