OpenShift - oc command Tips - debug node

This post introduces debug node, a feature of the oc command, which is used for all kinds of OpenShift operations.

debug node

debug node lets you run commands on a Node.

$ oc debug --help
Launch a command shell to debug a running application.

 When debugging images and setup problems, it's useful to get an exact copy of a running pod configuration and
troubleshoot with a shell. Since a pod that is failing may not be started and not accessible to 'rsh' or 'exec', the
'debug' command makes it easy to create a carbon copy of that setup.

 The default mode is to start a shell inside of the first container of the referenced pod. The started pod will be a
copy of your source pod, with labels stripped, the command changed to '/bin/sh' for Linux containers or 'cmd.exe' for
Windows containers, and readiness and liveness checks disabled. If you just want to run a command, add '--' and a
command to run. Passing a command will not create a TTY or send STDIN by default. Other flags are supported for altering
the container or pod in common ways.

 A common problem running containers is a security policy that prohibits you from running as a root user on the cluster.
You can use this command to test running a pod as non-root (with --as-user) or to run a non-root pod as root (with
--as-root).

 You may invoke other types of objects besides pods - any controller resource that creates a pod (like a deployment,
build, or job), objects that can host pods (like nodes), or resources that can be used to create pods (such as image
stream tags), or simply pass '--image=IMAGE' to start a simple shell session in an image with a shell program

 The debug pod is deleted when the remote command completes or the user interrupts the shell.

Examples:
  # Start a shell session into a pod using the OpenShift tools image
  oc debug

  # Debug a currently running deployment by creating a new pod
  oc debug deploy/test

  # Debug a node as an administrator
  oc debug node/master-1

  # Launch a shell in a pod using the provided image stream tag
  oc debug istag/mysql:latest -n openshift

  # Test running a job as a non-root user
  oc debug job/test --as-user=1000000

  # Debug a specific failing container by running the env command in the 'second' container
  oc debug daemonset/test -c second -- /bin/env

  # See the pod that would be created to debug
  oc debug mypod-9xbc -o yaml

  # Debug a resource but launch the debug pod in another namespace
  # Note: Not all resources can be debugged using --to-namespace without modification. For example,
  # volumes and service accounts are namespace-dependent. Add '-o yaml' to output the debug pod definition
  # to disk.  If necessary, edit the definition then run 'oc debug -f -' or run without --to-namespace
  oc debug mypod-9xbc --to-namespace testns

Options:
    --allow-missing-template-keys=true:
        If true, ignore any errors in templates when a field or map key is missing in the template. Only applies to
        golang and jsonpath output formats.

    --as-root=false:
        If true, try to run the container as the root user

    --as-user=-1:
        Try to run the container as a specific user UID (note: admins may limit your ability to use this flag)

    -c, --container='':
        Container name; defaults to first container

    --dry-run='none':
        Must be "none", "server", or "client". If client strategy, only print the object that would be sent, without
        sending it. If server strategy, submit server-side request without persisting the resource.

    -f, --filename=[]:
        Filename, directory, or URL to files to read a template

    --image='':
        Override the image used by the targeted container.

    --image-stream='':
        Specify an image stream (namespace/name:tag) containing a debug image to run.

    --keep-annotations=false:
        If true, keep the original pod annotations

    --keep-init-containers=true:
        Run the init containers for the pod. Defaults to true.

    --keep-labels=false:
        If true, keep the original pod labels

    --keep-liveness=false:
        If true, keep the original pod liveness probes

    --keep-readiness=false:
        If true, keep the original pod readiness probes

    --keep-startup=false:
        If true, keep the original startup probes

    -k, --kustomize='':
        Process the kustomization directory. This flag can't be used together with -f or -R.

    -I, --no-stdin=false:
        Bypasses passing STDIN to the container, defaults to true if no command specified

    -T, --no-tty=false:
        Disable pseudo-terminal allocation

    --node-name='':
        Set a specific node to run on - by default the pod will run on any valid node

    --one-container=false:
        If true, run only the selected container, remove all others

    -o, --output='':
        Output format. One of: (json, yaml, name, go-template, go-template-file, template, templatefile, jsonpath,
        jsonpath-as-json, jsonpath-file).

    --preserve-pod=false:
        If true, the pod will not be deleted after the debug command exits.

    -q, --quiet=false:
        No informational messages will be printed.

    -R, --recursive=false:
        Process the directory used in -f, --filename recursively. Useful when you want to manage related manifests
        organized within the same directory.

    --show-labels=false:
        When printing, show all labels as the last column (default hide labels column)

    --show-managed-fields=false:
        If true, keep the managedFields when printing objects in JSON or YAML format.

    --template='':
        Template string or path to template file to use when -o=go-template, -o=go-template-file. The template format
        is golang templates [http://golang.org/pkg/text/template/#pkg-overview].

    --to-namespace='':
        Override the namespace to create the pod into (instead of using --namespace).

    -t, --tty=false:
        Force a pseudo-terminal to be allocated

Usage:
  oc debug RESOURCE/NAME [ENV1=VAL1 ...] [-c CONTAINER] [flags] [-- COMMAND] [options]

Use "oc options" for a list of global command-line options (applies to all commands).

Use debug node as follows.

$ oc get node
NAME           STATUS                     ROLES           AGE    VERSION
my-worker-01   Ready,SchedulingDisabled   master,worker   51d    v1.27.16+03a907c
my-worker-02   Ready                      master,worker   133d   v1.27.15+6147456
my-worker-z    Ready                      master,worker   133d   v1.27.15+6147456

$ oc debug node/my-worker-z
Starting pod/10244645-debug-tgllz ...
To use host binaries, run `chroot /host`
Pod IP: my-worker-z
If you don't see a command prompt, try pressing enter.
sh-4.4# pwd
sh-4.4# whoami
sh-4.4# chroot /host
sh-4.4# ps
   PID TTY          TIME CMD
105602 ?        00:00:00 sh
106081 ?        00:00:00 sh
106451 ?        00:00:00 ps
sh-4.4# exit
sh-4.4# exit

Removing debug pod ...
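
The interactive session above can also be collapsed into a single non-interactive call. The following is a minimal sketch, not an official recipe: node_run is a hypothetical helper name, and it assumes the command you want exists on the host, since it runs under chroot /host. It relies only on the -q flag and the '-- COMMAND' form shown in the help output.

```shell
# Run one command against a node's host filesystem, then return.
# node_run is a hypothetical helper name; it is not part of oc.
node_run() {
  node="$1"
  shift
  # -q suppresses informational messages.
  # chroot /host switches to the node's root filesystem, which the
  # debug pod mounts at /host.
  oc debug -q "node/${node}" -- chroot /host "$@"
}

# Example (requires cluster access and sufficient privileges):
#   node_run my-worker-z ps
```

Because a command is passed after '--', no TTY is created by default, which makes this form suitable for scripts.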

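When you run debug node, a Pod like the one below is started in the current project, and you access the Node through that Pod. The manifest shows that a hostPath volume mounts / (root) of the specified Node (that is, the Node the Pod runs on) at /host inside the Pod's container-00 container.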

$ oc project -q

$ oc get pod -o wide
NAME                   READY   STATUS    RESTARTS   AGE   IP            NODE          NOMINATED NODE   READINESS GATES
10244645-debug-lvn2k   1/1     Running   0          34s   my-worker-z   <none>           <none>

$ oc -o yaml get pod 10244645-debug-lvn2k
apiVersion: v1
kind: Pod
metadata:
  annotations:
    debug.openshift.io/source-container: container-00
    debug.openshift.io/source-resource: /v1, Resource=nodes/my-worker-z
  name: 10244645-debug-lvn2k
  namespace: default
spec:
  containers:
  - command:
    - /bin/sh
    env:
    - name: TMOUT
      value: "900"
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ee9b4cbef856130e029b9c2a4b5b16d805d347579271743ac823f79dd40c02ef
    imagePullPolicy: IfNotPresent
    name: container-00
    resources: {}
    securityContext:
      privileged: true
      runAsUser: 0
    stdin: true
    stdinOnce: true
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    tty: true
    volumeMounts:
    - mountPath: /host
      name: host
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-jmnlg
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostIPC: true
  hostNetwork: true
  hostPID: true
  imagePullSecrets:
  - name: default-dockercfg-pvql5
  - name: all-icr-io
  nodeName: my-worker-z
  preemptionPolicy: PreemptLowerPriority
  priority: 1000000000
  priorityClassName: openshift-user-critical
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - hostPath:
      path: /
      type: Directory
    name: host
  - name: kube-api-access-jmnlg
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
      - configMap:
          items:
          - key: service-ca.crt
            path: service-ca.crt
          name: openshift-service-ca.crt
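
To focus on the settings that give the debug Pod its access to the Node, the manifest can be filtered down to a handful of fields. This is a rough sketch: host_access_fields is a hypothetical helper name, and the field list is simply the one visible in the manifest above. It matches lines textually rather than parsing YAML, so treat its output as a quick check only.

```shell
# Print the fields of a pod manifest (read from stdin) that relate to
# host access. host_access_fields is a hypothetical helper name.
host_access_fields() {
  grep -E '^[[:space:]]*(privileged|runAsUser|hostIPC|hostNetwork|hostPID):' |
    sed 's/^[[:space:]]*//'
}

# Example (requires cluster access); per the help text, -o yaml prints
# the pod that would be created to debug:
#   oc debug node/my-worker-z -o yaml | host_access_fields
```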

Note that debug node must run as the root user.

$ oc debug --as-user=65534 node/my-worker-z
error: cannot debug my-worker-z: can't debug nodes without running as the root user
$ echo $?
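
A script that depends on debug node may want to detect this kind of failure instead of assuming the session started. The sketch below uses a hypothetical helper, try_debug_as_user, and reports whatever exit status oc returns. It makes no assumption about the specific value, and the use of true as the remote command is only an illustration.

```shell
# Try to start a node debug session as a given UID and report the outcome.
# try_debug_as_user is a hypothetical helper name.
try_debug_as_user() {
  uid="$1"
  node="$2"
  if oc debug --as-user="$uid" "node/${node}" -- true; then
    echo "debug as UID ${uid} succeeded"
  else
    # In the else branch, $? still holds the status of the failed oc call.
    echo "debug as UID ${uid} failed with exit status $?"
  fi
}

# Example (requires cluster access):
#   try_debug_as_user 65534 my-worker-z
```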

You can also specify bash instead of the default sh.

$ oc debug -t node/my-worker-z -- bash
Starting pod/10244645-debug-bjjqd ...
To use host binaries, run `chroot /host`
Pod IP: my-worker-z
If you don't see a command prompt, try pressing enter.
[root@openshift-1023-node-worker-spacename-9990-cons-1111111 /]# chroot /host bash
[root@openshift-1023-node-worker-spacename-9990-cons-1111111 /]# ps
   PID TTY          TIME CMD
107593 ?        00:00:00 bash
108501 ?        00:00:00 bash
108581 ?        00:00:00 ps


$ oc debug -t node/my-worker-z -- cat /etc/hosts
Starting pod/10244645-debug-zn242 ...
To use host binaries, run `chroot /host`
# Kubernetes-managed hosts file (host network).
# Your system has configured 'manage_etc_hosts' as True.
# As a result, if you wish for changes to this file to persist
# then you will need to either
# a.) make changes to the master file in /etc/cloud/templates/hosts.redhat.tmpl
# b.) change or remove the value of 'manage_etc_hosts' in
#     /etc/cloud/cloud.cfg or cloud-config from user-data
# The following lines are desirable for IPv4 capable hosts openshift-1023-node-worker-spacename-9990-cons-1111111 openshift-1023-node-worker-spacename-9990-cons-1111111 localhost.localdomain localhost localhost4.localdomain4 localhost4
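
Passing a command after '--' also makes it easy to repeat the same check across nodes. The following is a sketch under a few assumptions: each_node is a hypothetical helper name, the account is allowed to debug every node, and the command exists on the host (it runs under chroot /host, unlike the cat example above, which ran inside the debug container).

```shell
# Run the same command on every node and label each result.
# each_node is a hypothetical helper name.
each_node() {
  # `oc get nodes -o name` prints one "node/<name>" entry per line.
  for node in $(oc get nodes -o name); do
    echo "== ${node}"
    # </dev/null keeps the debug session from reading the script's stdin.
    oc debug -q "$node" -- chroot /host "$@" </dev/null
  done
}

# Example (requires cluster access):
#   each_node cat /etc/hosts
```

Each call starts and removes its own debug Pod, so on a large cluster this loop can take a while.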

