
A mystery about which session a Linux thread belongs to

One day, SSH connections as user admin to a Debian server started failing with "Broken Pipe".

In /var/log/auth.log:

Jan 18 11:00:06 ECS sshd[11661]: Accepted publickey for admin from **** port 46685 ssh2: RSA SHA256:*****
Jan 18 11:00:06 ECS sshd[11661]: pam_unix(sshd:session): session opened for user admin by (uid=0)
Jan 18 11:00:06 ECS systemd-logind[329]: New session 15055 of user admin.
Jan 18 11:00:06 ECS sshd[11661]: fatal: fork of unprivileged child failed
Jan 18 11:00:06 ECS systemd-logind[329]: Removed session 15055.

It looked as if the maximum number of threads had been reached. I killed some of admin's processes, but the problem persisted.

In /var/log/kern.log:

Jan 18 06:40:06 ECS kernel: [4295051.169062] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-14998.scope
Jan 18 07:00:05 ECS kernel: [4296250.758037] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15001.scope
Jan 18 07:20:05 ECS kernel: [4297450.707943] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15006.scope
Jan 18 07:40:05 ECS kernel: [4298650.788977] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15010.scope
Jan 18 08:00:06 ECS kernel: [4299851.567733] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15013.scope
Jan 18 08:20:06 ECS kernel: [4301051.427105] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15018.scope
Jan 18 08:40:07 ECS kernel: [4302252.683723] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15022.scope
Jan 18 09:00:06 ECS kernel: [4303451.522163] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15025.scope
Jan 18 09:20:08 ECS kernel: [4304653.479030] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15030.scope
Jan 18 09:24:22 ECS kernel: [4304907.389627] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15031.scope
Jan 18 09:40:06 ECS kernel: [4305851.240987] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15035.scope
Jan 18 10:00:06 ECS kernel: [4307051.114924] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15038.scope
Jan 18 10:04:52 ECS kernel: [4307337.846374] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15039.scope
Jan 18 10:20:07 ECS kernel: [4308252.217822] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15044.scope
Jan 18 10:40:05 ECS kernel: [4309450.816160] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15048.scope
Jan 18 11:00:06 ECS kernel: [4310651.717342] cgroup: fork rejected by pids controller in /user.slice/user-1001.slice/session-15055.scope
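
Every rejection above falls under user-1001.slice, mostly at 20-minute intervals. A minimal sketch for tallying such rejections per user slice from a kernel log; the function name count_fork_rejections is made up for this example, and it assumes the message format shown above:

```shell
# Count "fork rejected by pids controller" events per user slice.
# Usage: count_fork_rejections /var/log/kern.log
count_fork_rejections() {
    # Keep only the user-<uid>.slice component of each rejected cgroup path,
    # then count how often each slice appears. Output: "<slice> <count>".
    sed -n 's|.*fork rejected by pids controller in .*/\(user-[0-9]*\.slice\).*|\1|p' "$1" \
        | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Run against the excerpt above, this would report a single line for user-1001.slice, which points at one user rather than a system-wide limit.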

Hmm, so this has something to do with cgroups?

Check the system-wide thread limit with cat /proc/sys/kernel/threads-max:

63727

Check the number of tasks currently in admin's user slice with cat /sys/fs/cgroup/pids/user.slice/user-1001.slice/pids.current:

10805

Check the limit for that slice with cat /sys/fs/cgroup/pids/user.slice/user-1001.slice/pids.max:

10813
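
The two reads above can be combined into one check. A minimal sketch, assuming a cgroup directory that exposes pids.current and pids.max as shown; pids_headroom is a hypothetical helper name, not an existing tool:

```shell
# Print current, max and remaining task slots for a pids cgroup directory.
# Usage: pids_headroom /sys/fs/cgroup/pids/user.slice/user-1001.slice
pids_headroom() {
    dir=$1
    current=$(cat "$dir/pids.current")
    max=$(cat "$dir/pids.max")
    # pids.max contains the literal string "max" when no limit is set.
    if [ "$max" = "max" ]; then
        echo "current=$current max=unlimited"
    else
        echo "current=$current max=$max free=$(($max - $current))"
    fi
}
```

With the values read above it would print current=10805 max=10813 free=8, so only 8 more tasks could be created in this slice.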

The current count is only 8 below the limit. Look at the per-session breakdown:

wc -l /sys/fs/cgroup/pids/user.slice/user-1001.slice/session-*.scope/tasks

10772 /sys/fs/cgroup/pids/user.slice/user-1001.slice/session-14835.scope/tasks
31 /sys/fs/cgroup/pids/user.slice/user-1001.slice/session-14900.scope/tasks
10803 total
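
With many sessions, sorting the counts makes the outlier obvious. A minimal sketch, assuming the same cgroup v1 layout where each session scope has a tasks file; sessions_by_tasks is a hypothetical helper name:

```shell
# List session scopes under a user slice with their task counts, busiest first.
# Usage: sessions_by_tasks /sys/fs/cgroup/pids/user.slice/user-1001.slice
sessions_by_tasks() {
    for f in "$1"/session-*.scope/tasks; do
        # Skip the unexpanded glob pattern when no session scope exists.
        [ -f "$f" ] || continue
        scope=$(basename "$(dirname "$f")")
        count=$(awk 'END { print NR }' "$f")
        echo "$count $scope"
    done | sort -rn
}
```

For the slice above, the first line of output would be the count for session-14835.scope.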

The session with too many tasks is 14835. Find the processes that belong to it:

grep 14835 /proc/*/cgroup

/proc/25044/cgroup:8:pids:/user.slice/user-1001.slice/session-14835.scope
/proc/25044/cgroup:1:name=systemd:/user.slice/user-1001.slice/session-14835.scope
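
Only one process matches, although the scope holds more than ten thousand tasks. In cgroup v1 the tasks file lists thread IDs, while cgroup.procs lists process IDs, so comparing the two shows whether the tasks are mostly threads. A minimal sketch under that assumption; scope_members is a hypothetical helper name:

```shell
# Compare the number of processes and threads in one cgroup directory.
# Usage: scope_members /sys/fs/cgroup/pids/user.slice/user-1001.slice/session-14835.scope
scope_members() {
    procs=$(awk 'END { print NR }' "$1/cgroup.procs")   # one line per process
    tasks=$(awk 'END { print NR }' "$1/tasks")          # one line per thread
    echo "processes=$procs threads=$tasks"
}
```

A large gap between the two numbers means a few processes own most of the threads; cat on cgroup.procs then gives their PIDs directly, without grepping every /proc/*/cgroup file.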

Process 25044? Look it up with ps aux | grep 25044:

root     12413  0.0  0.0  14528   980 pts/1    S+   12:12   0:00 grep 25044
tomcat 25044 47.2 35.3 21160052 2887884 ? Sl Jan17 506:57 /opt/jdk8/bin/java -Djava.util.logging.config.file=/opt/tomcat8/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -server -Xmx6g -Xms6g -Xmn2g -XX:MaxNewSize=2g -Xss1m -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=512m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Djdk.tls.ephemeralDHKeySize=2048 -Djava.protocol.handler.pkgs=org.apache.catalina.webresources -Dignore.endorsed.dirs= -classpath /opt/tomcat8/bin/bootstrap.jar:/opt/tomcat8/bin/tomcat-juli.jar -Dcatalina.base=/opt/tomcat8 -Dcatalina.home=/opt/tomcat8 -Djava.io.tmpdir=/opt/tomcat8/temp org.apache.catalina.startup.Bootstrap start

Process 25044 was started by admin, via /bin/su -s /bin/bash tomcat -c $CATALINA_HOME/bin/startup.sh in a script. It runs as user tomcat, yet it still belongs to admin's session 14835, so all of its threads count against the pids limit of admin's user slice.

Wonderful.