=========================
Process Number Controller
=========================

Abstract
--------

The process number controller allows a cgroup hierarchy to stop any new tasks
from being fork()'d or clone()'d after a specified limit is reached.

Since it is trivial to hit the task limit without hitting any kmemcg limits in
place, PIDs are a fundamental resource. As such, PID exhaustion must be
preventable in the scope of a cgroup hierarchy by allowing resource limiting of
the number of tasks in a cgroup.

Usage
-----

To use the `pids` controller, set the maximum number of tasks in pids.max
(this file is not available in the root cgroup). The number of processes
currently in the cgroup is given by pids.current.

Organisational operations are not blocked by cgroup policies, so it is possible
to have pids.current > pids.max. This can happen either by setting the limit
below pids.current, or by attaching enough processes to the cgroup that
pids.current exceeds pids.max. However, it is not possible to violate a cgroup
policy through fork() or clone(): both return -EAGAIN if creating the new
process would cause a cgroup policy to be violated.
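
As a rough illustration of the rule above, the following sketch reports how
many more tasks a cgroup could create before fork() or clone() starts failing.
The function name pids_headroom is hypothetical (it is not a kernel or
libcgroup tool), and the sketch looks only at the given cgroup's own pids.max,
ignoring any ancestor limits.

```shell
# pids_headroom DIR: print how many more tasks could be created in the
# cgroup at DIR, judged only by that cgroup's own pids.max.
# (Illustrative helper; the name is an assumption, not an existing tool.)
pids_headroom() {
    dir=$1
    max=$(cat "$dir/pids.max") || return 1
    cur=$(cat "$dir/pids.current") || return 1
    if [ "$max" = "max" ]; then
        # "max" means no limit is set in this cgroup.
        echo unlimited
    elif [ "$cur" -ge "$max" ]; then
        # At or over the limit (possible after lowering pids.max or
        # attaching processes): no new task can be created.
        echo 0
    else
        echo $((max - cur))
    fi
}
```

With the v1 mount point used in this document, it could be called as
``pids_headroom /sys/fs/cgroup/pids/parent``.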

To set a cgroup to have no limit, set pids.max to "max". This is the default for
all new cgroups (note that PID limits are hierarchical, so the most stringent
limit in the hierarchy is followed).
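
Because the most stringent limit in the hierarchy wins, the effective limit
for a cgroup is the smallest pids.max among itself and its ancestors. The
sketch below walks up the directory tree to find it. The function name
pids_effective_max and its second argument (the top of the hierarchy, where
the walk stops) are assumptions made for this illustration.

```shell
# pids_effective_max DIR TOP: print the smallest pids.max found in DIR and
# its ancestors up to and including TOP, or "max" if none sets a limit.
# (Illustrative helper; name and arguments are assumptions.)
pids_effective_max() {
    dir=${1%/}
    top=${2%/}
    best=max
    while :; do
        # A directory without pids.max (such as the root cgroup) is skipped.
        if [ -r "$dir/pids.max" ]; then
            val=$(cat "$dir/pids.max")
            if [ "$val" != "max" ]; then
                if [ "$best" = "max" ] || [ "$val" -lt "$best" ]; then
                    best=$val
                fi
            fi
        fi
        [ "$dir" = "$top" ] && break
        parent=$(dirname "$dir")
        # Stop if we cannot climb any further (TOP was not an ancestor).
        [ "$parent" = "$dir" ] && break
        dir=$parent
    done
    echo "$best"
}
```

For the hierarchy used in the example section, this could be called as
``pids_effective_max /sys/fs/cgroup/pids/parent/child /sys/fs/cgroup/pids``.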

pids.current tracks all child cgroup hierarchies, so parent/pids.current is a
superset of parent/child/pids.current.

The pids.events file contains event counters:

- max: Number of times fork() failed because the limit was hit.
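
A minimal sketch for reading that counter from a script is shown below. It
assumes that pids.events consists of "key value" lines (for example
``max 3``); the function name pids_events_max is hypothetical.

```shell
# pids_events_max DIR: print the value of the "max" counter from
# DIR/pids.events. Assumes one "key value" pair per line.
# (Illustrative helper; the name is an assumption.)
pids_events_max() {
    awk '$1 == "max" { print $2 }' "$1/pids.events"
}
```

Sampling the counter before and after a workload, and comparing the two
values, gives the number of forks that were refused in between.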

Example
-------

First, we mount the pids controller::

    # mkdir -p /sys/fs/cgroup/pids
    # mount -t cgroup -o pids none /sys/fs/cgroup/pids

Then we create a hierarchy, set limits and attach processes to it::

    # mkdir -p /sys/fs/cgroup/pids/parent/child
    # echo 2 > /sys/fs/cgroup/pids/parent/pids.max
    # echo $$ > /sys/fs/cgroup/pids/parent/cgroup.procs
    # cat /sys/fs/cgroup/pids/parent/pids.current
    2
    #

Attempts to exceed the set limit (2 in this case) will fail::

    # cat /sys/fs/cgroup/pids/parent/pids.current
    2
    # ( /bin/echo "Here's some processes for you." | cat )
    sh: fork: Resource temporarily unavailable
    #

Even if we migrate to a child cgroup (which doesn't have a set limit), we will
not be able to exceed the most stringent limit in the hierarchy (in this case,
parent's)::

    # echo $$ > /sys/fs/cgroup/pids/parent/child/cgroup.procs
    # cat /sys/fs/cgroup/pids/parent/pids.current
    2
    # cat /sys/fs/cgroup/pids/parent/child/pids.current
    2
    # cat /sys/fs/cgroup/pids/parent/child/pids.max
    max
    # ( /bin/echo "Here's some processes for you." | cat )
    sh: fork: Resource temporarily unavailable
    #

We can set a limit that is smaller than pids.current, which will stop any new
processes from being forked at all (note that the shell itself counts towards
pids.current)::

    # echo 1 > /sys/fs/cgroup/pids/parent/pids.max
    # /bin/echo "We can't even spawn a single process now."
    sh: fork: Resource temporarily unavailable
    # echo 0 > /sys/fs/cgroup/pids/parent/pids.max
    # /bin/echo "We can't even spawn a single process now."
    sh: fork: Resource temporarily unavailable
    #