| 1 | ========= |
| 2 | SafeSetID |
| 3 | ========= |
| 4 | SafeSetID is an LSM module that gates the setid family of syscalls to restrict |
| 5 | UID/GID transitions from a given UID/GID to only those approved by a |
| 6 | system-wide whitelist. These restrictions also prohibit the given UIDs/GIDs |
| 7 | from obtaining auxiliary privileges associated with CAP_SET{U/G}ID, such as |
| 8 | allowing a user to set up user namespace UID mappings. |
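
The allowlist semantics described above can be illustrated with a small userspace model. This is only a sketch, not kernel code: the class and method names are hypothetical, and it assumes that a UID with no policy entry is left unrestricted by this LSM while a UID with at least one entry may switch only to the UIDs listed for it.

```python
class SetidPolicy:
    """Toy model of a SafeSetID-style UID allowlist (illustrative only)."""

    def __init__(self):
        # Maps a restricted (source) UID to the set of UIDs it may switch to.
        self._allowed = {}

    def add(self, parent_uid, child_uid):
        """Record that parent_uid may transition to child_uid."""
        self._allowed.setdefault(parent_uid, set()).add(child_uid)

    def flush(self):
        """Drop every policy entry."""
        self._allowed.clear()

    def is_restricted(self, uid):
        """A UID is restricted once at least one entry names it as source."""
        return uid in self._allowed

    def transition_allowed(self, from_uid, to_uid):
        # Assumption: UIDs without any entry are not gated by this model.
        if not self.is_restricted(from_uid):
            return True
        return to_uid in self._allowed[from_uid]
```

With a single entry added via ``add(123, 456)``, the model permits 123 to switch to 456, refuses 123 switching to 0, and leaves any UID without an entry untouched.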
| 9 | |
| 10 | |
| 11 | Background |
| 12 | ========== |
| 13 | In the absence of file capabilities, processes spawned on a Linux system that need |
| 14 | to switch to a different user must be spawned with CAP_SETUID privileges. |
| 15 | CAP_SETUID is granted to programs running as root or those running as a non-root |
| 16 | user that have been explicitly given the CAP_SETUID runtime capability. It is |
| 17 | often preferable to use Linux runtime capabilities rather than file |
| 18 | capabilities, because using file capabilities to run a program with elevated |
| 19 | privileges opens up possible security holes: any user with access to the |
| 20 | file can exec() that program to gain the elevated privileges. |
| 21 | |
| 22 | While it is possible to implement a tree of processes by giving full |
| 23 | CAP_SET{U/G}ID capabilities, this is often at odds with the goals of running a |
| 24 | tree of processes under non-root user(s) in the first place. Specifically, |
| 25 | since CAP_SETUID allows changing to any user on the system, including the root |
| 26 | user, it is an overpowered capability for what is needed in this scenario, |
| 27 | especially since programs often only call setuid() to drop privileges to a |
| 28 | lesser-privileged user -- not elevate privileges. Unfortunately, there is no |
| 29 | generally feasible way in Linux to restrict the potential UIDs that a user can |
| 30 | switch to through setuid() beyond allowing a switch to any user on the system. |
| 31 | This SafeSetID LSM seeks to provide a solution for restricting setid |
| 32 | capabilities in such a way. |
| 33 | |
| 34 | The main use case for this LSM is to allow a non-root program to transition to |
| 35 | other untrusted uids without full-blown CAP_SETUID capabilities. The non-root |
| 36 | program would still need CAP_SETUID to do any kind of transition, but the |
| 37 | additional restrictions imposed by this LSM would mean it is a "safer" version |
| 38 | of CAP_SETUID since the non-root program cannot take advantage of CAP_SETUID to |
| 39 | do any unapproved actions (e.g. setuid to uid 0 or create/enter a new user |
| 40 | namespace). The higher level goal is to allow for uid-based sandboxing of system |
| 41 | services without having to give out CAP_SETUID all over the place just so that |
| 42 | non-root programs can drop to even-lesser-privileged uids. This is especially |
| 43 | relevant when one non-root daemon on the system should be allowed to spawn other |
| 44 | processes as different uids, but it is undesirable to give the daemon a |
| 45 | basically-root-equivalent CAP_SETUID. |
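
As a rough illustration of the use case above, the following sketch shows a daemon spawning a child under a different UID/GID. It assumes Python 3.9 or later, where ``subprocess`` accepts ``user`` and ``group``; the helper name ``spawn_as`` is hypothetical. Switching to a UID other than the caller's own still requires CAP_SETUID, and under a SafeSetID policy a switch to an unapproved UID would be expected to fail with a permission error.

```python
import os
import subprocess
import sys


def spawn_as(argv, uid, gid):
    """Run argv under the given UID/GID and return the completed process."""
    # subprocess switches the GID and then the UID in the child, between
    # fork() and exec(). Moving to a different UID needs CAP_SETUID.
    return subprocess.run(
        argv, user=uid, group=gid, capture_output=True, text=True
    )
```

A daemon holding a restricted CAP_SETUID could call ``spawn_as(["/usr/bin/some-service"], 456, 456)``, where the path and IDs are placeholders, to start a service as a lesser-privileged user.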
| 46 | |
| 47 | |
| 48 | Other Approaches Considered |
| 49 | =========================== |
| 50 | |
| 51 | Solve this problem in userspace |
| 52 | ------------------------------- |
| 53 | For candidate applications that would like to have restricted setid capabilities |
| 54 | as implemented in this LSM, an alternative option would be to simply take away |
| 55 | setid capabilities from the application completely and refactor the process |
| 56 | spawning semantics in the application (e.g. by using a privileged helper program |
| 57 | to do process spawning and UID/GID transitions). Unfortunately, there are a |
| 58 | number of semantics around process spawning that would be affected by this, such |
| 59 | as fork() calls where the program doesn't immediately call exec() after the |
| 60 | fork(), parent processes specifying custom environment variables or command line |
| 61 | args for spawned child processes, or inheritance of file handles across a |
| 62 | fork()/exec(). Because of this, a solution that uses a privileged helper in |
| 63 | userspace would likely be less appealing to incorporate into existing projects |
| 64 | that rely on certain process-spawning semantics in Linux. |
| 65 | |
| 66 | Use user namespaces |
| 67 | ------------------- |
| 68 | Another possible approach would be to run a given process tree in its own user |
| 69 | namespace and give programs in the tree setid capabilities. In this way, |
| 70 | programs in the tree could change to any desired UID/GID in the context of their |
| 71 | own user namespace, and only approved UIDs/GIDs could be mapped back to the |
| 72 | initial system user namespace, effectively preventing privilege escalation. |
| 73 | Unfortunately, it is not generally feasible to use user namespaces in isolation, |
| 74 | without pairing them with other namespace types, which is not always an option. |
| 75 | Linux checks for capabilities based off of the user namespace that "owns" some |
| 76 | entity. For example, Linux has the notion that network namespaces are owned by |
| 77 | the user namespace in which they were created. A consequence of this is that |
| 78 | capability checks for access to a given network namespace are done by checking |
| 79 | whether a task has the given capability in the context of the user namespace |
| 80 | that owns the network namespace -- not necessarily the user namespace under |
| 81 | which the given task runs. Therefore spawning a process in a new user namespace |
| 82 | effectively prevents it from accessing the network namespace owned by the |
| 83 | initial namespace. This is a deal-breaker for any application that expects to |
| 84 | retain the CAP_NET_ADMIN capability for the purpose of adjusting network |
| 85 | configurations. Using user namespaces in isolation also causes problems regarding |
| 86 | other system interactions, including use of pid namespaces and device creation. |
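
The ownership rule described above can be sketched with a deliberately simplified model. The names below are hypothetical and the model ignores details of the real kernel check (for example, the special treatment of the UID that created a namespace). It only captures the point that a capability held in one user namespace applies to resources owned by that namespace or its descendants, never to resources owned by an ancestor.

```python
class UserNamespace:
    """Minimal stand-in for a user namespace with a parent link."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def has_capability_over(task_ns, task_has_cap, owner_ns):
    """Return True if a task in task_ns may use a capability on a resource
    owned by owner_ns (simplified model, not the kernel algorithm)."""
    ns = owner_ns
    # Walk from the owning namespace up toward the initial namespace.
    while ns is not None:
        if ns is task_ns:
            # The task's namespace is the owner or one of its ancestors.
            return task_has_cap
        ns = ns.parent
    # The owner is not at or below the task's namespace.
    return False
```

In this model, a task holding CAP_NET_ADMIN inside a child user namespace gets ``False`` for a network namespace owned by the initial user namespace, which is the deal-breaker described above.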
| 87 | |
| 88 | Use an existing LSM |
| 89 | ------------------- |
| 90 | None of the other in-tree LSMs have the capability to gate setid transitions, or |
| 91 | even employ the security_task_fix_setuid hook at all. SELinux says of that hook: |
| 92 | "Since setuid only affects the current process, and since the SELinux controls |
| 93 | are not based on the Linux identity attributes, SELinux does not need to control |
| 94 | this operation." |
| 95 | |
| 96 | |
| 97 | Directions for use |
| 98 | ================== |
| 99 | This LSM hooks the setid syscalls so that, when a restriction policy applies to |
| 100 | the calling UID, only approved transitions are allowed. Policies are configured through |
| 101 | securityfs by writing to the safesetid/add_whitelist_policy and |
| 102 | safesetid/flush_whitelist_policies files at the location where securityfs is |
| 103 | mounted. The format for adding a policy is '<UID>:<UID>', using literal |
| 104 | numbers, such as '123:456' (the restricted UID, then a UID it may switch to). |
| 105 | To flush the policies, any write to the flush file is sufficient. Again, |
| 106 | configuring a policy for a UID will prevent that UID from obtaining auxiliary |
| 107 | setid privileges, such as allowing a user to set up user namespace UID mappings. |
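
A minimal sketch of configuring policies from userspace follows. It assumes securityfs is mounted at the conventional /sys/kernel/security location and that the first number in a policy line is the restricted UID and the second is a UID it may switch to. The helper names are hypothetical, and the writes need sufficient privilege on a kernel with this LSM enabled.

```python
import os

# Assumption: conventional securityfs mount point; adjust if mounted elsewhere.
SECURITYFS_DIR = "/sys/kernel/security/safesetid"


def format_policy(parent_uid, child_uid):
    """Return a policy line in the '<UID>:<UID>' form, e.g. '123:456'."""
    for uid in (parent_uid, child_uid):
        if not isinstance(uid, int) or isinstance(uid, bool) or uid < 0:
            raise ValueError(f"UID must be a non-negative integer: {uid!r}")
    return f"{parent_uid}:{child_uid}"


def add_policy(parent_uid, child_uid, base_dir=SECURITYFS_DIR):
    """Write one policy entry to the add_whitelist_policy file."""
    path = os.path.join(base_dir, "add_whitelist_policy")
    with open(path, "w") as f:
        f.write(format_policy(parent_uid, child_uid))


def flush_policies(base_dir=SECURITYFS_DIR):
    """Flush all policies; any write to the file is sufficient."""
    path = os.path.join(base_dir, "flush_whitelist_policies")
    with open(path, "w") as f:
        f.write("1")
```

For example, ``add_policy(123, 456)`` writes ``123:456`` to the policy file, and ``flush_policies()`` clears every entry. The ``base_dir`` parameter exists only so the helpers can be pointed at a different mount point or exercised against a scratch directory.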