In user namespaces where an unprivileged user is mapped as root and root
is unmapped, setuid bits have no effect. However setuid root
executables like mount are still usable *in the namespace* as the user
already has the required privileges. This commit detects the situation
where the wrapper gained no privileges that the parent process did not
already have and in this case does less sanity checking. In short there
is no need to be picky since the parent already can execute the foo.real
executable themselves.
Details:
man 7 user_namespaces:
Set-user-ID and set-group-ID programs
When a process inside a user namespace executes a set-user-ID
(set-group-ID) program, the process's effective user (group) ID
inside the namespace is changed to whatever value is mapped for
the user (group) ID of the file. However, if either the user or
the group ID of the file has no mapping inside the namespace, the
set-user-ID (set-group-ID) bit is silently ignored: the new
program is executed, but the process's effective user (group) ID
is left unchanged. (This mirrors the semantics of executing a
set-user-ID or set-group-ID program that resides on a filesystem
that was mounted with the MS_NOSUID flag, as described in
mount(2).)
The effect of the setuid bit is that the real user id is preserved and
the effective and set user ids are changed to the owner of the wrapper.
We detect that no privilege was gained by checking that euid == suid
== ruid. In this case we stop checking that euid == owner of the
wrapper file.
As a reminder here are the values of euid, ruid, suid, stat.st_uid and
stat.st_mode & S_ISUID in various cases when running a setuid 42 executable as user 1000:
Normal case:
ruid=1000 euid=42 suid=42
setuid=2048, st_uid=42
nosuid mount:
ruid=1000 euid=1000 suid=1000
setuid=2048, st_uid=42
inside unshare -rm:
ruid=0 euid=0 suid=0
setuid=2048, st_uid=65534
inside unshare -rm, on a suid mount:
ruid=0 euid=0 suid=0
setuid=2048, st_uid=65534
C's assert macro only works when NDEBUG is undefined. Previously
NDEBUG was undefined incorrectly which meant that the assert
macros in wrapper.c did not work.
With libcap 2.41 the output of cap_to_text changed, also the original
author of code hoped that this would never happen.
To counter this now the security-wrapper only relies on the syscall
ABI, which is more stable and robust than string parsing. If new
breakages occur this will be more obvious because version numbers will
be incremented.
Furthermore all errors no make execution explicitly fail instead of
hiding errors behind debug environment variables and the code style was
more consistent with no goto fail; goto fail; vulnerabilities (https://gotofail.com/)