This ensures the tmpfiles resetup service has permissions
to create suid/sgid files, even if `DefaultRestrictSUIDSGID`
is set in system.conf. This is required, as tmpfiles
are used to e.g. set file permissions on the journal
directory.`DefaultRestrictSUIDSGID` is a new feature
coming in systemd 258 [1].
[1] https://github.com/systemd/systemd/pull/38126
Before this change, systemd-oomd startup was flaky at least with
either systemd-sysusers or userborn enabled. It would restart several
times until users were provisioned, so that it finally succeeded.
An alternative would be to use a DynamicUser which was my first
approach, before I discovered that upstream added the after statement
in Dec 2024[1]. DynamicUsers could have further
implications (sandboxing, etc), so we follow upstream here.
It's not clear to me we why Upstreams "After=systemd-sysusers.service"
doesn't show up on nixos-unstable systems (systemd v257.6).
Userborn is covered, as its unit is aliased to systemd-sysusers.service.
The following test succeeded after this change on x86_64-linux:
nix-build -A nixosTests.systemd-oomd
[1]: 36dd429680
When running with a xfs root partition and using systemd for stage 1
initrd, I noticed in journalctl that fsck.xfs always failed to execute.
The issue is that it is trying to use the below sh interpreter:
`#!/nix/store/xy4jjgw87sbgwylm5kn047d9gkbhsr9x-bash-5.2p37/bin/sh -f`
but the file does not exist in the initrd image.
/nix/store/xy4jjgw87sbgwylm5kn047d9gkbhsr9x-bash-5.2p37/bin/**bash**
exists since it gets pulled in by some package, but the rest of the
directory is not being pulled in.
boot/systemd/initrd.nix mentions that xfs_progs references the sh
interpreter and seems to explicitly try to address this by adding
${pkgs.bash}/bin to storePaths, but that's the wrong bash package.
Update the `storePaths` value to pull in `pkgs.bashNonInteractive`
rather than `pkgs.bash`.
Fixes#361592.
I was able to test this change by doing the following:
1. Create a file named “test-systemd-run0.nix” that contains this Nix
expression:
let
nixpkgs = /path/to/nixpkgs;
pkgs = import nixpkgs { };
in
pkgs.testers.runNixOSTest {
name = "test-systemd-run0";
nodes.machine = {
security.polkit.enable = true;
};
testScript = ''
start_all()
machine.succeed("run0 env")
'';
}
2. Replace “/path/to/nixpkgs” with the actual path to an actual copy of
Nixpkgs.
3. Run the integration test by running this command:
nix-build <path to test-systemd-run0.nix>
`user-.slice` does not seem to exist, and the config we generate for it is
rejected by systemd (see `systemctl status user-.slice`).
I suppose that what was really intended here, was to configure
`user.slice`, which is the one that is documented in `man systemd.special`.
Reported-by: Ian Sollars <Ian.Sollars@brussels.msf.org>
We should not remount all filesystem types since not all filesystems
are safe to remount and some (nfs) return errors if remounted with
certain mount options.
The enable attribute of `boot.initrd.systemd.contents.<name>` was
ignored for building initrd storePaths. This resulted in building
derivations for the initrd even if it was disabled.
Found while testing a to build a nixos system with a kernel without
lodable modules[0]
[0]: https://github.com/NixOS/nixpkgs/pull/411792
We currently bypass systemd's switch-root logic by premounting
/sysroot/run. Make sure to propagate its sub-mounts with the recursive
flag, in accordance with the default switch-root logic.
This is required for creds at /run/credentials to survive the transition
from initrd -> host.
I was confused why I could not get an emergency access console despite setting systemd.emergencyMode=true.
Turns out there is another similar option `boot.initrd.systemd.emergencyAccess` that I should have used.
This is confusing and this change should make it more clear vie the docs of both these options.
Containers did not have *systemd-journald-audit.socket* in *additionalUpstreamSystemUnits*, which meant that the unit was not provided.
However the *wantedBy* was added without any additional check, therefore creating an empty unit with just the *WantedBy* on *boot.isContainer* machines.
This caused `systemd-analyze verify` to fail:
```text
systemd-journald-audit.socket: Unit has no Listen setting (ListenStream=, ListenDatagram=, ListenFIFO=, ...). Refusing.
systemd-journald-audit.socket: Cannot add dependency job, ignoring: Unit systemd-journald-audit.socket has a bad unit file setting.
systemd-journald-audit.socket: Cannot add dependency job, ignoring: Unit systemd-journald-audit.socket has a bad unit file setting.
```
The upstream unit already contains the following, which should make it safe to include regardless:
```ini
[Unit]
ConditionSecurity=audit
ConditionCapability=CAP_AUDIT_READ
```
For reference, this popped up in the context of #[360426](https://redirect.github.com/NixOS/nixpkgs/issues/360426) as well as #[407696](https://redirect.github.com/NixOS/nixpkgs/pull/407696).
Co-authored-by: Bruce Toll <4109762+tollb@users.noreply.github.com>
Signed-off-by: benaryorg <binary@benary.org>
This is part of security-in-depth.
No suid binaries or devices should ever be in the nix store.
If they are, something is seriously wrong.
Disallowing this from a file system level should be non-breaking.
Thank you for making this change.
Unfortunately, and I take blame for this, this change to the module
system was not reviewed and approved by the module system maintainers.
I'm supportive of this change, but extending it on the staging-next
branch is not the right place.
This commit is also here to make sure that we don't run into conflicts
or other git trouble with the staging workflow.
Review:
It looks alright, but it didn't have tests yet, and it should be
considered in a broader context where the existence of this type
creates an incentive to be used in cases where the `<attr> = false;`
case is undesirable. I'd like to complement this with an type that
has `<attr> = {};` only.
My apologies for the lack of a timely and clear review. Often we
recommend to define the type outside the module system until
approved. This commit puts us back in that state.
attrNamesToTrue was introduced in 98652f9a90
After RFC-0125 implementation, Determinate Systems was pinged multiple
times to transfer the repository ownership of the tooling to a
vendor-neutral repository.
Unfortunately, this never manifested. Additionally, the leadership of
the NixOS project was too dysfunctional to deal with this sort of
problem. It might even still be the case up to this day.
Nonetheless, nixpkgs is about enabling end users to enact their own
policies. It would be better to live in a world where there is one
obvious choice of bootspec tooling, in the meantime, we can live in a
world where people can choose their bootspec tooling.
The Lix forge possess one fork of the Bootspec tooling:
https://git.lix.systems/lix-community/bootspec which will live its own
life from now on.
Change-Id: I00c4dd64e00b4c24f6641472902e7df60ed13b55
Signed-off-by: Raito Bezarius <masterancpp@gmail.com>
systemd-repart can be configured to not automatically issue BLKDISCARD commands
to the underlying hardware.
This PR exposes this option in the repart module.
The systemd.tmpfiles.settings.<name>.<path>.<type>.argument option may
contain arbitrary strings. This could allow intentional or unintentional
introduction of new configuration lines.
The argument field cannot be quoted, C‐style \xNN escape sequences are
however permitted. By escaping whitespace and newline characters, the
issue can be mitigated.
Format all Nix files using the officially approved formatter,
making the CI check introduced in the previous commit succeed:
nix-build ci -A fmt.check
This is the next step of the of the [implementation](https://github.com/NixOS/nixfmt/issues/153)
of the accepted [RFC 166](https://github.com/NixOS/rfcs/pull/166).
This commit will lead to merge conflicts for a number of PRs,
up to an estimated ~1100 (~33%) among the PRs with activity in the past 2
months, but that should be lower than what it would be without the previous
[partial treewide format](https://github.com/NixOS/nixpkgs/pull/322537).
Merge conflicts caused by this commit can now automatically be resolved while rebasing using the
[auto-rebase script](8616af08d9/maintainers/scripts/auto-rebase).
If you run into any problems regarding any of this, please reach out to the
[formatting team](https://nixos.org/community/teams/formatting/) by
pinging @NixOS/nix-formatting.
Previously, all generations for the primary system profile
read their data from the currently active one rather than
their own path, and specialisations in general all used
their parent bootspec rather than their own. This fixes both issues.
This commit still uses the parent path's build date for
specialisations, but this is more minor issue and the times
shouldn't be meaningfully different in most cases anyways.
Some upstream systemd units are conditionally installed into the systemd
output, so we must make sure the feature that enables their installation
is enabled on our side prior to trying to use them.
tracefs is a special-purpose filesystem in Linux used for tracing filesystem and kernel operations.
This was added to the kernel back in 2015 to replace debugfs. For security reasons, some system do not mount debugfs at all. Tracefs reduces the attack surface by allowing to trace without mounting debugfs. Additionally it provides features not supported by debugfs (such as calls for mkdir and rmdir
Debian and Arch Linux both enable this by default.
RHEL 8 and later, they enable tracefs by default.
Signed-off-by: John Titor <50095635+JohnRTitor@users.noreply.github.com>
Closes#381822
Apparently, I swapped `path` and `tmpfiles-type` in
2be50b1efe. Sorry about that 🫠
Also giving
`systemd.tmpfiles.settings.<config-name>.<path>.<tmpfiles-type>.type` a
better default in the manual than `‹name›`, i.e. `‹tmpfiles-type›` so
that it corresponds to the placeholders in the attribute path.
We default this option to null ; which is different
from upstream which defaults this to true.
Defaulting this to true leads to log-spam in /dev/kmesg
and thus in our opinion is a bad default https://github.com/systemd/systemd/issues/15324
We need to take the "top" mount instead of any mount, which is the last
line printed by findmnt. Additionally, make the regex more strict, so we
don't select mount options ending in ro (like `errors=remount-ro` from
ext4, or overlay paths ending in 'ro') and accidentally leave the Nix
store RW after boot.
Prior to this change a service failure would occur when this tmpfiles
service did not finish fast enough and receive a SIGTERM from systemd.
Additionally, `initrd-nixos-activation` is already ordered with
`After=initrd-switch-root.target`.
By default, systemd-repart refuses to act on empty disk devices, i.e.
those without any existing partition table for safety reasons.
This behaviour can be customized via the `--empty` flag, which we now
expose via the module system. This makes to partition empty disks
on first boot.
These daemons should not be stopped, as they're foundational to a
proper functioning of the system. When switching configurations, they
only need a restart instead of that stop/start cycle.
Helps the following situation:
- SSH in initrd is enabled
- NixOS is waiting for a password to be typed at the console (or
provided via cryptsetup-askpass)
- The user logs in via SSH, but instead of running cryptsetup-askpass,
they run "cryptsetup open" directly (because they don't know that
they need to use NixOS's cryptsetup-askpass script, or because they
want to use a non-trivial unlocking method that is not natively
supported by this module)
Currently, in the above situation, NixOS will keep waiting for a
password to be entered even though the device is already unlocked. If
a password is entered, it will print a confusing "already exists"
error and keep asking for the same password.
We can improve on this by simply checking if the device is already
unlocked in our read loop. In this case, we don't need to do anything
other than return from the function and continue booting.
Removing the splash param only causes plymouth to display console
output by default; it still runs. Systemd stage 1 respects this flag
due to unit conditions preventing plymouth from even running. So this
brings parity to scripted stage 1.