The old way of writing the file omited qoutes within strings which are needed by some configurations like federations.
The quotes got lost when `echo`ing the content via `echo '${builtins.toJSON x}'`.
The pkgs.formats.json does handle that race condition properly, so this commit switches the writing to that helper.
Fixes race conditions like this:
> systemd[1]: Started prometheus-kea-exporter.service.
> kea-exporter[927]: Listening on http://0.0.0.0:9547
> kea-exporter[927]: Socket at /run/kea/dhcp4.sock does not exist. Is Kea running?
> systemd[1]: prometheus-kea-exporter.service: Main process exited, code=exited, status=1/FAILURE
When no devices are given the exporter tries to autodiscover available
disks. The previous DevicePolicy was however preventing the exporter
from accessing any device at all, since only explicitly mentioned ones
were allowed.
This commit adds an allow rule for several device classes that I could
find on my machines, that gets set when no devices are explicitly
configured.
There is an existing problem with nvme devices, that expose a character
device at `/dev/nvme0`, and a (namespaced) block device at
`/dev/nvme0n1`. The character device does not come with permissions that
we could give to the exporter without further impacting the hardening.
crw------- 1 root root 247, 0 27. Jan 03:10 /dev/nvme0
brw-rw---- 1 root disk 259, 0 27. Jan 03:10 /dev/nvme0n1
The autodiscovery only finds the character device, which the exporter
unfortunately does not have access to.
However a simple udev rule can be used to resolve this:
services.udev.extraRules = ''
SUBSYSTEM=="nvme", KERNEL=="nvme[0-9]*", GROUP="disk"
'';
Unfortunately I'm not fully aware of the security implications this
change carries and we should question upstream (systemd) why they did
not include such a rule.
The disk group has no members on any of my machines.
❯ getent group disk
disk❌6:
Otherwise, `postfix_up{path="/var/lib/postfix/queue/public/showq"}` will
always be `0` indicating an postfix outage because this is a unix domain
socket that cannot be connected to:
2021/12/03 14:50:46 Failed to scrape showq socket: dial unix /var/lib/postfix/queue/public/showq: socket: address family not supported by protocol
The option `services.prometheus.environmentFile` has been removed since it was causing [issues](https://github.com/NixOS/nixpkgs/issues/126083) and Prometheus now has native support for secret files.
The new option `services.prometheus.enableReload` has been introduced
which, when enabled, causes the prometheus systemd service to reload
when its config file changes.
More specifically the following property holds: switching to a
configuration (`switch-to-configuration`) that changes the prometheus
configuration only finishes successully when prometheus has finished
loading the new configuration.
`enableReload` is `false` by default in which case the old semantics
of restarting the prometheus systemd service are in effect.
9fea6d4c85 broke rtl_433-exporter by
introducing several hardening options which do not play well with
rtl_433 requiring writing to USB. More precisely, rtl_433 requires
(a) AF_NETLINK to configure the radio; (b) access to the USB device,
but PrivateDevices=true hides them; (c) rw access to the USB device,
but DeviceAllow= block-lists everything.
This commit was tested on real hardware with a standard NixOS setup.
The timex collector (enabled by default) needs the
adjtimex syscall, which was disabled by
9fea6d4c85.
So allow it unless the timex collector is disabled.
The systemd collector needs AF_UNIX to talk to
/var/run/dbus/system_bus_socket, which was broken
with 9fea6d4c85.
This commit allows AF_UNIX when needed.
In 0.3.0 of the json-exporter[1] it was switched to a different jsonpath
library which made some changes - especially for spaces in keys -
necessary. Also I decided to remove the pretty-printed JSON as this
would interfere with the bash quoting too much. If one needs
pretty-printed output, they can still pipe the output to `jq`.
[1] https://github.com/prometheus-community/json_exporter/releases/tag/v0.3.0
If `openFirewall = true`, but no `firewallFilter` is set, the evaluation
fails with the following error:
The option `services.prometheus.exporters.node.firewallFilter` is defined both null and
not null, in `/home/ma27/Projects/nixpkgs/nixos/modules/services/monitoring/prometheus/exporters.nix'
and `/home/ma27/Projects/nixpkgs/nixos/modules/services/monitoring/prometheus/exporters.nix'.
Originally introduced by me in #115185. The problem is that
`mkOptionDefault` has - as its name suggests - the same priority as the
default-value of the option `firewallFilter` and thus it conflicts if
this declaration and the actual default value are set which is the case
if `firewallFilter` isn't specified somewhere else.
Use new command-line flags of release 0.3.0 and always answer with the
expected XML in the VM test instead of using a test-specific fixed path.
Co-authored-by: ajs124 <git@ajs124.de>
* Content of `programlisting` shouldn't be indented, otherwise it's
weirdly indented in the output.
* Use `<xref linkend=.../>` in the release notes: then users can
directly go to the option documentation when reading release notes.
* Don't use docbook tags in `mkRemovedOptionModule`: it's only used
during evaluation where docbook isn't rendered.
For the same reason Alertmanager supports environmentFile to pass
secrets along, it is useful to support the same for Prometheus'
configuration to store bearer tokens outside the Nix store.
The format of the listenAddress option was recently changed to separate
the address and the port parts. There is now a legacy check that
tells users to update to the new format. This legacy check produces
a false positive on IPv6 addresses, since they contain colons.
Fix the regex to make it not match colons within IPv6 addresses.
systemd.exec(5) on DynamicUser:
> If a statically allocated user or group of the configured name
> already exists, it is used and no dynamic user/group is allocated.
Using DynamicUser while still setting a group name can be
useful for granting access to resources that can otherwise only be
accessed with entirely static IDs.
The postfix exporter needs to access postfix's `queue/public/` directory
to read the `showq` socket inside. Instead of making the public
directory world accessible, this sets the postfix exporter's group to
`postdrop` by default, when the postfix service is enabled.
Accessing the configured port of a service is quite useful, for example
when configuring virtual hosts for a service. The prometheus module did
not expose the configured por separately, making it unnecessarily
cumbersome to consume.
This is a breaking change only if you were setting `listenAddress` to
a non-standard value. If you were, you should now set `listenAddress`
and `port` separately.
If the host network stack is slow to start, the alertmanager fails to
start with this error message:
caller=main.go:256 msg="unable to initialize gossip mesh" err="create memberlist: Failed to get final advertise address: No private IP address found, and explicit IP not provided"
This bug can be reproduced by shutting down the network stack and
restarting the alertmanager.
Note I don't know why I didn't hit this issue with previous
alertmanager releases.