My k3s cluster Poseidon has been up and running for a few years now. I have deployed quite a few updates to the host, k3s, and all of my containers at this point. Thinking about my Prometheus install and the custom values I needed to apply in order to allow the metrics collection server to run properly, I had grown slightly annoyed. Every time I wanted to update the Helm chart, I needed to provide the YAML values. As I add more services, I don't wish to run into this problem again.
I was hoping that one of the updates to either my OS or cluster would enable me to no longer provide my custom override. I got curious and decided to dig in once more and see if I could find a better solution than having to customize each deployment that wanted that type of mount point. I wound up learning more about the mount command, OpenRC and shared drives along the way. In the end I was able to get a working solution with just a little effort!
Off the Shelf Errors
When I initially deployed Prometheus via Helm I fell into an easy trap: which chart do I choose? My lack of familiarity at the time led me to use Lens and its built-in chart exploring tool. Not being experienced with Helm repositories at the time, I didn't realize I was only viewing options from the Bitnami repositories.
Let this be a lesson to folks: always confirm your providers. You don't want to introduce a supply chain attack by installing a risky chart. Bitnami is fortunately a known entity (which is why I didn't initially catch the mistake). However, the one that I would recommend and currently use is the chart provided by Prometheus Community.
Post installation, I immediately hit an issue with the prometheus-node-exporter DaemonSet that prevented the container from launching.
Error: failed to generate container "c46b46078475a3315593621d5efd4a5bf8c418f79a96526d98d4388c36a35573" spec: failed to generate speproc: path "/" is mounted on "/" but it is not a shared or slave mount
Shared Mounts?
As long as I have been using Linux, I had never come across shared mounts. This was a good opportunity to expand my understanding while solving my problem.
The main thing to know about a shared mount is that it lets the same file system be mounted again at a new path, with mount and unmount events propagating in both directions between the copies.
This is used by Prometheus in order to gather metrics about the node itself. By mounting the root path into the container, it can read data about the node rather than just the containers.
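To check whether a node would trip the error above, the propagation state of a mount can be read from /proc/self/mountinfo, where a shared mount carries a `shared:N` tag and a slave mount carries a `master:N` tag in the optional fields. The following is a minimal Python sketch of that check, not anything taken from the chart or from k3s; the function names are made up for illustration.

```python
def propagation_tags(mountinfo_text, mount_point="/"):
    """Return the set of propagation tags for mount_point, or None if absent.

    An empty set means the mount is private. If the mount point appears
    more than once (overmounts), the last entry wins.
    """
    tags = None
    for line in mountinfo_text.splitlines():
        fields = line.split()
        try:
            # Optional fields sit between field 7 and the "-" separator.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # malformed or empty line
        if fields[4] != mount_point:
            continue
        # "shared:1" -> "shared", "master:2" -> "master", "unbindable" stays
        tags = {opt.split(":", 1)[0] for opt in fields[6:sep]}
    return tags


def is_shared_or_slave(tags):
    """True if the tags describe a shared or slave mount."""
    return tags is not None and bool(tags & {"shared", "master"})


def root_is_shared_or_slave(path="/proc/self/mountinfo"):
    """Run the check against the live system (Linux only)."""
    with open(path) as handle:
        return is_shared_or_slave(propagation_tags(handle.read()))
```

Calling `root_is_shared_or_slave()` on a node returns `False` when `/` is mounted private, which is the situation the error message describes.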
Solutions
Option one is to adjust the Helm chart on install to enable the metrics container to deploy. One way to do this is via the CLI. This is quick and easy to test, and it let me confirm whether I was experiencing the same issue as other folks online. However, it increases deployment verbosity and requires staying on top of the deployment if it changes. The next way to attack this issue is using YAML values to provide overrides. This is better from an Ops perspective and more maintainable long term. However, both of these approaches effectively just disable the root mount request.
According to the maintainer this is not the recommended solution.
The proper solution would be to configure the mount to be shared.
The benefit would be that I only have to configure it on the host one time (per node). Additionally, it doesn't harm compatibility with YAML overrides.
Swimming Upstream
Having identified a path I wanted to follow, I needed to research how to properly configure the settings on my system. This is when I learned that OpenRC did not support shared mounts.
To be clear, the mount command did; it just was not supported as a parameter in fstab.
This is needed for the initial mounting of the root partition to be correct.
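As a rough way to audit a node, the sketch below parses fstab text and reports which propagation options an entry requests. It assumes the option names documented in mount(8) (`shared`, `rshared`, `slave`, and so on); whether the init system honors them depends on the OpenRC version. The function name and the sample entry are hypothetical.

```python
# Propagation option names as documented in mount(8); treating these as
# the ones to look for in fstab is an assumption of this sketch.
PROPAGATION_OPTIONS = {
    "shared", "rshared", "slave", "rslave",
    "private", "rprivate", "unbindable", "runbindable",
}


def fstab_propagation(fstab_text, mount_point="/"):
    """Return the propagation options listed for mount_point.

    Returns an empty list if the entry has none, and None if fstab has
    no entry for mount_point at all.
    """
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        fields = line.split()
        # fstab columns: device, mount point, type, options, dump, pass
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        return [opt for opt in fields[3].split(",") if opt in PROPAGATION_OPTIONS]
    return None


# Hypothetical fstab content for illustration only.
sample = """\
# <device>  <mount>  <type>  <options>            <dump> <pass>
UUID=abcd   /        ext4    rw,relatime,shared   0      1
/dev/sda1   /boot    vfat    defaults             0      2
"""
print(fstab_propagation(sample))
```

For a real node, the text would come from reading `/etc/fstab` instead of the sample string.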
Digging in, I was lucky enough that someone else had already done the heavy lifting and even opened a PR with OpenRC in order to add the option. The minor problem was that the PR had been created back in mid-2022. Here I was two years later, hoping to stand on their shoulders and push this thing over the line.
I commented on the thread and tagged the maintainers and the OP. I also tested and confirmed the fix on the most recent edition of Alpine. This re-engagement caused both the OP and others to comment again, but it didn't last. The PR went quiet again. I tried tagging some devs based on previously approved PRs. Another month went by and I was still the most recent comment. I decided to search again for other, possibly more active developers in the repo.
This was a successful approach, as the next dev I tagged ended up engaging with the OP. After some healthy collaboration on the PR and an improved understanding by the maintainers, the PR was merged! This means OpenRC releases >= 0.55 now support shared mounts! All it really took from me was pushing on the PR and not letting the hard work go to waste.
Impacts
So what? Now I have metrics on my cluster! I can more easily monitor my resource use in real time. If I decide to move away from Prometheus, or another project also needs to take advantage of shared mounts, it should work. Another thing I came across is how this affects folks running Podman and how it might help in certain situations. However, I don't use Podman much currently, so I can't confirm if that's the case.
Additionally, the work being merged upstream means everyone gets to benefit. Alpine being a more rock solid distribution for cloud workloads is ultimately a good thing, and not just because I use it.
Alpine as a distribution is much smaller than Debian, including slim versions. The benefit is a lot less data over the wire when pulling images, fewer charges for network traffic, faster updates, and better resource utilization.
And while I do like the idea of a clean implementation of libc via musl, it still has a lot of issues with certain tooling (looking at you, Python packaging) and runtimes. The only way this improves is if more folks use an OS that uses musl libc, like Alpine. More bugs will be fixed, documentation written, and tools improved.