Hi!
I have a mixed set of containers (a few, not too many) and bare-metal services (quite a few), and I would like to monitor them.
I am using good old "monit", which monitors my network interfaces, filesystem status and traditional services (via PID files). It's not pretty, but it gets the job done. I cannot seem to find a way to have it also monitor my containers. Note that I use podman and have a strict one-service, one-user policy (all containers are rootless).
I also run "netdata", but I find it overwhelming: too much data, too many graphs, just too much for my needs.
I need something that:
lets me monitor service status
lets me monitor container status
lets me restart services or containers (not mandatory, but preferred)
has a nice web GUI
has a web GUI that is also mobile friendly (not mandatory, but appreciated)
can show some historical data (not mandatory, but interesting)
can monitor CPU usage (mandatory)
can monitor filesystem usage (mandatory)
I don't care about authentication features, since it will already be behind a reverse proxy with HTTPS and proxy authentication.
I am not looking for a fancy and complex dashboard, but for something I can host on a secondary page that I open if/when I want to check stuff. Also, it would be useful if the tool can be scripted or accessed via an API, so I could write some extractors to print a summary on a page in my own dashboard.
I think Prometheus is a good industry standard. It can do everything you listed except for restarting stuff. It's got a decent built-in monitoring capability and you can extend it trivially to monitor anything. For example I wrote a 5-liner to monitor ZFS health and another for LVM. I even monitor my routers with it. OpenWrt has an installable node exporter for Prometheus.
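To give an idea of what such a small custom check can look like, here is a minimal sketch (not the 5-liner mentioned above, which wasn't posted). It assumes the node exporter's textfile collector is enabled and that `zpool list -H -o name,health` is available; the metric name `zpool_healthy` and the function names are made up for illustration.

```python
import subprocess
from pathlib import Path


def render_zpool_metrics(zpool_output: str) -> str:
    """Convert tab-separated `zpool list -H -o name,health` output
    into Prometheus text exposition format."""
    lines = [
        # zpool_healthy is a hypothetical metric name chosen for this sketch.
        "# HELP zpool_healthy 1 if the pool reports ONLINE, 0 otherwise.",
        "# TYPE zpool_healthy gauge",
    ]
    for row in zpool_output.splitlines():
        if not row.strip():
            continue
        name, health = row.split("\t")[:2]
        value = 1 if health.strip() == "ONLINE" else 0
        lines.append(f'zpool_healthy{{pool="{name}"}} {value}')
    return "\n".join(lines) + "\n"


def write_metrics(target: Path) -> None:
    """Run zpool and write the result where the textfile collector can read it."""
    output = subprocess.run(
        ["zpool", "list", "-H", "-o", "name,health"],
        check=True, capture_output=True, text=True,
    ).stdout
    # Write to a temporary name first, then rename, so the exporter
    # never reads a half-written file.
    tmp = target.with_suffix(".tmp")
    tmp.write_text(render_zpool_metrics(output))
    tmp.replace(target)
```

Calling `write_metrics(Path(".../zfs.prom"))` from cron or a systemd timer would be one way to run it, where the directory is whatever you passed to the node exporter's `--collector.textfile.directory` option.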
Service restarting is a remote execution capability and generally falls outside of the monitoring domain. You'd be better off implementing that with another process/service manager. If you're running systemd, that's one of its primary purposes. You can use it to start/stop/restart containers just like normal processes.
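As a rough sketch of the systemd route for rootless containers: the snippet below assumes each container already has a systemd user unit (the unit name is hypothetical) and that the script runs as the user who owns that container. The helper names are made up for this example.

```python
import subprocess
from typing import List


def systemctl_cmd(action: str, unit: str, user_scope: bool = True) -> List[str]:
    """Build a systemctl command line; --user targets the calling user's units."""
    cmd = ["systemctl"]
    if user_scope:
        cmd.append("--user")
    cmd += [action, unit]
    return cmd


def ensure_running(unit: str, user_scope: bool = True) -> bool:
    """Restart the unit if it is not active. Returns True if a restart was issued."""
    # `systemctl is-active` exits with status 0 only when the unit is active.
    probe = subprocess.run(
        systemctl_cmd("is-active", unit, user_scope), capture_output=True
    )
    if probe.returncode == 0:
        return False
    subprocess.run(systemctl_cmd("restart", unit, user_scope), check=True)
    return True
```

For example, `ensure_running("my-container.service")` would restart a (hypothetical) user unit only when it is down. In practice a `Restart=` policy in the unit itself may make this unnecessary.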
Can you share a guide / tutorial on how to accomplish what OP wants (or just get started with Prometheus)? I was in the same boat as OP and settled for netdata, and eventually gave up on monitoring altogether because it was either overwhelming me with data, too cumbersome to set up or had features behind paid plans.
Anytime you're asking this, go for the project's Quick Start / Getting Started doc. In this case here. If you're on a Debian-based system, Prometheus is already packaged in the repository, so you don't have to download the latest release. You likely won't gain anything from that except the pain of having to set up the bare binary as a service with systemd. I followed that doc to set up mine but installed it from apt.
On second thought, if you're getting it from the repo and it already has a systemd unit defined, it might be more difficult to follow the Getting Started doc. You know what, follow it as-is. Once you have something running and monitoring ad hoc, it'll be easy to install from apt and put your config in it.
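Once a server is running, OP's idea of pulling a few numbers into their own summary page could go through the Prometheus HTTP API. A minimal sketch, assuming the server listens on the default `http://localhost:9090` and that one value per `instance` label is enough; the function names are made up here.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict


def query_url(base_url: str, promql: str) -> str:
    """Build the instant-query URL for a PromQL expression."""
    params = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{params}"


def parse_instant_vector(payload: dict) -> Dict[str, float]:
    """Map each sample's `instance` label to its value."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    values = {}
    for sample in payload["data"]["result"]:
        # Each sample carries its labels plus a [timestamp, "value"] pair.
        _timestamp, value = sample["value"]
        values[sample["metric"].get("instance", "")] = float(value)
    return values


def fetch(base_url: str, promql: str, timeout: float = 5.0) -> Dict[str, float]:
    """Run an instant query against a Prometheus server and parse the result."""
    with urllib.request.urlopen(query_url(base_url, promql), timeout=timeout) as resp:
        return parse_instant_vector(json.load(resp))
```

Something like `fetch("http://localhost:9090", "up")` should then return 1.0 for every scrape target that is reachable and 0.0 for the ones that are not.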
Give https://github.com/louislam/uptime-kuma a try. I'm planning to do the same for a similar use case. Sensu (sensu.io) is a more sophisticated option, but it requires more infrastructure and there is a bit of a learning curve with it.
While I really like Uptime Kuma, it seems a bit too restricted for OP's use case. For example, to monitor disk or CPU usage, you would need to write your own scripts. It would be doable, but not very nice.
At least as I understood the question, OP is probably looking for something like Icinga.
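For a sense of what the "own scripts" route mentioned above for Uptime Kuma would involve, here is a minimal disk-usage check using only the Python standard library. The 90% limit and the function names are arbitrary choices for this sketch, and how the result gets reported back to the monitoring tool is deliberately left out.

```python
import shutil
from typing import Tuple


def disk_used_percent(path: str = "/") -> float:
    """Return how full the filesystem containing `path` is, in percent."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


def check_disk(path: str = "/", limit_percent: float = 90.0) -> Tuple[bool, str]:
    """Return (ok, message); ok is False once usage reaches the limit."""
    used = disk_used_percent(path)
    ok = used < limit_percent
    return ok, f"{path} is {used:.1f}% full (limit {limit_percent:.0f}%)"
```

A cron job could call `check_disk("/")` and send the boolean and message wherever they are needed. That is the part that makes this approach less pleasant than a tool with built-in disk and CPU checks.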