Upstream patches and additional patching (#2)

* Ensures DSM GUI refreshes its updates

* Removed whale icon and changed verbosity

* Added addon for Prometheus+node_exporter

* Changed local image check to check on image ID rather than name

* Update podcheck.sh

changed docker->podman, typo

* - **v0.6.0**:
    - **Grafana & Prometheus Integration:**
      - Added a detailed Prometheus metrics exporter that now reports not only the number of containers with updates, no-updates, and errors, but also the total number of containers checked, the duration of the update check, and the epoch timestamp of the last check.
      - Enhanced documentation with instructions on integrating these metrics with Grafana for visual monitoring.
    - **Improved Error Handling & Code Refactoring:**
      - Introduced `set -euo pipefail` and local variable scoping within functions to improve reliability and prevent unexpected behaviour.
      - Standardised container name handling and refined the Quadlet detection logic.
    - **Self-Update Enhancements:**
      - Updated the self-update mechanism to support both Git-based and HTTP-based updates, with an automatic restart that preserves the original arguments.
    - **Miscellaneous Improvements:**
      - Enhanced dependency installer to support both package manager and static binary installations for `jq` and `regctl`.
      - General code refactoring across the project for better readability and maintainability.

* Update podcheck.sh

* increment version

* Update Quadlet detection logic 

Update Quadlet detection logic to support flexible service naming

- Modified the quadlet update block to first try an exact match for "$i.service".
- If no exact match is found, build a regex pattern from the container name (allowing underscores and hyphens interchangeably) and search user service units.
- When multiple candidate units are found, the script attempts to choose the one that exactly matches (ignoring case) or defaults to the first candidate.
- This update allows containers like "containera" to match service units named "container_a.service" and supports scenarios with multiple counterparts (e.g., matrix-a, matrix-b, matrix_db).
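
The pattern-building step described above could be sketched roughly as follows. This is an illustration only, not the code from podcheck.sh: `build_unit_pattern` and `pick_unit` are hypothetical names, and the sketch only treats `-` and `_` characters already present in the container name as interchangeable, so on its own it would not cover the `containera` to `container_a.service` case.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not taken from podcheck.sh.

# Turn a container name into an extended regex in which every hyphen or
# underscore matches either character. Other regex metacharacters in the
# name (such as '.') are not escaped in this sketch.
build_unit_pattern() {
  local name="$1"
  # Replace each '-' or '_' with the bracket expression [-_].
  local pattern="${name//[-_]/[-_]}"
  printf '^%s\\.service$' "$pattern"
}

# Print the first candidate unit that matches the pattern, ignoring case.
# Usage: pick_unit <container_name> <unit> [<unit> ...]
pick_unit() {
  local name="$1"; shift
  local pattern unit
  pattern="$(build_unit_pattern "$name")"
  for unit in "$@"; do
    if grep -Eiq -- "$pattern" <<<"$unit"; then
      printf '%s\n' "$unit"
      return 0
    fi
  done
  return 1
}
```

Called as `pick_unit matrix-db matrix-a.service matrix_db.service`, the sketch prints `matrix_db.service`. It returns a non-zero status when no candidate matches.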

* search name fix

* fixes to arg parsing

* Logic overhaul, verbose output and better syntax

* Added support for prometheus

---------

Co-authored-by: mag37 <robin.ivehult@gmail.com>
Joe Harrison 2025-02-25 14:12:01 +00:00 committed by GitHub
parent 053c587bf5
commit a7dcd975b2
12 changed files with 953 additions and 245 deletions

@ -0,0 +1,61 @@
## [Prometheus](https://github.com/prometheus/prometheus) and [node_exporter](https://github.com/prometheus/node_exporter)
Podcheck can export metrics to Prometheus through the textfile collector provided by node_exporter.
To enable this, pass the `-c` flag followed by the directory path that the node_exporter textfile collector is configured to read.
A simple cron job can then export the metrics at a regular interval, as shown in the sample below:
```
0 1 * * * /root/podcheck.sh -n -c /var/lib/node_exporter/textfile_collector
```
The following metrics are exported to Prometheus:
```
# HELP podcheck_images_analyzed Podman images that have been analyzed
# TYPE podcheck_images_analyzed gauge
podcheck_images_analyzed 22
# HELP podcheck_images_outdated Podman images that are outdated
# TYPE podcheck_images_outdated gauge
podcheck_images_outdated 7
# HELP podcheck_images_latest Podman images that are up to date
# TYPE podcheck_images_latest gauge
podcheck_images_latest 14
# HELP podcheck_images_error Podman images with analysis errors
# TYPE podcheck_images_error gauge
podcheck_images_error 1
# HELP podcheck_images_analyze_timestamp_seconds Last podcheck run time
# TYPE podcheck_images_analyze_timestamp_seconds gauge
podcheck_images_analyze_timestamp_seconds 1737924029
```
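
For reference, a file in this format could be produced with a few lines of shell. The sketch below is an illustration rather than podcheck's own exporter: the helper names `emit_gauge` and `write_podcheck_metrics` and their argument order are assumptions, and the HELP strings are paraphrased. It writes to a temporary file first and then renames it, so that node_exporter never reads a half-written file.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of podcheck.sh.

# Print one gauge in the Prometheus text exposition format.
emit_gauge() {
  local name="$1" help="$2" value="$3"
  printf '# HELP %s %s\n# TYPE %s gauge\n%s %s\n' \
    "$name" "$help" "$name" "$name" "$value"
}

# Usage: write_podcheck_metrics <collector_dir> <analyzed> <outdated> <latest> <error>
write_podcheck_metrics() {
  local dir="$1" analyzed="$2" outdated="$3" latest="$4" error="$5"
  local tmp
  # The temporary name does not end in .prom, so the textfile collector ignores it.
  tmp="$(mktemp "$dir/podcheck.prom.XXXXXX")" || return 1
  {
    emit_gauge podcheck_images_analyzed "Podman images that have been analyzed" "$analyzed"
    emit_gauge podcheck_images_outdated "Podman images that are outdated" "$outdated"
    emit_gauge podcheck_images_latest "Podman images that are up to date" "$latest"
    emit_gauge podcheck_images_error "Podman images with analysis errors" "$error"
    emit_gauge podcheck_images_analyze_timestamp_seconds "Last podcheck run time" "$(date +%s)"
  } > "$tmp"
  # mktemp creates the file with mode 0600; node_exporter may run as another user.
  chmod 644 "$tmp"
  # Renaming within the same directory replaces the old file in a single step.
  mv -f "$tmp" "$dir/podcheck.prom"
}
```

A call such as `write_podcheck_metrics /var/lib/node_exporter/textfile_collector 22 7 14 1` would leave a single `podcheck.prom` file in that directory with the five gauges listed above.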
Once the metrics are exported, they can be used to define alerts, as shown below:
```
- alert: podcheck_images_outdated
  expr: sum by(instance) (podcheck_images_outdated) > 0
  for: 15s
  labels:
    severity: warning
  annotations:
    summary: "{{ $labels.instance }} has {{ $value }} outdated podman images."
    description: "{{ $labels.instance }} has {{ $value }} outdated podman images."
- alert: podcheck_images_error
  expr: sum by(instance) (podcheck_images_error) > 0
  for: 15s
  labels:
    severity: warning
  annotations:
    summary: "{{ $labels.instance }} has {{ $value }} podman images having an error."
    description: "{{ $labels.instance }} has {{ $value }} podman images having an error."
- alert: podcheck_image_last_analyze
  expr: (time() - podcheck_images_analyze_timestamp_seconds) > (3600 * 24 * 3)
  for: 15s
  labels:
    severity: warning
  annotations:
    summary: "{{ $labels.instance }} has not updated the podcheck statistics for more than 3 days."
    description: "{{ $labels.instance }} has not updated the podcheck statistics for more than 3 days."
```
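
The last rule compares the age of the most recent run against 3600 * 24 * 3 = 259200 seconds. When debugging that alert, the same comparison can be made by hand against the metrics file on the host. The sketch below is only an illustration: `podcheck_is_stale` is a hypothetical helper, and the optional third argument exists purely so the current time can be overridden.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration; not part of podcheck.sh.
# Succeeds (exit status 0) when the last recorded run is older than max_age seconds.
# Usage: podcheck_is_stale <metrics_file> [max_age_seconds] [now_epoch]
podcheck_is_stale() {
  local file="$1"
  local max_age="${2:-$((3600 * 24 * 3))}"   # 259200 s, the threshold used by the alert rule
  local now="${3:-$(date +%s)}"
  local last
  # Read the sample value; the '# HELP' and '# TYPE' lines start with '#' and are skipped.
  last="$(awk '$1 == "podcheck_images_analyze_timestamp_seconds" { print $2 }' "$file")"
  # Treat a missing or non-integer timestamp as stale.
  [[ "$last" =~ ^[0-9]+$ ]] || return 0
  (( now - last > max_age ))
}
```

For example, `podcheck_is_stale /var/lib/node_exporter/textfile_collector/podcheck.prom && echo "stale"` prints `stale` only when the recorded timestamp is more than three days old or cannot be read. The file name `podcheck.prom` in that path is an assumption about where the metrics were written.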
There is a reference Grafana dashboard in [grafana/grafana_dashboard.json](./grafana/grafana_dashboard.json).
![](./grafana/grafana_dashboard.png)

@ -0,0 +1,382 @@
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"label": "prometheus",
"description": "",
"type": "datasource",
"pluginId": "prometheus",
"pluginName": "Prometheus"
}
],
"__elements": {},
"__requires": [
{
"type": "grafana",
"id": "grafana",
"name": "Grafana",
"version": "11.4.0"
},
{
"type": "datasource",
"id": "prometheus",
"name": "Prometheus",
"version": "1.0.0"
},
{
"type": "panel",
"id": "table",
"name": "Table",
"version": ""
}
],
"annotations": {
"list": [
{
"builtIn": 1,
"datasource": {
"type": "grafana",
"uid": "-- Grafana --"
},
"enable": true,
"hide": true,
"iconColor": "rgba(0, 211, 255, 1)",
"name": "Annotations & Alerts",
"type": "dashboard"
}
]
},
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 0,
"id": null,
"links": [],
"panels": [
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"fieldConfig": {
"defaults": {
"color": {
"mode": "thresholds"
},
"custom": {
"align": "auto",
"cellOptions": {
"type": "auto"
},
"inspect": false
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 80
}
]
}
},
"overrides": [
{
"matcher": {
"id": "byName",
"options": "last_analyze_timestamp"
},
"properties": [
{
"id": "unit",
"value": "dateTimeAsIso"
}
]
},
{
"matcher": {
"id": "byName",
"options": "last_analyze_since"
},
"properties": [
{
"id": "unit",
"value": "s"
},
{
"id": "custom.cellOptions",
"value": {
"mode": "gradient",
"type": "color-background"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 259200
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "images_outdated"
},
"properties": [
{
"id": "custom.cellOptions",
"value": {
"mode": "gradient",
"type": "color-background"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 1
}
]
}
}
]
},
{
"matcher": {
"id": "byName",
"options": "images_error"
},
"properties": [
{
"id": "custom.cellOptions",
"value": {
"mode": "gradient",
"type": "color-background"
}
},
{
"id": "thresholds",
"value": {
"mode": "absolute",
"steps": [
{
"color": "green",
"value": null
},
{
"color": "red",
"value": 1
}
]
}
}
]
}
]
},
"gridPos": {
"h": 14,
"w": 24,
"x": 0,
"y": 0
},
"id": 2,
"options": {
"cellHeight": "sm",
"footer": {
"countRows": false,
"fields": "",
"reducer": [
"sum"
],
"show": false
},
"frameIndex": 1,
"showHeader": true,
"sortBy": []
},
"pluginVersion": "11.4.0",
"targets": [
{
"disableTextWrap": false,
"editorMode": "code",
"exemplar": false,
"expr": "sum by(instance) (podcheck_images_analyzed)",
"format": "table",
"fullMetaSearch": false,
"hide": false,
"includeNullMetadata": true,
"instant": true,
"interval": "",
"legendFormat": "{{instance}}",
"range": false,
"refId": "podcheck_images_analyzed",
"useBackend": false,
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
}
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"disableTextWrap": false,
"editorMode": "code",
"exemplar": false,
"expr": "sum by(instance) (podcheck_images_outdated)",
"format": "table",
"fullMetaSearch": false,
"hide": false,
"includeNullMetadata": true,
"instant": true,
"legendFormat": "{{instance}}",
"range": false,
"refId": "podcheck_images_outdated",
"useBackend": false
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"disableTextWrap": false,
"editorMode": "code",
"exemplar": false,
"expr": "sum by(instance) (podcheck_images_latest)",
"format": "table",
"fullMetaSearch": false,
"hide": false,
"includeNullMetadata": true,
"instant": true,
"legendFormat": "{{instance}}",
"range": false,
"refId": "podcheck_images_latest",
"useBackend": false
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": false,
"expr": "sum by(instance) (podcheck_images_error)",
"format": "table",
"hide": false,
"instant": true,
"legendFormat": "{{instance}}",
"range": false,
"refId": "podcheck_images_error"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": false,
"expr": "podcheck_images_analyze_timestamp_seconds * 1000",
"format": "table",
"hide": false,
"instant": true,
"legendFormat": "{{instance}}",
"range": false,
"refId": "podcheck_images_analyze_timestamp_seconds"
},
{
"datasource": {
"type": "prometheus",
"uid": "${DS_PROMETHEUS}"
},
"editorMode": "code",
"exemplar": false,
"expr": "time() - podcheck_images_analyze_timestamp_seconds",
"format": "table",
"hide": false,
"instant": true,
"legendFormat": "{{instance}}",
"range": false,
"refId": "podcheck_images_last_analyze"
}
],
"title": "podcheck Status",
"transformations": [
{
"id": "merge",
"options": {}
},
{
"id": "organize",
"options": {
"excludeByName": {
"Time": true,
"__name__": true,
"job": true
},
"includeByName": {},
"indexByName": {
"Time": 0,
"Value #podcheck_images_analyze_timestamp_seconds": 2,
"Value #podcheck_images_analyzed": 4,
"Value #podcheck_images_error": 7,
"Value #podcheck_images_last_analyze": 3,
"Value #podcheck_images_latest": 5,
"Value #podcheck_images_outdated": 6,
"instance": 1,
"job": 8
},
"renameByName": {
"Value #A": "analyze_timestamp",
"Value #podcheck_images_analyze_timestamp_seconds": "last_analyze_timestamp",
"Value #podcheck_images_analyzed": "images_analyzed",
"Value #podcheck_images_error": "images_error",
"Value #podcheck_images_last_analyze": "last_analyze_since",
"Value #podcheck_images_latest": "images_latest",
"Value #podcheck_images_outdated": "images_outdated"
}
}
}
],
"type": "table"
}
],
"schemaVersion": 40,
"tags": [],
"templating": {
"list": []
},
"time": {
"from": "now-6h",
"to": "now"
},
"timepicker": {},
"timezone": "browser",
"title": "podcheck Status",
"uid": "feb4pv3kv1hxca",
"version": 17,
"weekStart": ""
}

@ -0,0 +1,62 @@
#!/usr/bin/env bash
# prometheus_collector.sh - Exports detailed update metrics for Prometheus node_exporter.
#
# This script generates metrics about the state of Podman container update checks.
# It is designed to be sourced by podcheck.sh and then invoked with:
#
# prometheus_exporter <num_no_updates> <num_updates> <num_errors> <total_containers> <check_duration_seconds>
#
# Metrics:
#   podcheck_no_updates:
#       Number of containers that are already on the latest image.
#   podcheck_updates:
#       Number of containers with updates available.
#   podcheck_errors:
#       Number of containers that encountered errors during the update check.
#   podcheck_total:
#       Total number of containers checked.
#   podcheck_check_duration:
#       Duration (in seconds) it took to perform the update check.
#   podcheck_last_check_timestamp:
#       Epoch timestamp when the update check was performed.
#
# The metrics are written to a file named podcheck.prom in the specified
# CollectorTextFileDirectory, or /tmp if not specified.
#
prometheus_exporter() {
  local no_updates="$1"
  local updates="$2"
  local errors="$3"
  local total="$4"
  local check_duration="$5"
  local collector_dir="${CollectorTextFileDirectory:-/tmp}"
  local last_check_timestamp
  last_check_timestamp=$(date +%s)
  {
    echo "# HELP podcheck_no_updates Number of containers already on latest image."
    echo "# TYPE podcheck_no_updates gauge"
    echo "podcheck_no_updates $no_updates"
    echo "# HELP podcheck_updates Number of containers with updates available."
    echo "# TYPE podcheck_updates gauge"
    echo "podcheck_updates $updates"
    echo "# HELP podcheck_errors Number of containers with errors during update check."
    echo "# TYPE podcheck_errors gauge"
    echo "podcheck_errors $errors"
    echo "# HELP podcheck_total Total number of containers checked."
    echo "# TYPE podcheck_total gauge"
    echo "podcheck_total $total"
    echo "# HELP podcheck_check_duration Duration in seconds for the update check."
    echo "# TYPE podcheck_check_duration gauge"
    echo "podcheck_check_duration $check_duration"
    echo "# HELP podcheck_last_check_timestamp Epoch timestamp of the last update check."
    echo "# TYPE podcheck_last_check_timestamp gauge"
    echo "podcheck_last_check_timestamp $last_check_timestamp"
  } > "$collector_dir/podcheck.prom"
}