mirror of
https://github.com/mag37/dockcheck.git
synced 2026-02-15 15:58:13 +01:00
Upstream patches and additional patching (#2)
* Ensures DSM GUI refreshes its updates
* Removed whale icon and changed verbosity
* Added addon for Prometheus+node_exporter
* Changed local image check to check on image ID rather than name
* Update podcheck.sh
changed docker->podman, typo
* - **v0.6.0**:
- **Grafana & Prometheus Integration:**
- Added a detailed Prometheus metrics exporter that now reports not only the number of containers with updates, no-updates, and errors, but also the total number of containers checked, the duration of the update check, and the epoch timestamp of the last check.
- Enhanced documentation with instructions on integrating these metrics with Grafana for visual monitoring.
- **Improved Error Handling & Code Refactoring:**
- Introduced `set -euo pipefail` and local variable scoping within functions to improve reliability and prevent unexpected behaviour.
- Standardised container name handling and refined the Quadlet detection logic.
- **Self-Update Enhancements:**
- Updated the self-update mechanism to support both Git-based and HTTP-based updates, with an automatic restart that preserves the original arguments.
- **Miscellaneous Improvements:**
- Enhanced dependency installer to support both package manager and static binary installations for `jq` and `regctl`.
- General code refactoring across the project for better readability and maintainability.
* Update podcheck.sh
* increment version
* Update Quadlet detection logic
Update Quadlet detection logic to support flexible service naming
- Modified the quadlet update block to first try an exact match for "$i.service".
- If no exact match is found, build a regex pattern from the container name (allowing underscores and hyphens interchangeably) and search user service units.
- When multiple candidate units are found, the script attempts to choose the one that exactly matches (ignoring case) or defaults to the first candidate.
- This update allows containers like "containera" to match service units named "container_a.service" and supports scenarios with multiple counterparts (e.g., matrix-a, matrix-b, matrix_db).
* search name fix
* fixes to arg parsing
* Logic overhaul, verbose output and better syntax
* Added support for prometheus
---------
Co-authored-by: mag37 <robin.ivehult@gmail.com>
parent 053c587bf5
commit a7dcd975b2
12 changed files with 953 additions and 245 deletions
61
addons/prometheus/README.md
Normal file
@ -0,0 +1,61 @@
## [Prometheus](https://github.com/prometheus/prometheus) and [node_exporter](https://github.com/prometheus/node_exporter)
Podcheck can export metrics to Prometheus via the textfile collector provided by node_exporter.
To do so, specify the `-c` flag followed by the directory path configured for node_exporter's textfile collector.
A simple cron job can export these metrics at a regular interval, as shown in the sample below:

```
0 1 * * * /root/podcheck.sh -n -c /var/lib/node_exporter/textfile_collector
```

The following metrics are exported to Prometheus:

```
# HELP podcheck_images_analyzed Podman images that have been analyzed
# TYPE podcheck_images_analyzed gauge
podcheck_images_analyzed 22
# HELP podcheck_images_outdated Podman images that are outdated
# TYPE podcheck_images_outdated gauge
podcheck_images_outdated 7
# HELP podcheck_images_latest Podman images that are up to date
# TYPE podcheck_images_latest gauge
podcheck_images_latest 14
# HELP podcheck_images_error Podman images with analysis errors
# TYPE podcheck_images_error gauge
podcheck_images_error 1
# HELP podcheck_images_analyze_timestamp_seconds Last podcheck run time
# TYPE podcheck_images_analyze_timestamp_seconds gauge
podcheck_images_analyze_timestamp_seconds 1737924029
```
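
To check what was written before Prometheus scrapes it, the file can be read back with `awk`. This is a rough sketch: `list_samples` and `metric_value` are hypothetical helper names, and the parsing assumes simple unlabelled samples like the ones listed above.

```shell
#!/usr/bin/env bash
# Print "name value" for every sample in a textfile-collector file,
# skipping the "# HELP" and "# TYPE" comment lines.
list_samples() {
  awk '!/^#/ && NF >= 2 { print $1, $2 }' "$1"
}

# Look up the value of a single metric by its exact name.
# Assumes samples without labels, as in the listing above.
metric_value() {
  local file="$1"
  local name="$2"
  awk -v name="$name" '$1 == name { print $2 }' "$file"
}
```

For example, `metric_value /var/lib/node_exporter/textfile_collector/podcheck.prom podcheck_images_outdated` would print the outdated-image count, assuming the file is named `podcheck.prom` in that directory.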

Once those metrics are exported, they can be used to define alerts as shown below:

```
- alert: podcheck_images_outdated
  expr: sum by(instance) (podcheck_images_outdated) > 0
  for: 15s
  labels:
    severity: warning
  annotations:
    summary: "{{ $labels.instance }} has {{ $value }} outdated podman images."
    description: "{{ $labels.instance }} has {{ $value }} outdated podman images."
- alert: podcheck_images_error
  expr: sum by(instance) (podcheck_images_error) > 0
  for: 15s
  labels:
    severity: warning
  annotations:
    summary: "{{ $labels.instance }} has {{ $value }} podman images having an error."
    description: "{{ $labels.instance }} has {{ $value }} podman images having an error."
- alert: podcheck_image_last_analyze
  expr: (time() - podcheck_images_analyze_timestamp_seconds) > (3600 * 24 * 3)
  for: 15s
  labels:
    severity: warning
  annotations:
    summary: "{{ $labels.instance }} has not updated the podcheck statistics for more than 3 days."
    description: "{{ $labels.instance }} has not updated the podcheck statistics for more than 3 days."
```
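
The staleness rule is plain arithmetic: it fires when the age of the last run exceeds `3600 * 24 * 3` = 259200 seconds. The same comparison can be sketched in shell for a quick local check; `is_stale` is a hypothetical helper, not something podcheck provides.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when the last check is older than max_age seconds,
# mirroring (time() - podcheck_images_analyze_timestamp_seconds) > (3600 * 24 * 3).
is_stale() {
  local last_check="$1"                     # epoch seconds of the last run
  local now="${2:-$(date +%s)}"             # defaults to the current time
  local max_age="${3:-$((3600 * 24 * 3))}"  # defaults to three days
  [ $(( now - last_check )) -gt "$max_age" ]
}
```

Calling `is_stale 1737924029 && echo "podcheck metrics are stale"` compares the sample timestamp from the metrics listing against the current time.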

There is a reference Grafana dashboard in [grafana/grafana_dashboard.json](./grafana/grafana_dashboard.json).
382
addons/prometheus/grafana/grafana_dashboard.json
Normal file
@ -0,0 +1,382 @@
{
  "__inputs": [
    {
      "name": "DS_PROMETHEUS",
      "label": "prometheus",
      "description": "",
      "type": "datasource",
      "pluginId": "prometheus",
      "pluginName": "Prometheus"
    }
  ],
  "__elements": {},
  "__requires": [
    {
      "type": "grafana",
      "id": "grafana",
      "name": "Grafana",
      "version": "11.4.0"
    },
    {
      "type": "datasource",
      "id": "prometheus",
      "name": "Prometheus",
      "version": "1.0.0"
    },
    {
      "type": "panel",
      "id": "table",
      "name": "Table",
      "version": ""
    }
  ],
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": {
          "type": "grafana",
          "uid": "-- Grafana --"
        },
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "type": "dashboard"
      }
    ]
  },
  "editable": true,
  "fiscalYearStartMonth": 0,
  "graphTooltip": 0,
  "id": null,
  "links": [],
  "panels": [
    {
      "datasource": {
        "type": "prometheus",
        "uid": "${DS_PROMETHEUS}"
      },
      "fieldConfig": {
        "defaults": {
          "color": {
            "mode": "thresholds"
          },
          "custom": {
            "align": "auto",
            "cellOptions": {
              "type": "auto"
            },
            "inspect": false
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              {
                "color": "green",
                "value": null
              },
              {
                "color": "red",
                "value": 80
              }
            ]
          }
        },
        "overrides": [
          {
            "matcher": {
              "id": "byName",
              "options": "last_analyze_timestamp"
            },
            "properties": [
              {
                "id": "unit",
                "value": "dateTimeAsIso"
              }
            ]
          },
          {
            "matcher": {
              "id": "byName",
              "options": "last_analyze_since"
            },
            "properties": [
              {
                "id": "unit",
                "value": "s"
              },
              {
                "id": "custom.cellOptions",
                "value": {
                  "mode": "gradient",
                  "type": "color-background"
                }
              },
              {
                "id": "thresholds",
                "value": {
                  "mode": "absolute",
                  "steps": [
                    {
                      "color": "green",
                      "value": null
                    },
                    {
                      "color": "red",
                      "value": 259200
                    }
                  ]
                }
              }
            ]
          },
          {
            "matcher": {
              "id": "byName",
              "options": "images_outdated"
            },
            "properties": [
              {
                "id": "custom.cellOptions",
                "value": {
                  "mode": "gradient",
                  "type": "color-background"
                }
              },
              {
                "id": "thresholds",
                "value": {
                  "mode": "absolute",
                  "steps": [
                    {
                      "color": "green",
                      "value": null
                    },
                    {
                      "color": "red",
                      "value": 1
                    }
                  ]
                }
              }
            ]
          },
          {
            "matcher": {
              "id": "byName",
              "options": "images_error"
            },
            "properties": [
              {
                "id": "custom.cellOptions",
                "value": {
                  "mode": "gradient",
                  "type": "color-background"
                }
              },
              {
                "id": "thresholds",
                "value": {
                  "mode": "absolute",
                  "steps": [
                    {
                      "color": "green",
                      "value": null
                    },
                    {
                      "color": "red",
                      "value": 1
                    }
                  ]
                }
              }
            ]
          }
        ]
      },
      "gridPos": {
        "h": 14,
        "w": 24,
        "x": 0,
        "y": 0
      },
      "id": 2,
      "options": {
        "cellHeight": "sm",
        "footer": {
          "countRows": false,
          "fields": "",
          "reducer": [
            "sum"
          ],
          "show": false
        },
        "frameIndex": 1,
        "showHeader": true,
        "sortBy": []
      },
      "pluginVersion": "11.4.0",
      "targets": [
        {
          "disableTextWrap": false,
          "editorMode": "code",
          "exemplar": false,
          "expr": "sum by(instance) (podcheck_images_analyzed)",
          "format": "table",
          "fullMetaSearch": false,
          "hide": false,
          "includeNullMetadata": true,
          "instant": true,
          "interval": "",
          "legendFormat": "{{instance}}",
          "range": false,
          "refId": "podcheck_images_analyzed",
          "useBackend": false,
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          }
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "disableTextWrap": false,
          "editorMode": "code",
          "exemplar": false,
          "expr": "sum by(instance) (podcheck_images_outdated)",
          "format": "table",
          "fullMetaSearch": false,
          "hide": false,
          "includeNullMetadata": true,
          "instant": true,
          "legendFormat": "{{instance}}",
          "range": false,
          "refId": "podcheck_images_outdated",
          "useBackend": false
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "disableTextWrap": false,
          "editorMode": "code",
          "exemplar": false,
          "expr": "sum by(instance) (podcheck_images_latest)",
          "format": "table",
          "fullMetaSearch": false,
          "hide": false,
          "includeNullMetadata": true,
          "instant": true,
          "legendFormat": "{{instance}}",
          "range": false,
          "refId": "podcheck_images_latest",
          "useBackend": false
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "editorMode": "code",
          "exemplar": false,
          "expr": "sum by(instance) (podcheck_images_error)",
          "format": "table",
          "hide": false,
          "instant": true,
          "legendFormat": "{{instance}}",
          "range": false,
          "refId": "podcheck_images_error"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "editorMode": "code",
          "exemplar": false,
          "expr": "podcheck_images_analyze_timestamp_seconds * 1000",
          "format": "table",
          "hide": false,
          "instant": true,
          "legendFormat": "{{instance}}",
          "range": false,
          "refId": "podcheck_images_analyze_timestamp_seconds"
        },
        {
          "datasource": {
            "type": "prometheus",
            "uid": "${DS_PROMETHEUS}"
          },
          "editorMode": "code",
          "exemplar": false,
          "expr": "time() - podcheck_images_analyze_timestamp_seconds",
          "format": "table",
          "hide": false,
          "instant": true,
          "legendFormat": "{{instance}}",
          "range": false,
          "refId": "podcheck_images_last_analyze"
        }
      ],
      "title": "podcheck Status",
      "transformations": [
        {
          "id": "merge",
          "options": {}
        },
        {
          "id": "organize",
          "options": {
            "excludeByName": {
              "Time": true,
              "__name__": true,
              "job": true
            },
            "includeByName": {},
            "indexByName": {
              "Time": 0,
              "Value #podcheck_images_analyze_timestamp_seconds": 2,
              "Value #podcheck_images_analyzed": 4,
              "Value #podcheck_images_error": 7,
              "Value #podcheck_images_last_analyze": 3,
              "Value #podcheck_images_latest": 5,
              "Value #podcheck_images_outdated": 6,
              "instance": 1,
              "job": 8
            },
            "renameByName": {
              "Value #A": "analyze_timestamp",
              "Value #podcheck_images_analyze_timestamp_seconds": "last_analyze_timestamp",
              "Value #podcheck_images_analyzed": "images_analyzed",
              "Value #podcheck_images_error": "images_error",
              "Value #podcheck_images_last_analyze": "last_analyze_since",
              "Value #podcheck_images_latest": "images_latest",
              "Value #podcheck_images_outdated": "images_outdated"
            }
          }
        }
      ],
      "type": "table"
    }
  ],
  "schemaVersion": 40,
  "tags": [],
  "templating": {
    "list": []
  },
  "time": {
    "from": "now-6h",
    "to": "now"
  },
  "timepicker": {},
  "timezone": "browser",
  "title": "podcheck Status",
  "uid": "feb4pv3kv1hxca",
  "version": 17,
  "weekStart": ""
}
BIN
addons/prometheus/grafana/grafana_dashboard.png
Normal file
Binary file not shown.
Size: 50 KiB
62
addons/prometheus/prometheus_collector.sh
Normal file
@ -0,0 +1,62 @@
#!/usr/bin/env bash
# prometheus_collector.sh - Exports detailed update metrics for Prometheus node_exporter.
#
# This script generates metrics about the state of Podman container update checks.
# It is designed to be sourced by podcheck.sh and then invoked with:
#
#   prometheus_exporter <num_no_updates> <num_updates> <num_errors> <total_containers> <check_duration_seconds>
#
# Metrics:
#   podcheck_no_updates:
#       Number of containers that are already on the latest image.
#   podcheck_updates:
#       Number of containers with updates available.
#   podcheck_errors:
#       Number of containers that encountered errors during the update check.
#   podcheck_total:
#       Total number of containers checked.
#   podcheck_check_duration:
#       Duration (in seconds) it took to perform the update check.
#   podcheck_last_check_timestamp:
#       Epoch timestamp when the update check was performed.
#
# The metrics are written to a file named podcheck.prom in the specified
# CollectorTextFileDirectory, or /tmp if not specified.
#

prometheus_exporter() {
  local no_updates="$1"
  local updates="$2"
  local errors="$3"
  local total="$4"
  local check_duration="$5"
  local collector_dir="${CollectorTextFileDirectory:-/tmp}"
  local last_check_timestamp
  last_check_timestamp=$(date +%s)

  {
    echo "# HELP podcheck_no_updates Number of containers already on latest image."
    echo "# TYPE podcheck_no_updates gauge"
    echo "podcheck_no_updates $no_updates"

    echo "# HELP podcheck_updates Number of containers with updates available."
    echo "# TYPE podcheck_updates gauge"
    echo "podcheck_updates $updates"

    echo "# HELP podcheck_errors Number of containers with errors during update check."
    echo "# TYPE podcheck_errors gauge"
    echo "podcheck_errors $errors"

    echo "# HELP podcheck_total Total number of containers checked."
    echo "# TYPE podcheck_total gauge"
    echo "podcheck_total $total"

    echo "# HELP podcheck_check_duration Duration in seconds for the update check."
    echo "# TYPE podcheck_check_duration gauge"
    echo "podcheck_check_duration $check_duration"

    echo "# HELP podcheck_last_check_timestamp Epoch timestamp of the last update check."
    echo "# TYPE podcheck_last_check_timestamp gauge"
    echo "podcheck_last_check_timestamp $last_check_timestamp"
  } > "$collector_dir/podcheck.prom"
}
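
The function redirects its output straight into `podcheck.prom`, so a scrape that lands mid-write could see a partial file. The node_exporter textfile collector documentation suggests writing to a temporary file and renaming it into place. A minimal sketch of that pattern follows, assuming a hypothetical helper `write_prom_atomically` that is not part of this commit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of podcheck): read metric lines from stdin,
# write them to a temporary file next to the target, then rename into place.
# The temporary name does not end in ".prom", so the collector ignores it.
write_prom_atomically() {
  local target="$1"   # final path, e.g. "$collector_dir/podcheck.prom"
  local tmp
  tmp=$(mktemp "${target}.XXXXXX") || return 1

  # mktemp creates the file with mode 0600; loosen it so a node_exporter
  # running as a different user can still read the result.
  if cat > "$tmp" && chmod 0644 "$tmp"; then
    mv -f "$tmp" "$target"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

With this in place, the closing redirection `} > "$collector_dir/podcheck.prom"` could become `} | write_prom_atomically "$collector_dir/podcheck.prom"`, leaving the `echo` lines unchanged.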