NikolayS
reviewed
Sep 16, 2025
exporter/README.md
Outdated
The exporter provides the following metrics:
### Backup Metrics
- `walg_backup_lag_seconds{backup_type}` - Time since last backup-push in seconds
For all timestamps, let's specify clearly whether it's the timestamp of the beginning of the process or the end of it.
NikolayS
reviewed
Sep 16, 2025
exporter/README.md
Outdated
### Backup Metrics
- `walg_backup_lag_seconds{backup_type}` - Time since last backup-push in seconds
- `walg_backup_count{backup_type}` - Number of backups (full/delta)
successful attempts only or all of them?
NikolayS
reviewed
Sep 16, 2025
exporter/README.md
Outdated
- `walg_backup_timestamp{backup_type}` - Timestamp of last backup
### WAL Metrics
- `walg_wal_lag_seconds{timeline}` - Time since last wal-push in seconds
Suggested change
- `walg_wal_lag_seconds{timeline}` - Time since last wal-push in seconds
- `walg_wal_lag_seconds{timeline}` - Time since last successful wal-push in seconds
Another question: "time since" is a derived metric. Isn't it better to export timestamps and let monitoring decide what to show to users/AI, raw timestamps or lag values (or both)?
NikolayS
reviewed
Sep 16, 2025
exporter/README.md
Outdated
- `walg_wal_integrity_status{timeline}` - WAL integrity status (1 = OK, 0 = ERROR)
### PITR Metrics
- `walg_pitr_window_seconds` - Point-in-time recovery window size in seconds
what if we have gaps / multiple windows?
…ed walg_backup_start_timestamp
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…from internal/databases/postgres/lsn.go
Database name
PostgreSQL - This PR adds a Prometheus exporter for WAL-G PostgreSQL backup and WAL monitoring.
Pull request description
Describe what this PR adds
This PR introduces a WAL-G Prometheus Exporter that provides comprehensive observability for WAL-G backup operations for PostgreSQL databases.
🎯 What This PR Adds
This PR adds a complete Prometheus exporter (`/exporter` directory) with the following capabilities:
Core Exporter Components:
- `exporter.go` - Main Prometheus collector implementation
- `main.go` - HTTP server and CLI interface
- `pitr.go` - Point-in-time recovery window calculations
- `wal_lag.go` - LSN parsing and WAL lag calculation logic
- `mock-wal-g` - Mock script for testing and development
- `go.mod`/`go.sum` - Go module dependencies
Key Features:
📊 Backup Monitoring
- `_D_` suffix naming convention to correctly distinguish full vs incremental backups
- `base_backup` label showing which full backup they're based on
📈 WAL Stream Monitoring
🔍 Storage Health Monitoring
⏰ PITR & Recovery Monitoring
🔧 Operational Metrics
📊 Metrics Provided
Backup Metrics
Critical Labels:
- `backup_type`: `full` or `delta` (correctly determined by `_D_` suffix presence)
- `base_backup`: For incremental backups, shows which full backup they're based on
- `backup_name`: Complete backup identifier
WAL Metrics
Storage & Health Metrics
🔧 Technical Implementation Highlights
✅ Correct Backup Type Detection
One of the key technical achievements is accurate backup type classification:
The Problem: Naive implementations often mark ALL backups as "full" because they all start with the `base_` prefix.
The Solution: This exporter correctly uses WAL-G's actual naming convention:
- `base_000000010000000000000025` (no `_D_` suffix)
- `base_000000010000000500000007_D_000000010000000000000025` (contains `_D_`)
⏱️ Dual Timestamp Architecture
- `walg_backup_start_timestamp` - When backup operation started
- `walg_backup_finish_timestamp` - When backup completed successfully
🧪 Comprehensive Testing Framework
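The `_D_` naming rule described under Correct Backup Type Detection lends itself to a table-driven check. The following is a minimal sketch of that idea, not the exporter's actual code: `classifyBackup` is a hypothetical helper introduced here for illustration, and the two backup names are the examples quoted in this description.

```go
package main

import (
	"fmt"
	"strings"
)

// classifyBackup is a hypothetical helper, not the exporter's actual function.
// It returns "delta" when the backup name contains the "_D_" marker and
// "full" otherwise, mirroring the rule stated in the PR description.
func classifyBackup(name string) string {
	if strings.Contains(name, "_D_") {
		return "delta"
	}
	return "full"
}

func main() {
	// Table of example names and the backup_type label each should get.
	cases := []struct {
		name string
		want string
	}{
		{"base_000000010000000000000025", "full"},
		{"base_000000010000000500000007_D_000000010000000000000025", "delta"},
	}
	for _, c := range cases {
		got := classifyBackup(c.name)
		fmt.Printf("%s -> %s (want %s)\n", c.name, got, c.want)
	}
}
```

Checking only the `base_` prefix would label both names `full`, which is the failure mode the description calls out.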
🚀 Usage
Basic Usage
Configuration Options
Prometheus Integration
📈 Monitoring Examples
Backup Age Monitoring
Storage Health
🧪 Testing
Development Testing
Integration Testing
📋 Files Added
This PR adds the complete `/exporter` directory with:
- `exporter.go` - Core Prometheus collector (466 lines)
- `main.go` - HTTP server and CLI interface
- `pitr.go` - PITR window calculation logic
- `wal_lag.go` - LSN parsing and lag calculation
- `mock-wal-g` - Testing mock script
- `README.md` - Comprehensive documentation
- `go.mod`/`go.sum` - Go module configuration
🎯 Value Proposition
This exporter transforms WAL-G from a "black box" backup solution into a fully observable system:
🔗 Dependencies
The exporter requires:
`--walg.path`
📚 Documentation
Complete documentation is provided in `/exporter/README.md` including: