These instructions are intended for use with the setup described in the Concepts for single-cluster deployments guide. Use them together with the other building blocks outlined in the Building blocks for single-cluster deployments guide.
| We provide these blueprints to show a minimal, functionally complete example with good baseline performance for regular installations. You still need to adapt it to your environment and your organization’s standards and security best practices. |
In a CloudNativePG cluster deployed in high availability mode, standby instances are critical for both data durability and failover readiness. Monitoring standby health helps detect replication issues early and ensures a safe promotion candidate is available when needed.
A CloudNativePG cluster deployed according to the steps described in the Deploying CloudNativePG in multiple availability zones guide.
To see the status on the command line:
The kubectl command-line utility.
The kubectl cnpg plugin.
Please follow the CloudNativePG documentation for installation steps.
To monitor the status via metrics and dashboards:
Prometheus and Grafana installed on the Kubernetes cluster.
Review the status of the CloudNativePG cluster using the kubectl cnpg status command.
kubectl cnpg status -n cnpg-keycloak cnpg-keycloak
Cluster Summary
Name cnpg-keycloak/cnpg-keycloak
System ID: *******************
PostgreSQL Image: ghcr.io/cloudnative-pg/postgresql:18.3-system-trixie
Primary instance: cnpg-keycloak-1
Primary promotion time: 2026-04-13 16:02:05 +0000 UTC (1h10m27s)
Status: Cluster in healthy state (1)
Instances: 3
Ready instances: 3
Size: 128M
Current Write LSN: 0/7000000 (Timeline: 1 - WAL File: 000000010000000000000007)
Continuous Backup status (Barman Cloud Plugin) (2)
ObjectStore / Server name: cnpg-store/cnpg-keycloak
First Point of Recoverability: 2026-04-13 16:07:54 UTC
Last Successful Backup: 2026-04-13 17:00:04 UTC
Last Failed Backup: -
Working WAL archiving: OK
WALs waiting to be archived: 0
Last Archived WAL: 000000010000000000000006 @ 2026-04-13T16:08:15.350313Z
Last Failed WAL: -
Streaming Replication status (3)
Replication Slots Enabled
Name Sent LSN Write LSN Flush LSN Replay LSN Write Lag Flush Lag Replay Lag State Sync State Sync Priority Replication Slot
---- -------- --------- --------- ---------- --------- --------- ---------- ----- ---------- ------------- ----------------
cnpg-keycloak-2 0/7000000 0/7000000 0/7000000 0/7000000 00:00:00.000438 00:00:00.00148 00:00:00.00148 streaming quorum 1 active
cnpg-keycloak-3 0/7000000 0/7000000 0/7000000 0/7000000 00:00:00.000722 00:00:00.0017 00:00:00.0017 streaming quorum 1 active
Instances status (4)
Name Current LSN Replication role Status QoS Manager Version Node
---- ----------- ---------------- ------ --- --------------- ----
cnpg-keycloak-1 0/7000000 Primary OK BestEffort 1.29.0 ⋯
cnpg-keycloak-2 0/7000000 Standby (sync) OK BestEffort 1.29.0 ⋯
cnpg-keycloak-3 0/7000000 Standby (sync) OK BestEffort 1.29.0 ⋯
Plugins status
Name Version Status Reported Operator Capabilities
---- ------- ------ ------------------------------
barman-cloud.cloudnative-pg.io 0.11.0 N/A Reconciler Hooks, Lifecycle Service
| 1 | The cluster status should read Cluster in healthy state. Any other value indicates a problem. |
| 2 | This section shows the status of the cluster’s backups, if configured. |
| 3 | This section shows the status of the cluster’s standby instances and their replication health. It is based on the pg_stat_replication system view available on the primary node. |
| 4 | General status of individual instances and their roles in the cluster. |
Verify standby health in the Streaming Replication status table.
The following fields help determine whether standbys are healthy and replication is working:
| Field | Expected value | What it means |
|---|---|---|
| `Sent LSN`, `Write LSN`, `Flush LSN`, `Replay LSN` | A two-part hexadecimal value like `0/7000000`, close to the primary’s `Current Write LSN` | A Log Sequence Number (LSN) is a pointer to a position in the Write-Ahead Log (WAL) stream. The difference between a standby’s `Replay LSN` and the primary’s `Current Write LSN` indicates how far the standby lags behind. |
| `Write Lag`, `Flush Lag`, `Replay Lag` | Close to `00:00:00` | Replication lag metrics. A non-zero value that grows over time indicates the standby is falling behind. |
| `State` | `streaming` | Current WAL sender state. Possible values are: `startup`, `catchup`, `streaming`, `backup`, and `stopping`. |
| `Sync State` | `quorum` | Synchronous state of this standby server. Possible values are: `async`, `potential`, `sync`, and `quorum`. |
| `Replication Slot` | `active` | Confirms the replication slot is in use and the standby is consuming WAL. |
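The LSN fields can also be compared numerically: the part before the `/` is the upper 32 bits and the part after it is the lower 32 bits of a 64-bit byte offset into the WAL stream. A minimal sketch, using hypothetical LSN values rather than output from a live cluster:

```shell
# Convert an LSN such as 0/7000000 into an absolute byte offset in the WAL stream.
lsn_to_bytes() {
  local hi=${1%%/*} lo=${1##*/}
  echo $(( (16#$hi << 32) + 16#$lo ))
}

# Hypothetical values: the primary's Current Write LSN and a standby's Replay LSN.
primary_lsn="0/7000000"
standby_replay_lsn="0/6FFF000"

# The difference is the amount of WAL the standby still has to replay.
lag_bytes=$(( $(lsn_to_bytes "$primary_lsn") - $(lsn_to_bytes "$standby_replay_lsn") ))
echo "standby is ${lag_bytes} bytes behind the primary"
```

A lag of a few WAL pages is normal under write load; a byte difference that keeps growing matches the warning signs listed later in this guide.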
Enable metric collection by creating a PodMonitor resource:
kubectl -n cnpg-keycloak apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: cnpg-keycloak-pod-monitor
spec:
  selector:
    matchLabels:
      cnpg.io/cluster: cnpg-keycloak (1)
  podMetricsEndpoints:
  - port: metrics
EOF
| 1 | Name of the CloudNativePG cluster to be monitored. |
Add the grafana-dashboard.json from the cloudnative-pg/grafana-dashboards GitHub project to your Grafana instance.
Optionally, customize the monitoring according to the Monitoring section of the CloudNativePG documentation.
Use the following metrics to observe standby health:
| Metric | Description |
|---|---|
| `cnpg_pg_replication_lag` | Replication lag in seconds per standby instance. A value near `0` indicates the standby is keeping up with the primary. |
| `cnpg_pg_replication_in_recovery` | Returns `1` if the instance is a standby (in recovery), `0` if it is the primary. |
| `cnpg_pg_replication_is_wal_receiver_up` | Returns `1` if the standby’s WAL receiver process is running, `0` otherwise. |
| `cnpg_pg_stat_replication_write_lag_seconds` | Time elapsed between WAL flushed on the primary and received by the standby. |
| `cnpg_pg_stat_replication_replay_lag_seconds` | Time elapsed between WAL flushed on the primary and replayed on the standby. |
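These metrics can also drive alerting. As an illustration only (not part of the referenced guides, and assuming the `cnpg_pg_replication_lag` metric name), a Prometheus alert on sustained replication lag could be defined as a PrometheusRule; the 30-second threshold is an arbitrary example that you should tune to your environment:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: cnpg-keycloak-replication-alerts
  namespace: cnpg-keycloak
spec:
  groups:
  - name: cnpg-replication
    rules:
    - alert: CNPGReplicationLagHigh
      # Fires when a standby lags more than 30 seconds for at least 5 minutes.
      expr: cnpg_pg_replication_lag > 30
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Standby {{ $labels.pod }} is lagging behind the primary"
```

Alerting on a sustained condition (`for: 5m`) avoids paging on short lag spikes during bursts of write activity.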
If backups are enabled, use the metrics exposed by the Barman Cloud Plugin to monitor their status:
| Metric | Description |
|---|---|
| `cnpg_collector_last_available_backup_timestamp` | UNIX timestamp of the most recent successful backup. |
| `cnpg_collector_last_failed_backup_timestamp` | UNIX timestamp of the most recent failed backup attempt. |
| `cnpg_collector_first_recoverability_point` | UNIX timestamp of the earliest point in time available for cluster recovery. |
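Because these metrics are UNIX timestamps, backup freshness can be checked by comparing them against the current time. For example, a PromQL expression along these lines (assuming the `cnpg_collector_last_available_backup_timestamp` metric name) is true when the last successful backup is older than 24 hours:

```promql
time() - cnpg_collector_last_available_backup_timestamp > 86400
```

Adjust the threshold to match your backup schedule, for example slightly more than one backup interval.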
A healthy standby setup typically shows:
Cluster status is Cluster in healthy state.
All standby instances show State: streaming.
Write Lag, Flush Lag, and Replay Lag are low and stable, with no continuous upward trend.
At least one standby has Sync State: quorum (for quorum-based synchronous replication as described in the Deploying CloudNativePG in multiple availability zones guide).
cnpg_pg_replication_in_recovery is 1 for all standby instances in Prometheus.
The following are indicators that a standby requires attention:
The cluster Status is not Cluster in healthy state.
A standby State is not streaming.
Any of Write Lag, Flush Lag, or Replay Lag is continuously increasing over time.
No standby is in quorum or sync state when synchronous replication is expected.
A standby is missing from the Streaming Replication status table.
cnpg_pg_replication_in_recovery is 0 for any instance that is expected to be a standby in Prometheus.
If one or more standby instances show these symptoms, investigate using the following commands:
Verify that the standby pods are running:
kubectl -n cnpg-keycloak get pods -L role
Check recent events in the namespace for scheduling, image pull, storage, or networking problems:
kubectl -n cnpg-keycloak get events --sort-by=.lastTimestamp | tail -n 30
Inspect the CloudNativePG cluster resource for conditions and related messages:
kubectl -n cnpg-keycloak describe cluster cnpg-keycloak
For possible troubleshooting scenarios, refer to the CloudNativePG documentation.
To perform a manual switchover after confirming standby readiness, see the CloudNativePG Switchover Procedure guide.
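As a preview of that procedure, the cnpg plugin provides a `promote` subcommand that makes a chosen standby the new primary. The instance name below is an example; consult the switchover guide for prerequisites and safety checks before running it:

```shell
# Promote a specific standby (example instance name) to become the new primary.
kubectl cnpg promote -n cnpg-keycloak cnpg-keycloak cnpg-keycloak-2
```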