Skip to content

release-26.1: roachtest/clusterstats: ignore PromQL info-level annotations#171093

Merged
trunk-io[bot] merged 1 commit into
release-26.1from
blathers/backport-release-26.1-170947
May 29, 2026
Merged

release-26.1: roachtest/clusterstats: ignore PromQL info-level annotations#171093
trunk-io[bot] merged 1 commit into
release-26.1from
blathers/backport-release-26.1-170947

Conversation

@blathers-crl

@blathers-crl blathers-crl Bot commented May 28, 2026

Copy link
Copy Markdown

Backport 1/1 commits from #170947 on behalf of @miraradeva.


CollectPoint and CollectInterval treated any non-empty Prometheus
warnings slice as a fatal error. With Prometheus 2.53+, this slice
now includes informational annotations like

PromQL info: metric might not be a counter, name does not end in
_total/_sum/_count/_bucket: "sys_host_disk_write_bytes"

which is emitted for queries such as rate(sys_host_disk_write_bytes[1m])
and is purely a naming-convention nudge -- the query result is still
valid. After the recent bump to Prometheus 2.53.5 (#170676), this
broke admission-control/disk-bandwidth-limiter outright (it t.Fatals
on collection errors) and dropped bandwidth samples in admission-
control/{index,single-node-index}-backfill, which already log the
error and continue.

Extract a handlePromWarnings helper that classifies entries by the
"PromQL info:" wire-format prefix:

  • "PromQL info: ..." entries are logged and dropped.
  • "PromQL warning: ..." entries, and any unprefixed entries (e.g.
    legacy remote-read warnings, which predate the PromQL annotation
    system), continue to surface as the same error string as before.

Prometheus' annotations package defines exactly two sentinel error
types (PromQLInfo and PromQLWarning), so the prefix check is the
actual wire-format contract; the client_golang v1 API we use flattens
both into a single []string with no structured alternative.

Resolves: #170841
See also: #170790, #170793, #170843

Release note: None


Release justification: test-only fix

CollectPoint and CollectInterval treated any non-empty Prometheus
warnings slice as a fatal error. With Prometheus 2.53+, this slice
now includes informational annotations like

  PromQL info: metric might not be a counter, name does not end in
  _total/_sum/_count/_bucket: "sys_host_disk_write_bytes"

which is emitted for queries such as rate(sys_host_disk_write_bytes[1m])
and is purely a naming-convention nudge -- the query result is still
valid. After the recent bump to Prometheus 2.53.5 (#170676), this
broke admission-control/disk-bandwidth-limiter outright (it t.Fatals
on collection errors) and dropped bandwidth samples in admission-
control/{index,single-node-index}-backfill, which already log the
error and continue.

Extract a handlePromWarnings helper that classifies entries by the
"PromQL info:" wire-format prefix:

 - "PromQL info: ..." entries are logged and dropped.
 - "PromQL warning: ..." entries, and any unprefixed entries (e.g.
   legacy remote-read warnings, which predate the PromQL annotation
   system), continue to surface as the same error string as before.

Prometheus' annotations package defines exactly two sentinel error
types (PromQLInfo and PromQLWarning), so the prefix check is the
actual wire-format contract; the client_golang v1 API we use flattens
both into a single []string with no structured alternative.

Resolves: #170841
See also: #170790, #170793, #170843

Release note: None
@blathers-crl blathers-crl Bot force-pushed the blathers/backport-release-26.1-170947 branch from 561bc82 to d4a7c8f Compare May 28, 2026 09:14
@blathers-crl blathers-crl Bot requested a review from a team as a code owner May 28, 2026 09:14
@blathers-crl blathers-crl Bot requested review from cpj2195 and golgeek and removed request for a team May 28, 2026 09:14
@blathers-crl blathers-crl Bot added blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot. labels May 28, 2026
@trunk-io

trunk-io Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

😎 Merged successfully - details.

@blathers-crl blathers-crl Bot requested a review from srosenberg May 28, 2026 09:14
@blathers-crl

blathers-crl Bot commented May 28, 2026

Copy link
Copy Markdown
Author

Thanks for opening a backport.

Before merging, please confirm that the change does not break backwards compatibility and otherwise complies with the backport policy. Include a brief release justification in the PR description explaining why the backport is appropriate. All backports must be reviewed by the TL for the owning area. While the stricter LTS policy does not yet apply, please exercise judgment and consider gating non-critical changes behind a disabled-by-default feature flag when appropriate.

@blathers-crl blathers-crl Bot added backport Label PR's that are backports to older release branches T-kv KV Team labels May 28, 2026
@cockroach-teamcity

Copy link
Copy Markdown
Member

This change is Reviewable

@blathers-crl blathers-crl Bot added the backport-test-only Used to denote the backport has only non-production changes label May 28, 2026
@trunk-io trunk-io Bot merged commit 72ca67d into release-26.1 May 29, 2026
22 checks passed
@trunk-io trunk-io Bot deleted the blathers/backport-release-26.1-170947 branch May 29, 2026 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport Label PR's that are backports to older release branches backport-test-only Used to denote the backport has only non-production changes blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot. T-kv KV Team target-release-26.1.6

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants