Skip to content

Using getCanonicalName instead of toString for bean names to guarantee the order of key parameters in the bean name#597

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 4 commits intoDataDog:masterfrom
LMnet:bean-props-order
Apr 10, 2026
Merged

Conversation

@LMnet
Copy link
Copy Markdown
Contributor

@LMnet LMnet commented Mar 19, 2026

Problem

bean_regex matching in JMXFetch relies on ObjectName.toString() to produce the string that regexes are matched against.

Here is the code from the JmxAttribute class that highlights it:

this.beanStringName = beanName.toString(); // <-- creating `beanStringName` in constructor


private boolean matchBeanRegex(Filter filter, boolean matchIfNoRegex) {
    List<Pattern> beanRegexes = filter.getBeanRegexes();
    if (beanRegexes.isEmpty()) {
        return matchIfNoRegex;
    }

    for (Pattern beanRegex : beanRegexes) {
        Matcher matcher = beanRegex.matcher(beanStringName); // <-- matching

        if (matcher.matches()) {
            for (int i = 0; i <= matcher.groupCount(); i++) {
                this.beanParameters.put(Integer.toString(i), matcher.group(i));
            }
            return true;
        }
    }
    return false;
}

However, ObjectName.toString() does not guarantee the order of key properties in the MBean name.

This means, a regex like kafka.consumer:client-id=(.*),type=consumer-fetch-manager-metrics may match or fail depending on whether the JVM happens to serialize properties in that order. Any combination is possible:

kafka.consumer:client-id=(.*),type=consumer-fetch-manager-metrics
kafka.consumer:type=consumer-fetch-manager-metrics,client-id=(.*)

And with more key properties it's getting even more complicated.

So, at the moment users have no reliable way to write robust bean_regex patterns.

Solution

I added a new instance-level configuration parameter use_canonical_bean_name (default: false) that, when enabled, uses ObjectName.getCanonicalName() instead of toString() for bean name string representation. getCanonicalName() sorts key properties alphabetically, producing a deterministic order. With this, beans will always have the same string representation, which makes regexp matching and extraction stable and robust.

Configuration example:

instances:
  - host: localhost
    port: 9010
    use_canonical_bean_name: true  # <-- new flag
    conf:
      - include:
          bean_regex: "kafka.consumer:client-id=(.*),type=consumer-fetch-manager-metrics"

The main change is this:

this.beanStringName = useCanonicalBeanName
                ? beanName.getCanonicalName()
                : beanName.toString();

Depending on the flag, either old or new behavior will be enabled.

I hid this change behind the flag because this change may be potentially a breaking change for some users. If previously your config relied on a non-canonical string, enabling the new behavior will break regular expressions. Therefore, the default behavior is old behavior. However, I believe that the new behavior should be recommended. I already have a branch with the documentation changes where this new feature is explained, with the recommendation of using use_canonical_bean_name: true for all new setups.

Logging changes

Additionally to the main change explained above, I also replaced all other toString calls with getCanonicalName calls. In practice all these changes affected only logging messages.

I didn't hide them behind the flag, because it will make the code quite ugly and cumbersome: passing the flag everywhere, getting the right string based on the flag on each log call, it's kinda too much.

However, I'm not 100% sure about this part. I can imagine that one can rely on logging messages to figure out, what datadog agent is actually using for regexp matching. And if the use_canonical_bean_name flag is false, what you'll see in logs (always canonical strings) and what datadog will actually use for matching (toString output) may not match exactly.

I decided to procede with this change, because the new behavior is a lot more predictable, and users should enable use_canonical_bean_name if they want a stable regexp matching. Without it there is no guarantee, and this is the current behavior.

Tests

The new functionality is covered with new test cases:

  • testBeanRegexUseCanonicalBeanName – regex written in canonical (alphabetical) order matches when the flag is true.
  • testBeanRegexDontUseCanonicalBeanName – regex written in registration order matches when the flag is false (default)
  • testBeanRegexCanonicalOrderNotMatchingWithoutFlag – canonical-order regex does NOT match when the flag is false, confirming the flag controls behavior (negative scenario).

@LMnet LMnet marked this pull request as ready for review March 19, 2026 05:03
@LMnet LMnet requested a review from a team as a code owner March 19, 2026 05:03
@carlosroman
Copy link
Copy Markdown
Contributor

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

To use Codex here, create a Codex account and connect to github.

@carlosroman
Copy link
Copy Markdown
Contributor

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@carlosroman
Copy link
Copy Markdown
Contributor

carlosroman commented Mar 24, 2026

This PR looks good, I just have one concern on this. This might be a breaking change as people could be relying on the behaviour of their JVM. We've had something like this on our todo list as we wanted to support Canonical Name as tools like JMXTerm print bean names using that rather than toString(). I'm gonna see if I can test this with our existing integration test suite to see if anything breaks.

Edit:
Sorry, I meant more interested in how many integrations would break if enabling this and if we should move use_canonical_bean_name to be at the include/bean level rather than globally for the config (or support both). I can see other people wanting this but using it with our existing configs and ending up in a situation where they can't due to a mix of names.

@LMnet LMnet force-pushed the bean-props-order branch from ac01b55 to 048551a Compare March 25, 2026 21:54
…e the order of key parameters in the bean name
@LMnet LMnet force-pushed the bean-props-order branch from 048551a to 57d48bd Compare March 25, 2026 21:55
@LMnet
Copy link
Copy Markdown
Contributor Author

LMnet commented Mar 26, 2026

This might be a breaking change as people could be relying on the behaviour of their JVM.

That's exactly why I hid this change behind the flag. And by default the flag is false.

I believe that documentation should suggest enabling this flag if one wants to use bean_regex. I created a draft PR to the documentation repo with some doc changes: DataDog/documentation#35575 But it looks like not all docs are in this repo. I did the best I could 😅

move use_canonical_bean_name to be at the include/bean level rather than globally for the config (or support both)

It is a good idea. I also added one more commit that supports use_canonical_bean_name on the init_config level. I missed it initially. And on top of that I added a commit that supports use_canonical_bean_name on the filter-level.

Copy link
Copy Markdown
Contributor

@carlosroman carlosroman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding this feature. Will work on getting this released soon.

@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit 795b2ee into DataDog:master Apr 10, 2026
11 checks passed
@LMnet LMnet deleted the bean-props-order branch April 12, 2026 23:07
@LMnet
Copy link
Copy Markdown
Contributor Author

LMnet commented Apr 12, 2026

Nice, thank you!

I marked docs PR as ready for review: DataDog/documentation#35575

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants