Using getCanonicalName instead of toString for bean names to guarantee the order of key parameters in the bean name#597
Conversation
|
@codex review |
|
To use Codex here, create a Codex account and connect to github. |
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep it up! ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
|
This PR looks good, I just have one concern on this. This might be a breaking change as people could be relying on the behaviour of their JVM. We've had something like this on our todo list as we wanted to support Canonical Name as tools like JMXTerm print bean names using that rather than Edit: |
…e the order of key parameters in the bean name
…onfig configuration
That's exactly why I hid this change behind the flag. And by default the flag is I believe that documentation should suggest enabling this flag if one wants to use
It is a good idea. I also added one more commit that supports |
carlosroman
left a comment
There was a problem hiding this comment.
Thank you for adding this feature. Will work on getting this released soon.
795b2ee
into
DataDog:master
|
Nice, thank you! I marked docs PR as ready for review: DataDog/documentation#35575 |
Problem
bean_regexmatching in JMXFetch relies onObjectName.toString()to produce the string that regexes are matched against.Here is the code from the
JmxAttributeclass that highlights it:However,
ObjectName.toString()does not guarantee the order of key properties in the MBean name.This means, a regex like
kafka.consumer:client-id=(.*),type=consumer-fetch-manager-metricsmay match or fail depending on whether the JVM happens to serialize properties in that order. Any combination is possible:And with more key properties it's getting even more complicated.
So, at the moment users have no reliable way to write robust
bean_regexpatterns.Solution
I added a new instance-level configuration parameter
use_canonical_bean_name(default:false) that, when enabled, usesObjectName.getCanonicalName()instead oftoString()for bean name string representation.getCanonicalName()sorts key properties alphabetically, producing a deterministic order. With this, beans will always have the same string representation, which makes regexp matching and extraction stable and robust.Configuration example:
The main change is this:
Depending on the flag, either old or new behavior will be enabled.
I hid this change behind the flag because this change may be potentially a breaking change for some users. If previously your config relied on a non-canonical string, enabling the new behavior will break regular expressions. Therefore, the default behavior is old behavior. However, I believe that the new behavior should be recommended. I already have a branch with the documentation changes where this new feature is explained, with the recommendation of using
use_canonical_bean_name: truefor all new setups.Logging changes
Additionally to the main change explained above, I also replaced all other
toStringcalls withgetCanonicalNamecalls. In practice all these changes affected only logging messages.I didn't hide them behind the flag, because it will make the code quite ugly and cumbersome: passing the flag everywhere, getting the right string based on the flag on each log call, it's kinda too much.
However, I'm not 100% sure about this part. I can imagine that one can rely on logging messages to figure out, what datadog agent is actually using for regexp matching. And if the
use_canonical_bean_nameflag is false, what you'll see in logs (always canonical strings) and what datadog will actually use for matching (toStringoutput) may not match exactly.I decided to procede with this change, because the new behavior is a lot more predictable, and users should enable
use_canonical_bean_nameif they want a stable regexp matching. Without it there is no guarantee, and this is the current behavior.Tests
The new functionality is covered with new test cases:
testBeanRegexUseCanonicalBeanName– regex written in canonical (alphabetical) order matches when the flag is true.testBeanRegexDontUseCanonicalBeanName– regex written in registration order matches when the flag is false (default)testBeanRegexCanonicalOrderNotMatchingWithoutFlag– canonical-order regex does NOT match when the flag is false, confirming the flag controls behavior (negative scenario).