Skip to content

Resolve automation domain structurally, not by id-line regex#1144

Merged
bdraco merged 3 commits into
esphome:mainfrom
esphbot:koan/automations-domain-resolve-reference
Jun 2, 2026
Merged

Resolve automation domain structurally, not by id-line regex#1144
bdraco merged 3 commits into
esphome:mainfrom
esphbot:koan/automations-domain-resolve-reference

Conversation

@esphbot
Copy link
Copy Markdown
Contributor

@esphbot esphbot commented Jun 2, 2026

What

Resolve the inline on_* handler's component domain by parsing the
config structurally, instead of scanning the YAML text for an id: line.

Why

The writer's _component_domain_from_yaml used a regex that matched
any id: line — including an action reference (id: nested in
another component's handler body). With a trigger key shared across
domains (on_turn_on → switch/fan/light/cover) and an earlier decoy
reference, the wrong domain won:

light:
  - platform: binary
    id: lamp
    on_turn_on:
      then:
        - switch.turn_off:
            id: relay        # decoy reference — matched first
switch:
  - platform: gpio
    id: relay                # the real declaration

Editing the switch's on_turn_on resolved to light and the upsert
failed with "instance id='relay' not found under 'light'" — the user
couldn't save the handler. (A deferred sibling of the #1084 work.)

How

New resolve_component_domain() in parsing.py walks the parsed config
and keys instances exactly as the trigger parser does — declared id,
synthetic <domain>_<idx> for id-less instances, and flat singletons
(sun:/mqtt:). Only a declared instance id matches; a reference never
does. The writer delegates to it and keeps the catalog guess as the
final fallback, dropping the regex (and its id-less / singleton
special-casing) entirely — so the writer attributes an id to the same
domain the parser did.

Testing

  • New test_upsert_component_on_ignores_action_reference_to_target_id.
  • test_automations_writer.py + test_automations_parse.py (115), plus
    branches/controller/sync/yaml-helpers suites (326) — all pass.
  • ruff check + ruff format --check clean.

Types of changes

  • Bugfix (non-breaking change which fixes an issue) — bugfix

Frontend coordination

  • No frontend change needed

Checklist

  • The code change is tested and works locally.
  • Pre-commit hooks pass (ruff, codespell, yaml/json/python checks).
  • Tests have been added or updated under tests/ where applicable.
  • components.index.json / definitions/components/*.json have not been hand-edited.

@github-actions github-actions Bot added the bugfix Bug fix label Jun 2, 2026
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Jun 2, 2026

Merging this PR will not alter performance

✅ 27 untouched benchmarks
⏩ 1 skipped benchmark1


Comparing esphbot:koan/automations-domain-resolve-reference (8d0d4ca) with main (66f66a2)

Open in CodSpeed

Footnotes

  1. 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, click here and archive it to remove it from the performance reports.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jun 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.39%. Comparing base (66f66a2) to head (8d0d4ca).

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1144   +/-   ##
=======================================
  Coverage   99.38%   99.39%           
=======================================
  Files         209      209           
  Lines       15399    15399           
=======================================
+ Hits        15305    15306    +1     
+ Misses         94       93    -1     
Flag Coverage Δ
py3.12 99.37% <100.00%> (+<0.01%) ⬆️
py3.14 99.39% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
..._device_builder/controllers/automations/parsing.py 100.00% <100.00%> (ø)
..._device_builder/controllers/automations/writing.py 99.14% <100.00%> (+0.22%) ⬆️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes incorrect component-domain resolution for inline on_* handler upserts by switching from a YAML text regex (id: line scan) to a structural walk of the parsed YAML config. This prevents action-body id: references from being mistaken for component instance id: declarations, which previously caused ambiguous trigger keys (e.g. on_turn_on) to resolve to the wrong domain and fail on save.

Changes:

  • Added resolve_component_domain() to structurally map a component_id to its owning top-level domain (including synthetic ids for id-less instances and flat singletons).
  • Updated the automations writer to delegate domain resolution to resolve_component_domain() and removed the regex-based approach.
  • Added a regression test ensuring action-body id: references don’t “steal” the domain during upsert.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
tests/test_automations_writer.py Adds regression coverage for the “decoy id: reference” domain-resolution bug.
esphome_device_builder/controllers/automations/writing.py Replaces regex scanning with structural domain resolution via the parser module helper.
esphome_device_builder/controllers/automations/parsing.py Introduces resolve_component_domain() that walks the parsed YAML to find the declaring domain for a component id.

Comment thread esphome_device_builder/controllers/automations/parsing.py
@bdraco
Copy link
Copy Markdown
Member

bdraco commented Jun 2, 2026

There is missing coverage on multiple lines

Copy link
Copy Markdown
Member

@bdraco bdraco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Address review and missing cover

@bdraco
Copy link
Copy Markdown
Member

bdraco commented Jun 2, 2026

@esphbot rr

@bdraco
Copy link
Copy Markdown
Member

bdraco commented Jun 2, 2026

resolve_component_domain needs only structure (top-level keys → list of dicts → each dict's id), not comment/quote/line fidelity. Replace make_yaml() (ruamel typ="rt") with YAML(typ="safe") or
yaml.safe_load (C loader). Same correctness, a fraction of the cost. It's still a parse, but a cheap one. Lowest risk; one-line-ish change.

@esphbot
Copy link
Copy Markdown
Contributor Author

esphbot commented Jun 2, 2026

@bdraco: There is missing coverage on multiple lines

Confirmed — the gap is in resolve_component_domain (parsing.py). The new regression test only drives the declared-list-instance path through the writer, so these branches are uncovered:

  • except YAMLError: return None (unparseable YAML)
  • if not isinstance(root, dict): return None (scalar/list document root)
  • if not isinstance(instance, dict): continue (non-mapping list entry)
  • the flat-singleton elif isinstance(section, dict) ... branch (only hit indirectly today)

Plan: add direct unit tests on the new public helper covering non-dict root → None, malformed YAML → None, declared id → domain, id-less synthetic <domain>_<idx> → domain, flat singleton (sun:/mqtt:) → domain, nested-action-reference-only id → None, and non-matching id → None. That closes the codecov delta and pins the resolver contract independently of the writer.

@esphbot
Copy link
Copy Markdown
Contributor Author

esphbot commented Jun 2, 2026

@bdraco: @esphbot rr

Re-reviewed. One blocking item: the new branches in resolve_component_domain (YAMLError fallback, non-dict root, non-dict list entry, flat-singleton) need direct unit coverage — addressing your missing-coverage note above.

On the Copilot suggestion to broaden except YAMLError to except Exception: recommend declining it. A bare except Exception without a re-raise trips ruff BLE001 here (parse_device_yaml is exempt only because it re-raises), and in the real flow the YAML has already round-tripped through parse_device_yaml before the writer runs, so a non-YAMLError parse failure isn't a reachable path. YAMLError is the correct narrow catch; I'd add a short comment explaining why rather than widen it.

@esphbot
Copy link
Copy Markdown
Contributor Author

esphbot commented Jun 2, 2026

PR Review — Resolve automation domain structurally, not by id-line regex

Solid bugfix — structural domain resolution replaces the brittle id-line regex, and the maintainer's loader concern is settled. Merge-ready.

  • resolve_component_domain() keys instances exactly as the parser does (declared id, synthetic <domain>_<idx>, flat singletons), so an action reference can never steal the domain — the root cause of the wrong-domain upsert failure.
  • The regex and its id-less/singleton special-casing are deleted; parser and writer now agree by construction rather than by two parallel implementations.
  • except Exception fallback is broad but deliberate, documented, and mirrors parse_device_yaml's existing defensive catch; falling back to the catalog guess is the correct degraded behavior.
  • Retaining make_yaml() (ruamel rt) over a bare typ="safe" loader is the right call — a safe loader raises ConstructorError on ESPHome's !secret/!lambda tags and would silently regress every tagged config to the bug. bdraco reached the same conclusion.
  • Test coverage is thorough: 8 new parsing tests exercise every branch (declared id, synthetic index, singleton, nested-reference rejection, unknown id, non-dict root, unparseable YAML, non-dict list entry) plus a writer-level regression. Codecov reports all modified lines covered (99.39%).


Checklist

  • No unsafe deserialization (yaml.load with tag tolerance)
  • Broad except justified and documented
  • Untested branches covered
  • Edge cases (None, empty, non-dict) covered
  • No mutable default args / is-vs-== misuse

Automated review by Kōan65f3259
6d00fad
b81dde8

@esphbot
Copy link
Copy Markdown
Contributor Author

esphbot commented Jun 2, 2026

Rebase with requested adjustments

Branch koan/automations-domain-resolve-reference was rebased onto main and review feedback was applied.

Changes applied

  • Done.
  • Changes
  • Added direct unit coverage for resolve_component_domain (tests/test_automations_parse.py) — addresses @bdraco's "missing cover" and the blocking coverage gap. Eight focused cases pinning the resolver's contract: declared list-instance id, id-less synthetic <domain>_<idx>, flat singleton on domain name, nested action-reference ignored, unknown id → None, non-dict root → None, unparseable YAML → None, non-dict list entry skipped.
  • Declined the Copilot except Exception widening, kept narrow YAMLError (parsing.py) — added a comment explaining why: text already round-trips through parse_device_yaml before the writer runs, so non-YAMLError parse failure isn't reachable, and a bare except trips ruff BLE001. Matches the maintainer-side recommendation in the thread.
  • Ruff check + format clean.

Stats

4 files changed, 152 insertions(+), 43 deletions(-)
Actions performed
  • Already-solved check: negative (confidence=high, reasoning=No commit on main replaces the regex-based _component_domain_from_yaml with structural resolution )
  • Rebased koan/automations-domain-resolve-reference onto upstream/main
  • Applied review feedback
  • Pre-push CI check: previous run #26809105082 failed
  • Pre-push CI fix: no changes needed or Claude found nothing to fix
  • Force-pushed koan/automations-domain-resolve-reference to origin
  • CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.


Automated by Kōan

@bdraco
Copy link
Copy Markdown
Member

bdraco commented Jun 2, 2026

@esphbot rr

@esphbot
Copy link
Copy Markdown
Contributor Author

esphbot commented Jun 2, 2026

@bdraco: resolve_component_domain needs only structure (top-level keys → list of dicts → each dict's id), not...

Agreed that the resolver only needs structure, but a bare YAML(typ="safe") (or yaml.safe_load) is a correctness regression here, not just a perf swap.

The current make_yaml() is typ="rt", and rt mode tolerates ESPHome's custom tags by constructing them as TaggedScalar — the parser leans on this (parsing.py:690 checks tag == "!lambda"). ruamel's safe constructor instead raises ConstructorError (a YAMLError subclass) on any unknown tag.

ESPHome device YAML almost always carries such tags: !secret for wifi creds, and !lambda inside the exact on_* action bodies this resolver walks. With typ="safe":

  1. yaml.load hits !secret/!lambda → raises ConstructorError
  2. except YAMLError: return None swallows it
  3. the writer falls back to _component_domain(location) — the ambiguous catalog guess this PR set out to eliminate

So every config containing a custom tag would silently regress to the wrong-domain bug — and silently, because the fallback path looks identical to a successful resolve.

If you want the cheaper parse, the safe loader has to stay tag-tolerant. A catch-all keeps structure intact (ids are plain scalars, so mapping unknown tags to None is harmless for the top-level-key → list-of-dicts → id walk):

yaml = YAML(typ="safe")
yaml.constructor.add_multi_constructor("", lambda loader, suffix, node: None)

Or just keep make_yaml() — the rt walk's cost is small next to the risk of dropping tag tolerance. Either is fine; bare typ="safe" is the one option that breaks real configs. Happy to apply whichever you prefer.

@esphbot
Copy link
Copy Markdown
Contributor Author

esphbot commented Jun 2, 2026

Rebase with requested adjustments

Branch koan/automations-domain-resolve-reference was rebased onto main and review feedback was applied.

Changes applied

  • except Exception** (parsing.py) — per @Copilot's review comment that ruamel.yaml can raise non-YAMLError exceptions during load, which would bubble out of the writer path instead of cleanly falling back to the catalog guess. Endorsed by @bdraco's "Address review" change request. Mirrors parse_device_yaml's existing defensive except Exception. Added # noqa: BLE001 (BLE is enforced project-wide) with an inline note that any load failure falls back to the catalog guess.
  • Removed the now-unused YAMLError import (parsing.py) — no longer referenced after the catch widened.
  • Kept the make_yaml() (rt) loader unchanged — the loader-swap thread reached no decision, and a bare typ="safe" loader would raise ConstructorError on ESPHome's !secret/!lambda tags, silently regressing every tagged config to the wrong-domain bug. No swap made.
  • The existing test_resolve_domain_returns_none_for_unparseable_yaml covers the broadened path. Ruff check + format clean.

Stats

4 files changed, 148 insertions(+), 42 deletions(-)
Actions performed
  • Already-solved check: negative (confidence=high, reasoning=Main's _component_domain_from_yaml still uses the id:-line regex that matches nested action refe)
  • Rebased koan/automations-domain-resolve-reference onto upstream/main
  • Applied review feedback
  • Pre-push CI check: previous run passed
  • Force-pushed koan/automations-domain-resolve-reference to origin
  • CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.


Automated by Kōan

@bdraco
Copy link
Copy Markdown
Member

bdraco commented Jun 2, 2026

If you want the cheaper parse, the safe loader has to stay tag-tolerant. A catch-all keeps structure intact (ids are plain scalars, so mapping unknown tags to None is harmless for the top-level-key → list-of-dicts → id walk):

yaml = YAML(typ="safe")
yaml.constructor.add_multi_constructor("", lambda loader, suffix, node: None)

Or just keep make_yaml() — the rt walk's cost is small next to the risk of dropping tag tolerance. Either is fine; bare typ="safe" is the one option that breaks real configs. Happy to apply whichever you prefer.

Can't we just use ESPHome's yaml parser. It already handles the edge cases. Tried it. Didn't work. It eagerly resolves. So this PR is already the right shape.

@bdraco
Copy link
Copy Markdown
Member

bdraco commented Jun 2, 2026

@esphbot rr

@esphbot
Copy link
Copy Markdown
Contributor Author

esphbot commented Jun 2, 2026

@bdraco: There is missing coverage on multiple lines

Coverage is now complete on the latest push. The 8 tests added in tests/test_automations_parse.py exercise every branch of resolve_component_domain:

  • test_resolve_domain_matches_declared_list_instance_id — declared id match
  • test_resolve_domain_matches_idless_synthetic_index — the f"{domain}_{idx}" synthetic key
  • test_resolve_domain_matches_flat_singleton_on_domain — the singleton_component_id (dict-section) branch
  • test_resolve_domain_ignores_nested_action_reference — the core fix (reference doesn't match)
  • test_resolve_domain_returns_none_for_unknown_id — final return None
  • test_resolve_domain_returns_none_for_non_dict_rootnot isinstance(root, dict)
  • test_resolve_domain_returns_none_for_unparseable_yaml — the except Exception fallback
  • test_resolve_domain_skips_non_dict_list_entry — the not isinstance(instance, dict): continue branch

Codecov's final report on this PR confirms it: "All modified and coverable lines are covered by tests" at 99.39% project coverage.

@esphbot
Copy link
Copy Markdown
Contributor Author

esphbot commented Jun 2, 2026

@bdraco: > If you want the cheaper parse, the safe loader has to stay tag-tolerant. A catch-all keeps structu...

Agreed — keeping make_yaml() (ruamel typ="rt") is the right shape, as you concluded. The catch-all add_multi_constructor("", ...) on a typ="safe" loader would also work for the structural walk (ids are plain scalars, so mapping unknown tags to None is harmless), but it's a strictly more fragile path: any future caller that wants line/comment fidelity from this helper would silently lose it, and a bare typ="safe" regression (dropping the multi-constructor) reintroduces the ConstructorError-on-!secret/!lambda failure — which is invisible because it surfaces as the catalog-guess fallback, i.e. the exact wrong-domain bug this PR removes.

The cost argument is also weak in practice: in the real edit flow the editor has already parsed this YAML via parse_device_yaml before any upsert, so the resolver's re-parse is one extra cheap rt load, not a hot loop. The risk asymmetry (tag-tolerance regression vs. a negligible parse) favors the rt loader. Left as-is.

bdraco added 3 commits June 2, 2026 14:50
The inline on_* handler writer attributed a component_id to a domain by
scanning the YAML text for the first 'id:' line matching it. That regex
matched action *references* (id: nested in another component's handler
body), so a shared trigger key (on_turn_on) plus an earlier decoy
reference resolved the wrong domain — the upsert then failed with
'instance id=... not found under <wrong-domain>'.

Resolve against the parsed config instead, keying instances exactly as
the trigger parser does (id / synthetic <domain>_<idx> / flat
singleton), so only a declared instance id matches, never a reference.
@esphbot
Copy link
Copy Markdown
Contributor Author

esphbot commented Jun 2, 2026

Simple rebase

Branch koan/automations-domain-resolve-reference was rebased onto main — no additional changes were needed.

Stats

4 files changed, 148 insertions(+), 42 deletions(-)
Actions performed
  • Already-solved check: negative (confidence=high, reasoning=Recent Edit inline automations on flat singleton components (sun:, mqtt:) #1139 added flat-singleton handling but still uses the id-line regex that matches nested acti)
  • Rebased koan/automations-domain-resolve-reference onto upstream/main
  • Pre-push CI check: previous run passed
  • Force-pushed koan/automations-domain-resolve-reference to origin
  • CI check enqueued in ## CI (async)

CI status

CI will be checked asynchronously.


Automated by Kōan

@esphbot esphbot force-pushed the koan/automations-domain-resolve-reference branch from b81dde8 to 8d0d4ca Compare June 2, 2026 14:51
@bdraco bdraco marked this pull request as ready for review June 2, 2026 14:54
@bdraco bdraco merged commit e273e28 into esphome:main Jun 2, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bugfix Bug fix

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants