Skip to content

Lazy import (after #11)#12

Open
paugier wants to merge 2 commits into
willynilly:mainfrom
paugier:lazy-import
Open

Lazy import (after #11)#12
paugier wants to merge 2 commits into
willynilly:mainfrom
paugier:lazy-import

Conversation

@paugier

@paugier paugier commented Jan 7, 2026

Copy link
Copy Markdown

I tried again lazy import (#8) after #11:

For this branch (lazy import):

$ time python -c "import rfc3987_syntax"
real	0m0,023s
$ time python -c "import rfc3987_syntax as m; m.syntax_parser"
real	0m0,088s

To be compared with, for agentydragon:adgn/one-parser

$ time python -c "import rfc3987_syntax"
real	0m0,088s
$ time python -c "import rfc3987_syntax as m; m.syntax_parser"
real	0m0,088s

and for main:

$ time python -c "import rfc3987_syntax"
real	0m0,650s
$ time python -c "import rfc3987_syntax as m; m.syntax_parser"
real	0m0,654s

agentydragon and others added 2 commits December 29, 2025 05:36
Previously, make_syntax_validator() created a new Lark parser for each
rule, which was expensive (~660ms total import time). This change:

- Defines ALL_START_RULES containing all grammar rules needed by validators
- Creates a single shared syntax_parser with all start rules upfront
- Refactors make_syntax_validator() to use functools.partial with
  is_valid_syntax() instead of creating new parsers

Import time reduced from ~660ms to ~72ms (9x speedup).

Testing protocol:
- Created timing script that clears module cache and measures import time
- Ran 5 iterations for each version to account for variance
- Used git stash to preserve dirty state while testing HEAD
- Results: HEAD averaged ~662ms, this change averages ~72ms
- All existing tests pass (2/2)

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to reduce rfc3987_syntax import time by lazily importing lark, lazily constructing the shared syntax_parser, and lazily creating term-specific is_valid_syntax_* validator functions on attribute access.

Changes:

  • Introduces module-level __getattr__-based lazy loading for lark, syntax_parser, and term-specific validators.
  • Refactors validator creation to use a shared parser and functools.partial.
  • Adds package-level __dir__/__getattr__ to expose lazily created attributes from rfc3987_syntax.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
src/rfc3987_syntax/syntax_helpers.py Implements lazy loading/caching for lark, syntax_parser, and validators; introduces ALL_START_RULES for shared parser starts.
src/rfc3987_syntax/init.py Adds package-level __dir__ and __getattr__ to surface lazily created validators/parser via the top-level package.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +49 to +55
# All start rules needed by validators
ALL_START_RULES: list[str] = RFC3987_SYNTAX_TERMS.copy()
for term in ("non_zero", "pct_encoded", "scheme"):
ALL_START_RULES.remove(term)
ALL_START_RULES.extend(
["ipath", "ipath_absolute", "ipath_empty", "ipath_noscheme", "ipath_rootless"]
)

Copilot AI Apr 30, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ALL_START_RULES removes "scheme" and "pct_encoded" (and "non_zero"), but tests and public API use these terms via is_valid_syntax(term=...). With the shared parser built using start=ALL_START_RULES, parse(term="scheme"/"pct_encoded") will fail (unknown start rule) or return incorrect results. Keep these terms in ALL_START_RULES (or handle them via a separate parser path) so parse()/is_valid_syntax() continue to work for all RFC3987_SYNTAX_TERMS.

Suggested change
# All start rules needed by validators
ALL_START_RULES: list[str] = RFC3987_SYNTAX_TERMS.copy()
for term in ("non_zero", "pct_encoded", "scheme"):
ALL_START_RULES.remove(term)
ALL_START_RULES.extend(
["ipath", "ipath_absolute", "ipath_empty", "ipath_noscheme", "ipath_rootless"]
)
# All start rules needed by validators and public parse()/is_valid_syntax() calls
ALL_START_RULES: list[str] = RFC3987_SYNTAX_TERMS.copy()
for term in ("ipath", "ipath_absolute", "ipath_empty", "ipath_noscheme", "ipath_rootless"):
if term not in ALL_START_RULES:
ALL_START_RULES.append(term)

Copilot uses AI. Check for mistakes.
Comment on lines +73 to +74
def parse(term: str, value: str) -> sys.modules[__name__].lark.ParseTree:
return sys.modules[__name__].syntax_parser.parse(value, start=term)

Copilot AI Apr 30, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return type annotation -> sys.modules[__name__].lark.ParseTree is evaluated at import time (no postponed-evaluation import is present), which will trigger the lazy lark import and defeats the purpose of this PR. It’s also an unusual/fragile typing pattern. Use postponed evaluation (e.g., from __future__ import annotations) and/or a string/TYPE_CHECKING-based annotation so lark isn’t imported just to define parse().

Copilot uses AI. Check for mistakes.
Comment on lines +5 to +9

def __dir__():
result = _syntax_helpers.__all__.copy()
for start_rule in ALL_START_RULES:
result.append(SYNTAX_VALIDATOR_PREFIX + start_rule)

Copilot AI Apr 30, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defining __all__ in syntax_helpers means from .syntax_helpers import * (used here) now exports only names listed in syntax_helpers.all. Since the term-specific validators (e.g., is_valid_syntax_iri) are no longer in all, from rfc3987_syntax import * will stop exporting them compared to the previous behavior. If that’s not intended, include the validator names (as strings) in an appropriate all (package-level and/or syntax_helpers) without eagerly constructing the parser.

Suggested change
def __dir__():
result = _syntax_helpers.__all__.copy()
for start_rule in ALL_START_RULES:
result.append(SYNTAX_VALIDATOR_PREFIX + start_rule)
__all__ = _syntax_helpers.__all__.copy()
__all__.extend(
SYNTAX_VALIDATOR_PREFIX + start_rule for start_rule in ALL_START_RULES
)
def __dir__():
result = __all__.copy()

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants