
Conversation

@datvo06

@datvo06 datvo06 commented Dec 4, 2025

A more detailed check and better type context construction for the Synthesis Handler.
Attempts to close #361 and subsumes #421.

@kiranandcode
Contributor

kiranandcode commented Dec 8, 2025

This may not be possible to avoid, but inspect.getsource doesn't work in the REPL, so in the following code submitted to the REPL, the LLM does not see the definition of MovieSummary:

@dataclass
class MovieSummary:
    name: str
    summary: str
    score: int

@Template.define
def get_movie_classification_function(database: str) -> Callable[[str], MovieSummary]:
    """Generate a function that makes CURL requests to retrieve a movie summary from the database {database}."""
    ...

with handler(LiteLLMProvider()), handler(ProgramSynthesis()):
    imdb_classification = get_movie_classification_function('imdb') 
    letterboxd_classification = get_movie_classification_function('letterboxd')
    print(inspect.getsource(imdb_classification))
    print(inspect.getsource(letterboxd_classification))
    print(imdb_classification("die hard!"))
    print(letterboxd_classification("die hard!"))


source_code = textwrap.dedent(code)
# Construct the full function code
code = f"def {func_name}({param_sig}) -> {return_type}:\n{body}"
Contributor

When I tried sending a few requests, the LLM often returns an entire module body, not a function body.

I.e., it returns something like:

print(result.body)
import requests

class MovieSummary:
    def __init__(self, title, summary, year, director, cast):
        self.title = title
        self.summary = summary
        self.year = year
        self.director = director
        self.cast = cast

    def __str__(self):
        return f"Title: {self.title}\nYear: {self.year}\nDirector: {self.director}\nCast: {', '.join(self.cast)}\nSummary: {self.summary}"


def fetch_movie_summary_from_letterboxd(movie_title: str) -> MovieSummary:
    url = f"https://api.letterboxd.com/v0/movie/{movie_title}"
    headers = {
        'Authorization': "Bearer YOUR_ACCESS_TOKEN",
        'Content-Type': 'application/json'
    }

    response = requests.get(url, headers=headers)

    if response.status_code == 200:
        data = response.json()
        summary = data.get('summary', 'No summary available.')
        year = data.get('releaseYear', 'Unknown')
        director = data.get('director', 'Unknown')
        cast = data.get('cast', [])
        
        return MovieSummary(title=movie_title, summary=summary, year=year, director=director, cast=cast)
    else:
        raise Exception(f"Failed to fetch data from letterboxd: {response.status_code}")
(Pdb) 

So in this case, it would make more sense to compile the entire output as a module and extract the last definition. As currently wrapped, this code compiles successfully, but the resulting function returns None and does no computation.
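A minimal sketch of that approach, assuming the LLM output parses as a well-formed module (the helper name and the sample source below are hypothetical, not this PR's implementation):

```python
import ast


def extract_last_function(module_source: str):
    """Exec the generated source as a module and return the last
    top-level function defined in it."""
    tree = ast.parse(module_source)
    func_defs = [n for n in tree.body if isinstance(n, ast.FunctionDef)]
    if not func_defs:
        raise ValueError("no top-level function definition found")
    namespace: dict = {}
    # Execute the whole module so helper classes/imports are in scope.
    exec(compile(tree, "<synthesized>", "exec"), namespace)
    return namespace[func_defs[-1].name]


src = """
def helper(x):
    return x + 1

def main_entry(x):
    return helper(x) * 2
"""
fn = extract_last_function(src)
print(fn.__name__, fn(3))  # main_entry 8
```

Because the whole module is executed into one namespace, the extracted function can still see its sibling definitions (here, helper).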

Contributor

I.e., the generated function is:

print(inspect.getsource(gs[func_name]))
def fetch_movie_summary_from_letterboxd(movie_title: str) -> MovieSummary:
    import requests

    class MovieSummary:
        def __init__(self, title, summary, year, director, cast):
            self.title = title
            self.summary = summary
            self.year = year
            self.director = director
            self.cast = cast

        def __str__(self):
            return f"Title: {self.title}\nYear: {self.year}\nDirector: {self.director}\nCast: {', '.join(self.cast)}\nSummary: {self.summary}"


    def fetch_movie_summary_from_letterboxd(movie_title: str) -> MovieSummary:
        url = f"https://api.letterboxd.com/v0/movie/{movie_title}"
        headers = {
            'Authorization': "Bearer YOUR_ACCESS_TOKEN",
            'Content-Type': 'application/json'
        }

        response = requests.get(url, headers=headers)

        if response.status_code == 200:
            data = response.json()
            summary = data.get('summary', 'No summary available.')
            year = data.get('releaseYear', 'Unknown')
            director = data.get('director', 'Unknown')
            cast = data.get('cast', [])
        
            return MovieSummary(title=movie_title, summary=summary, year=year, director=director, cast=cast)
        else:
            raise Exception(f"Failed to fetch data from letterboxd: {response.status_code}")

Author

Thanks! I agree. I noticed that this can happen frequently; generating the full module may be a better approach.

Author

One drawback is that we won't be able to enforce constrained decoding if we generate the full module. But I think we already have a bigger problem than that. (cc'ing @jfeser: I'll try generating the full module instead.)

Contributor

I was going to suggest this as well. If you want the LLM to tell you the function name separately, that would be straightforward to do.

return fwd()

# Collect all types referenced in the signature
referenced_types = collect_referenced_types(ret_type)
Contributor

Why are the definitions only given for the return type? What about the arguments? Also, I think it might make sense to do this based on lexical context as well; that would be more consistent with the semantics for tool calling.

Author

I agree. I'm leaving this for a separate issue #427, but happy to incorporate this here once the prior issues are addressed.

@kiranandcode
Contributor

We could provide definitions of any functions in the context:

import inspect
import sys
import textwrap
import types
from typing import Dict


def collect_lexical_functions() -> Dict[str, str]:
    # Inspect the caller's frame: calling globals()/locals() inside this
    # helper would only see its own scope, not the caller's.
    caller = sys._getframe(1)
    lexical_context = {**caller.f_globals, **caller.f_locals}
    current_module_name = caller.f_globals.get('__name__', '__main__')
    collected = {}
    for name, obj in lexical_context.items():
        if isinstance(obj, types.FunctionType) and \
           getattr(obj, "__module__", None) == current_module_name:
            try:
                collected[name] = textwrap.dedent(inspect.getsource(obj)).strip()
            except OSError:
                collected[name] = f"<function {obj.__name__} from {obj.__module__}>: {obj.__doc__}"
    return collected

@datvo06
Author

datvo06 commented Dec 8, 2025

@kiranandcode Thanks! Let me try that.

@datvo06
Author

datvo06 commented Dec 8, 2025

@kiranandcode @jfeser Thanks. I added the following:

  1. Collect the lexical context in Template.define.
  2. Add these symbols to the synthesized function's module.
  3. Add tests checking that synthesized functions can use the other functions in scope.

Also closes #427

@datvo06
Author

datvo06 commented Dec 8, 2025

Maybe we can also treat types the same way: they are also symbols within this lexical scope.
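A hedged sketch of what that could look like, mirroring the function collector suggested earlier (collect_lexical_classes is a hypothetical helper; a real implementation would also filter by the defining module, as the function collector does):

```python
import inspect
import sys
import textwrap


def collect_lexical_classes() -> dict:
    # Scan the caller's frame for class objects visible in its scope.
    caller = sys._getframe(1)
    ctx = {**caller.f_globals, **caller.f_locals}
    collected = {}
    for name, obj in ctx.items():
        if inspect.isclass(obj):
            try:
                collected[name] = textwrap.dedent(inspect.getsource(obj)).strip()
            except (OSError, TypeError):
                # Source unavailable (e.g. defined in a REPL): fall back
                # to a name/docstring summary.
                collected[name] = f"<class {name}>: {obj.__doc__}"
    return collected


class MovieSummary:
    """A tiny class visible in the caller's scope."""


print("MovieSummary" in collect_lexical_classes())  # True
```

The OSError fallback matters for the REPL case discussed above, where inspect.getsource has no file to read.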

@eb8680
Contributor

eb8680 commented Dec 8, 2025

Can we do the lexical context collection in a separate PR? It's a critical element of the overall design and, as discussed in #427, it's not specific to synthesis.

@datvo06
Author

datvo06 commented Dec 8, 2025

@eb8680 yes. Let me factor it out.

@datvo06
Author

datvo06 commented Dec 9, 2025

I refactored the lexical context collection out into #434; I will rebase this PR on top of that one.

@datvo06
Author

datvo06 commented Dec 16, 2025

Note: one problem I encountered in test_handlers_llm_provider_synthesis and test_handlers_llm_provider_image is that from __future__ import annotations breaks Encodable.
With the import:

from __future__ import annotations

def foo(x: str) -> Callable[[str], int]: ...

Annotations are stored as strings: "str", "Callable[[str], int]".
Without the import:

def foo(x: str) -> Callable[[str], int]: ...

Annotations are stored as actual types: str, Callable[[str], int].

type_to_encodable_type uses singledispatch, which caches dispatch results in a WeakKeyDictionary.
When a string is passed in:

type_to_encodable_type("str")

it tries to use the string "str" as a cache key:

# WeakKeyDictionary.cache["str"] = ...

which raises:

TypeError: cannot create weak reference to 'str' object

Strings (and other built-in immutable types) cannot be weakly referenced.
For now, I avoided from __future__ import annotations in these tests.
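For reference, the failure mode can be reproduced in isolation with only the standard library (type_to_encodable here is a stand-in for effectful's dispatcher, not its actual code):

```python
from __future__ import annotations  # PEP 563: annotations stored as strings

from functools import singledispatch


def foo(x: str) -> int: ...


# With the future import, annotations are plain strings, not types.
assert foo.__annotations__ == {"x": "str", "return": "int"}


@singledispatch
def type_to_encodable(ty):
    return ty


# dispatch() expects a class; handing it the *string* "str" makes the
# internal WeakKeyDictionary cache try to take a weak reference to it.
try:
    type_to_encodable.dispatch(foo.__annotations__["x"])
except TypeError as e:
    print(e)  # cannot create weak reference to 'str' object
```

One upstream-style fix would be to resolve string annotations (e.g. via typing.get_type_hints) before they ever reach the dispatcher, rather than registering str as a special case.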

@datvo06 datvo06 requested review from eb8680 and jfeser December 16, 2025 21:42
t = str

@classmethod
def encode(cls, context: LexicalContext) -> str:
Contributor

I would expect EncodableLexicalContext.encode or its equivalent to recursively call type_to_encodable_type, not implement custom serialization logic for a bunch of other types internally.

right form and with the right type.
"""

def __init__(self, type_check: bool = False):
Contributor

I would expect all of this typechecking-related logic to live in a separate handler for a separate operation typecheck that is called by some other handler, not be part of ProgramSynthesis and its handler for Template.__call__.



@type_to_encodable_type.register(LexicalContext)
class EncodableLexicalContext(
Contributor

I'm not sure we want an EncodableLexicalContext of this form at all - we should never be decoding a LexicalContext instance from the output of an LLM, and it probably shouldn't be possible to do so in the first place. The encoding logic should probably live somewhere else.

sources: list[str] = []
seen_names: set[str] = set()

for name, obj in context.items():
Contributor

All the cases in this loop should really be standalone Encodable implementations for the relevant types, e.g. types.ModuleType, collections.abc.Callable, Template, etc.

I would strongly suggest doing each of those in its own standalone PR to expedite code review and improve code quality.

return "\n\n".join(sources)

@classmethod
def decode(cls, source: str) -> LexicalContext:
Contributor

As noted above, we should never be decoding a LexicalContext - decode is for the result of an LLM call, and LLM calls should never be generating LexicalContexts.

Comment on lines 76 to 79
# NOTE: Register str explicitly to avoid WeakKeyDictionary cache issues with singledispatch
@type_to_encodable_type.register(str)
def _type_encodable_type_str[T](ty: type[T]) -> Encodable[T]:
return _type_encodable_type_base(ty)
Contributor

I'm not really sure from the comment what's meant to be happening here, but I don't think this is correct. At the very least, whatever the issue is should probably be resolved closer to the source.

if not is_callable:
return fwd()

# Include the full lexical context - all functions, types, values available to synthesized code
Contributor

I'm not sure we want to do this by default? It seems like it could quickly blow up the context. At the very least we shouldn't be incorporating all source code for every function, class and module by default.
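One possible mitigation (an assumption on my part, not something this PR implements) is to prune the collected context to names the synthesized code could plausibly reference, e.g. with a small ast walk over the template's source (referenced_names is a hypothetical helper):

```python
import ast


def referenced_names(source: str) -> set[str]:
    # Collect every bare identifier the code reads, so the lexical
    # context sent to the LLM can be pruned to just those names.
    tree = ast.parse(source)
    return {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }


src = "def f(x):\n    return helper(x) + CONSTANT\n"
print(sorted(referenced_names(src) - {"x"}))  # ['CONSTANT', 'helper']
```

This only catches direct references, so transitively used helpers would need a closure over the collected sources, but it bounds the context far below "everything in scope".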

@datvo06
Author

datvo06 commented Dec 22, 2025

Some tests from Jax are failing. Would merging staging-llm fix this?

@eb8680
Contributor

eb8680 commented Dec 22, 2025

@datvo06 can we please break out each of the new Encodables in this PR into smaller standalone PRs, one for each case? They still need work, but they aren't specific to synthesis or typechecking, and as always smaller PRs mean faster review cycles. Once those land, the remaining typechecking-specific changes here will be simple and straightforward.

Some tests from Jax are failing. Would merging staging-llm fix this?

They're failing on master too (#452) so I think we'll need to fix this on master and merge that into staging-llm first.

@datvo06
Author

datvo06 commented Dec 22, 2025

@datvo06 can we please break out each of the new Encodables in this PR into smaller standalone PRs, one for each case? They still need work, but they aren't specific to synthesis or typechecking, and as always smaller PRs mean faster review cycles. Once those land, the remaining typechecking-specific changes here will be simple and straightforward.

Some tests from Jax are failing. Would merging staging-llm fix this?

They're failing on master too (#452) so I think we'll need to fix this on master and merge that into staging-llm first.

Thanks! I'm on it.

Successfully merging this pull request may close these issues.

effectful.handlers.llm.synthesis should type-check generated code

5 participants