This repository was archived by the owner on Apr 1, 2026. It is now read-only.
I am investigating the generalization capabilities of the TRM architecture presented in "Less is More." While the paper claims the model learns to "tease out underlying task rules" through recursive refinement, recent independent analyses (and my own replication attempts) suggest the model is heavily reliant on the learned Task_ID embeddings rather than inferring logic from the input grid itself.
The Technical Issue
When the specific Task_ID is removed or randomized, the model's reasoning capabilities appear to collapse completely, suggesting it is performing conditional retrieval (lookup) rather than fluid intelligence.
Observed Behavior (Ablation Results):
Standard Input (Grid + Correct ID): ~45% Accuracy (Matches Paper)
Ablation Input (Grid + Blank/Random ID): 0.0% Accuracy
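To make the ablation protocol concrete, here is a toy harness illustrating why a model whose task knowledge lives entirely in a per-ID table collapses when the ID is randomized. This is an illustrative sketch, not the TRM codebase: the class, task construction, and numbers are all hypothetical.

```python
import random

# Toy "model" whose only task knowledge is a per-task lookup table.
# If the correct task_id is supplied, it retrieves the memorized rule;
# with a random id it retrieves the wrong rule and fails.
class EmbeddingLookupModel:
    def __init__(self):
        self.table = {}  # task_id -> memorized transformation

    def train(self, task_id, rule):
        self.table[task_id] = rule  # "logic" stored per task, not inferred

    def predict(self, task_id, grid):
        rule = self.table.get(task_id)
        if rule is None:
            return grid  # no usable prior: degenerate output
        return rule(grid)

def evaluate(model, tasks, randomize_ids=False):
    ids = list(tasks)
    correct = 0
    for task_id, (rule, grid) in tasks.items():
        query_id = random.choice(ids) if randomize_ids else task_id
        if model.predict(query_id, grid) == rule(grid):
            correct += 1
    return correct / len(tasks)

# 100 toy tasks, each applying a distinct offset to a small "grid".
tasks = {f"task_{k}": ((lambda g, k=k: [x + k for x in g]), [1, 2, 3])
         for k in range(1, 101)}
model = EmbeddingLookupModel()
for task_id, (rule, _) in tasks.items():
    model.train(task_id, rule)

print(evaluate(model, tasks))                      # correct IDs: 1.0
print(evaluate(model, tasks, randomize_ids=True))  # random IDs: near 0
```

Under correct IDs the toy model scores perfectly; under randomized IDs it scores at roughly chance level, which mirrors the 45% → 0% collapse reported above.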
The Discrepancy
The paper asserts that the 7M parameter network solves ARC tasks via recursive refinement. However, if the model requires a unique, pre-learned embedding vector for every single task to score above zero, this indicates the "logic" is encoded in the embedding table (memory), not in the recursive weights (reasoning).
Impact:
Parameter Count: The claim of "7M parameters" excludes the massive embedding table required to store these task-specific priors.
Generalization: A model that fails completely without a task-specific tag cannot be claimed to solve "unseen" tasks in a general sense, as it requires a learned index for that specific problem distribution.
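A back-of-envelope calculation shows why the embedding table matters for the parameter-count claim. All three counts below are assumptions for illustration, not figures taken from the paper; the real identifier count and embedding width may differ.

```python
# Hypothetical back-of-envelope for the embedding-table overhead.
num_tasks = 1000               # assumed distinct training tasks
augmentations_per_task = 1000  # assumed per-task augmented variants, each with its own ID
hidden_dim = 512               # assumed embedding width

table_params = num_tasks * augmentations_per_task * hidden_dim
core_params = 7_000_000        # the "7M" headline figure

print(f"embedding table: {table_params:,} params")
print(f"recursive core:  {core_params:,} params")
print(f"table / core ratio: {table_params / core_params:.1f}x")
```

Under these assumed numbers the identifier table alone would dwarf the 7M recursive core by more than an order of magnitude, which is why excluding it from the headline count is misleading.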
Can you provide a checkpoint or a script where the model successfully solves any unseen puzzle without accessing the specific Task_ID embedding for that puzzle?
If not, how does this architecture differ from a learned lookup table?
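The distinction I am drawing can be stated concretely. In this toy contrast (hypothetical code, not the TRM implementation), a solver keyed on task identity returns nothing for an unseen key, while a solver that induces the rule from the demonstration pairs needs no task ID at all:

```python
# 1) Lookup: answers keyed by task ID; any unseen ID fails outright.
memorized = {"task_A": [2, 4, 6]}  # stored output for one known task

def solve_by_lookup(task_id):
    return memorized.get(task_id)  # None for any unseen task

# 2) Rule induction: infer the transformation from demonstration pairs,
#    then apply it to a novel input -- no task ID involved.
def solve_by_induction(demos, test_input):
    # Infer a constant multiplier from the first demo pair (toy "rule").
    inp, out = demos[0]
    factor = out[0] // inp[0]
    assert all(o == i * factor for i, o in zip(inp, out))
    return [x * factor for x in test_input]

print(solve_by_lookup("task_unseen"))        # None: lookup has no answer
demos = [([1, 2, 3], [2, 4, 6])]
print(solve_by_induction(demos, [5, 6, 7]))  # [10, 12, 14]
```

If TRM behaves like case 1 rather than case 2 whenever the Task_ID embedding is withheld, then its "reasoning" is indistinguishable from conditional retrieval.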