feature(workflows): added a agentic workflow for sudoku #481

AdnanQureshi3 · 2026-01-16T12:06:30Z

Resolves #470: Implement a New Workflow
feat: add SudokuWorkflow example implementation and register in default_mapping

Checklist

Added sudoku_workflow.py with full docstrings and inline comments

Implemented simple single-turn Sudoku solving workflow
Added reward calculation based on exact match
Registered workflow in trinity.common.workflows.__init__ in alphabetical order
Ensured code follows style and passes pre-commit checks

If you need more custom workflows implemented, please let me know.

pan-x-c · 2026-01-16T13:12:36Z

trinity/common/workflows/sudoku_workflow.py

Thanks for your contribution. Sudoku has some similarities to Frozen Lake, and the current version has significant room for improvement.

A qualified Sudoku workflow should include three parts:

1. A Sudoku generator: automatically generates solvable Sudoku puzzles at a configurable difficulty level.
2. An agentic workflow to solve the Sudoku: some puzzles are hard to solve in a single step, so the workflow should solve the game over multiple steps.
3. A general judge function: some Sudoku puzzles have multiple valid solutions, so the judge should correctly parse the model's output and check the result against the Sudoku rules rather than by exact match.
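To illustrate the third point, here is a minimal sketch of a rule-based judge. The names `parse_grid`, `is_valid_solution`, and `judge` are hypothetical and not taken from the PR, and the sketch assumes the model prints the full grid as digits, with `0` marking an empty cell in the puzzle.

```python
import re
from typing import List, Optional

Grid = List[List[int]]


def parse_grid(text: str, n: int) -> Optional[Grid]:
    """Extract an n x n grid from free-form text (assumes single-digit values, n <= 9)."""
    digits = [int(ch) for ch in re.findall(r"\d", text)]
    if len(digits) != n * n:
        return None  # wrong number of cells: treat as unparseable
    return [digits[i * n:(i + 1) * n] for i in range(n)]


def is_valid_solution(puzzle: Grid, answer: Grid,
                      box_rows: int = 3, box_cols: int = 3) -> bool:
    """Check `answer` against the Sudoku rules and the puzzle's given cells."""
    n = box_rows * box_cols
    expected = set(range(1, n + 1))
    if len(answer) != n or any(len(row) != n for row in answer):
        return False
    # Given cells (non-zero in the puzzle) must be left unchanged.
    for r in range(n):
        for c in range(n):
            if puzzle[r][c] != 0 and puzzle[r][c] != answer[r][c]:
                return False
    # Every row and every column must contain each value exactly once.
    for i in range(n):
        if set(answer[i]) != expected:
            return False
        if {answer[r][i] for r in range(n)} != expected:
            return False
    # Every box must contain each value exactly once.
    for br in range(0, n, box_rows):
        for bc in range(0, n, box_cols):
            box = {answer[r][c]
                   for r in range(br, br + box_rows)
                   for c in range(bc, bc + box_cols)}
            if box != expected:
                return False
    return True


def judge(puzzle: Grid, model_output: str,
          box_rows: int = 3, box_cols: int = 3) -> bool:
    """Parse the model output, then validate it by the rules (no exact match)."""
    answer = parse_grid(model_output, box_rows * box_cols)
    return answer is not None and is_valid_solution(puzzle, answer, box_rows, box_cols)
```

With a puzzle that has more than one solution, such a judge accepts any of them, which an exact-match reward would not.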

AdnanQureshi3 · 2026-01-16

Hi @pan-x-c,
Thanks for the detailed feedback earlier.

I’ve implemented all the requested changes:

Added a SudokuGenerator that produces solvable puzzles (with adjustable difficulty via hole count)

Reworked the workflow into a multi-step agentic loop, similar in structure to FrozenLakeWorkflow

Added a SudokuJudge that validates rows, columns, and 3×3 blocks instead of exact string matching

Integrated generator + judge inside the workflow

Updated workflow registry

Please have a look and let me know if you’d like further improvements or additional refinements.

pan-x-c · 2026-01-17T06:26:06Z

trinity/common/workflows/sudoku_generator.py

+    - Removes 'holes' positions to create a puzzle
+    """
+
+    BASE_SOLUTION = [


Relying on a single standard answer to generate Sudoku puzzles can easily lead to overfitting. Existing works (e.g., python-sudoku-generator-solver) can be referenced for the generation and evaluation parts.
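One way to avoid a fixed base solution is to fill an empty grid with randomized backtracking, so each call produces a different solved board. The sketch below is not the PR's generator and is not taken from the referenced project; `generate_puzzle` and its parameters are hypothetical names. Removing cells at random does not guarantee a unique solution, so it assumes a rule-based judge is used for scoring.

```python
import random
from typing import List, Optional, Tuple

Grid = List[List[int]]


def _can_place(grid: Grid, r: int, c: int, v: int,
               box_rows: int, box_cols: int) -> bool:
    """Return True if value v does not clash with row r, column c, or the box."""
    n = box_rows * box_cols
    if v in grid[r]:
        return False
    if any(grid[i][c] == v for i in range(n)):
        return False
    br, bc = r - r % box_rows, c - c % box_cols
    return all(grid[i][j] != v
               for i in range(br, br + box_rows)
               for j in range(bc, bc + box_cols))


def _fill(grid: Grid, rng: random.Random,
          box_rows: int, box_cols: int, pos: int = 0) -> bool:
    """Fill cells in row-major order, trying candidate values in random order."""
    n = box_rows * box_cols
    if pos == n * n:
        return True
    r, c = divmod(pos, n)
    values = list(range(1, n + 1))
    rng.shuffle(values)
    for v in values:
        if _can_place(grid, r, c, v, box_rows, box_cols):
            grid[r][c] = v
            if _fill(grid, rng, box_rows, box_cols, pos + 1):
                return True
            grid[r][c] = 0  # undo and try the next candidate
    return False


def generate_puzzle(holes: int, box_rows: int = 3, box_cols: int = 3,
                    seed: Optional[int] = None) -> Tuple[Grid, Grid]:
    """Return (puzzle, solution); `holes` cells of the puzzle are set to 0."""
    n = box_rows * box_cols
    if not 0 <= holes <= n * n:
        raise ValueError("holes must be between 0 and n*n")
    rng = random.Random(seed)
    solution = [[0] * n for _ in range(n)]
    _fill(solution, rng, box_rows, box_cols)
    puzzle = [row[:] for row in solution]
    for pos in rng.sample(range(n * n), holes):
        r, c = divmod(pos, n)
        puzzle[r][c] = 0
    return puzzle, solution
```

For example, `generate_puzzle(holes=8, box_rows=2, box_cols=2)` yields a 4x4 puzzle and `generate_puzzle(holes=40)` a 9x9 one; passing `seed` makes a sample reproducible for tests.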

pan-x-c · 2026-01-17T06:31:15Z

trinity/common/workflows/sudoku_workflow.py

+
+        for step in range(self.max_steps):
+            prompt = f"""
+Solve Sudoku by giving moves one at a time.


Prompts are important for an agentic workflow. They should precisely describe the game rules, the task to perform at each step, and the output format. In some cases a few-shot example may also be necessary. The prompt design can draw on the Frozen Lake example.
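As a rough illustration of these points, a per-step prompt could be assembled as below. The wording, the `(row, col, value)` output format, and the name `build_step_prompt` are all assumptions made for this sketch; they are not copied from the PR or from the Frozen Lake example.

```python
def build_step_prompt(board_text: str, step: int, max_steps: int,
                      box_rows: int = 3, box_cols: int = 3) -> str:
    """Build one step's prompt: rules, current state, task, and output format."""
    n = box_rows * box_cols
    return "\n".join([
        f"You are solving a {n}x{n} Sudoku puzzle.",
        "",
        "Rules:",
        f"- Each row must contain the numbers 1 to {n} exactly once.",
        f"- Each column must contain the numbers 1 to {n} exactly once.",
        f"- Each {box_rows}x{box_cols} box must contain the numbers 1 to {n} exactly once.",
        "- Cells that are already filled must not be changed.",
        "",
        f"Step {step} of {max_steps}. Current board ('.' marks an empty cell):",
        board_text,
        "",
        "Task: choose one empty cell and give the number that belongs there.",
        "Output format: a single line of the form (row, col, value),",
        "with rows and columns counted from 1.",
        "Example: (1, 2, 3) puts the number 3 in row 1, column 2.",
    ])
```

Keeping the rules, the state, and the format in separate blocks makes it easier to adjust one part, for instance to add a second worked example, without touching the others.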

AdnanQureshi3 · 2026-01-17

I’ve updated the Sudoku workflow by improving the generator to avoid a single canonical solution and refining the prompt to clearly describe the rules, step-wise task, and strict output format, inspired by the Frozen Lake example.

Please let me know if any further refinements are needed.

pan-x-c · 2026-01-19T13:13:39Z

Sorry for the late reply. The current workflow code structure basically meets the requirements, but some details can still be improved. For example:

Currently, self.board is put into the prompt directly as an array rather than as a rendered string like the one the Frozen Lake render produces. This may affect the model's understanding.
The generate function can be further improved. Although the difficulty (number of empty cells) can be adjusted, all puzzles are generated from a single standard answer, which may lead to overfitting.
The current design fills only one cell per step. This is simple, but since a Sudoku has many empty cells, it may lead to too many interaction rounds, an overly long context, and higher training costs. You might consider allowing the model to fill in multiple numbers at a time. Of course, this is just a personal suggestion, and the actual effect needs to be verified in practice.
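The first and third suggestions could look roughly like the sketch below: a text renderer for the board and a parser that accepts several moves in one reply. The helper names and the `(row, col, value)` move format are hypothetical, and `apply_moves` only checks bounds and that the cell is empty; checking the Sudoku rules is left to a separate judge.

```python
import re
from typing import List, Tuple

Grid = List[List[int]]
Move = Tuple[int, int, int]  # (row, col, value), rows and columns counted from 1


def render_board(board: Grid, box_rows: int = 3, box_cols: int = 3) -> str:
    """Render the board as text: '.' for empty cells, '|' and '-' between boxes."""
    rows = []
    for row in board:
        cells = []
        for c, v in enumerate(row):
            if c and c % box_cols == 0:
                cells.append("|")
            cells.append(str(v) if v else ".")
        rows.append(" ".join(cells))
    separator = "-" * len(rows[0])
    lines = []
    for r, line in enumerate(rows):
        if r and r % box_rows == 0:
            lines.append(separator)
        lines.append(line)
    return "\n".join(lines)


def parse_moves(text: str) -> List[Move]:
    """Collect every '(row, col, value)' triple found in the model's reply."""
    pattern = r"\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)"
    return [(int(r), int(c), int(v)) for r, c, v in re.findall(pattern, text)]


def apply_moves(board: Grid, moves: List[Move]) -> int:
    """Apply in-range moves that target empty cells; return how many were applied."""
    n = len(board)
    applied = 0
    for row, col, value in moves:
        r, c = row - 1, col - 1
        if 0 <= r < n and 0 <= c < n and 1 <= value <= n and board[r][c] == 0:
            board[r][c] = value
            applied += 1
    return applied
```

Returning the number of applied moves gives the workflow a simple signal for ending an episode early when a reply contains no usable move.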

If resources permit, I recommend running the workflow locally in debug mode to observe:

Whether the workflow can complete the game without errors (regardless of correctness).
Whether it can solve the Sudoku correctly with a certain probability (if all answers are wrong, RL training cannot proceed).

Additionally, since this example is relatively complex, I suggest converting some samples into unit tests to ensure the correctness of each module.
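As an example of what such unit tests might look like, the sketch below checks a move-legality helper on a small 4x4 board, which keeps the fixture readable. `is_legal_move` is a hypothetical helper defined here only so the tests are self-contained; the real tests would import the PR's own modules instead.

```python
import unittest
from typing import List

Grid = List[List[int]]


def is_legal_move(board: Grid, row: int, col: int, value: int,
                  box_rows: int = 2, box_cols: int = 2) -> bool:
    """Return True if `value` can go in the empty cell (row, col), 0-indexed."""
    n = box_rows * box_cols
    if not (0 <= row < n and 0 <= col < n and 1 <= value <= n):
        return False
    if board[row][col] != 0:
        return False  # cell already filled
    if value in board[row]:
        return False
    if any(board[r][col] == value for r in range(n)):
        return False
    br, bc = row - row % box_rows, col - col % box_cols
    return all(board[r][c] != value
               for r in range(br, br + box_rows)
               for c in range(bc, bc + box_cols))


class LegalMoveTest(unittest.TestCase):
    def setUp(self) -> None:
        # One given per 2x2 box; 0 marks an empty cell.
        self.board = [
            [1, 0, 0, 0],
            [0, 0, 2, 0],
            [0, 3, 0, 0],
            [0, 0, 0, 4],
        ]

    def test_row_conflict(self) -> None:
        self.assertFalse(is_legal_move(self.board, 0, 3, 1))

    def test_column_conflict(self) -> None:
        self.assertFalse(is_legal_move(self.board, 3, 0, 1))

    def test_box_conflict(self) -> None:
        self.assertFalse(is_legal_move(self.board, 1, 1, 1))

    def test_occupied_cell(self) -> None:
        self.assertFalse(is_legal_move(self.board, 0, 0, 2))

    def test_value_out_of_range(self) -> None:
        self.assertFalse(is_legal_move(self.board, 1, 1, 5))

    def test_legal_move(self) -> None:
        self.assertTrue(is_legal_move(self.board, 1, 1, 4))
```

Saved as a test module, this can be run with `python -m unittest`; each test isolates one rule so a failure points at the specific check that broke.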

pan-x-c · 2026-01-19T13:22:48Z

If you find the 9x9 setting too difficult, you can try a 4x4 or 6x6 setting instead.

This Leaderboard may help you build the workflow.

feature(workflows): added a agentic workflow for sudoku

1bdcf82

pan-x-c reviewed Jan 16, 2026

feature: added Sudoku generator, and judge

fe71316

pan-x-c reviewed Jan 17, 2026

feature(workflows): improve SudokuWorkflow prompt and generator

3b8796d
