
Conversation

@adamamer20 adamamer20 (Member) commented Dec 8, 2025

This PR introduces:

  1. A single, unified Typer CLI for running Mesa-Frames models with consistent options for agents, steps, seeds, plotting, and result persistence:
     • a `run` command with standardised options
       (`--agents`, `--steps`, `--seed`, `--plot`, `--save-results`, `--results-dir`);
     • automatic generation of plots (light + dark themes) for agent-level metrics and backend performance.
  2. A first version of the benchmarks CLI, allowing quick comparison of Mesa vs Frames backends on reference models (boltzmann, sugarscape) with automated CSV output and scaling plots (see the programmatic sketch below).
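As a rough illustration of the intended workflow, here is a minimal programmatic sketch of the benchmarks command. The parameter names follow the `run()` signature quoted later in this thread; the import path, and the assumption that `benchmarks` is importable from the repository root, are mine.

```python
# Hypothetical direct call to the benchmarks CLI command (sketch only; assumes the
# project root is on sys.path so that benchmarks.cli can be imported).
from benchmarks.cli import run

run(
    models="boltzmann",        # or "sugarscape" / "all"
    agents="1000:5000:1000",   # start:stop:step range of agent counts
    steps=100,
    repeats=3,
    seed=42,
    save=True,                 # write per-model CSVs under a timestamped results dir
    plot=True,                 # render light/dark scaling plots
)
```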

Summary by CodeRabbit

  • New Features

    • Added CLI-driven performance benchmarking between backends, runnable example models (Boltzmann Wealth, Sugarscape IG) for both backends, and reusable plotting utilities for model/agent/performance metrics.
  • Documentation

    • New comprehensive READMEs describing benchmarks, examples, CLI options, outputs, and extension tips.
  • Chores

    • Updated ignore patterns and documentation tooling config (docs dependency added).


- Implemented a new backend using Mesa with sequential updates in `examples/sugarscape_ig/backend_mesa`.
- Created agent and model classes for the Sugarscape simulation, including movement and sugar management.
- Added a CLI interface using Typer for running simulations and saving results.
- Introduced utility classes for handling simulation results from both Mesa and Mesa-Frames backends.
- Added a new backend using Mesa-Frames with parallel updates in `examples/sugarscape_ig/backend_frames`.
- Implemented model-level reporters for Gini coefficient and correlations between agent traits.
- Included CSV output and plotting capabilities for simulation metrics.
codecov bot commented Dec 8, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.23%. Comparing base (172cf28) to head (991eb8a).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #188   +/-   ##
=======================================
  Coverage   89.23%   89.23%           
=======================================
  Files          14       14           
  Lines        2007     2007           
=======================================
  Hits         1791     1791           
  Misses        216      216           


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


@adamamer20 adamamer20 (Member, Author) commented:

@coderabbitai review

coderabbitai bot commented Dec 8, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot commented Dec 8, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a benchmarks CLI, example models (Boltzmann and Sugarscape) with both Mesa and mesa-frames backends, shared plotting/utilities, documentation, and minor config updates (pyproject, .gitignore). New Typer CLIs and data/result scaffolding produce timestamped CSVs and plots.

Changes

  • Configuration & Dependencies: `.gitignore`, `pyproject.toml`
    Updated gitignore glob patterns for benchmarks/examples artifacts; added typer>=0.9.0 to docs dependency group.
  • Benchmarks CLI & docs: `benchmarks/cli.py`, `benchmarks/README.md`
    New Typer-based benchmarking CLI (multi-model, multi-backend, repeats, seeding, CSV output, plotting) and README describing usage, output layout, CLI options, CSV schema, and extension notes.
  • Examples root & utilities: `examples/__init__.py`, `examples/utils.py`, `examples/plotting.py`, `examples/README.md`
    Package initializer exposing example symbols; new simulation result dataclasses (FramesSimulationResult, MesaSimulationResult); unified plotting helpers (plot_model_metrics, plot_agent_metrics, plot_performance) with multi-theme saving; examples README.
  • Boltzmann Wealth (Frames & Mesa): `examples/boltzmann_wealth/backend_frames.py`, `examples/boltzmann_wealth/backend_mesa.py`, `examples/boltzmann_wealth/README.md`
    Added Frames and Mesa implementations of Boltzmann Wealth model, each with simulate() and Typer CLI, gini computation, optional CSV/plot outputs, and run instrumentation.
  • Sugarscape IG — Frames backend: `examples/sugarscape_ig/backend_frames/__init__.py`, `examples/sugarscape_ig/backend_frames/agents.py`, `examples/sugarscape_ig/backend_frames/model.py`, `examples/sugarscape_ig/README.md`
    New mesa-frames Sugarscape IG package: AntsBase/AntsParallel with vectorised neighborhood/ranking and iterative conflict-resolution rounds; Sugarscape model with instant-growback sugar logic, reporters (gini, correlations), simulate() and Typer CLI; README.
  • Sugarscape IG — Mesa backend: `examples/sugarscape_ig/backend_mesa/__init__.py`, `examples/sugarscape_ig/backend_mesa/agents.py`, `examples/sugarscape_ig/backend_mesa/model.py`
    New Mesa-based Sugarscape IG: AntAgent movement/tie-break logic, Sugarscape model with DataCollector reporters, simulate(), Typer CLI, and plotting integration.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant CLI as benchmarks/cli.py
    participant ModelCfg as MODELS
    participant Backend as Backend Runner
    participant Sim as Simulation (Mesa/Frames)
    participant DC as DataCollector
    participant IO as File I/O (CSV/plots)
    participant Plot as examples/plotting.py

    User->>CLI: run(models, agents, steps, repeats, seed, save, plot)
    CLI->>CLI: parse inputs (_parse_agents/_parse_models)
    CLI->>ModelCfg: select model configs
    loop models
        CLI->>Backend: choose backend runner
        loop repeats
            CLI->>Backend: runner(agents, steps, seed)
            activate Backend
            Backend->>Sim: instantiate & run simulation
            Sim->>DC: collect metrics during run
            Sim-->>Backend: return runtime & datacollector
            Backend-->>CLI: return runtime, datacollector
            deactivate Backend
            CLI->>CLI: record runtime row
        end
        CLI->>CLI: aggregate runtimes into DataFrame
        alt save
            CLI->>IO: write CSVs to timestamped results_dir
        end
        alt plot
            CLI->>Plot: plot_performance(df, model_name, output_dir, timestamp)
            Plot->>IO: save PNG/SVG (light/dark themes)
        end
    end
    CLI->>User: report completion and paths
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Areas to focus review on:

  • benchmarks/cli.py — input parsing, seeding, repeat logic, aggregation and file I/O.
  • examples/sugarscape_ig/backend_frames/agents.py — parallel movement ranking and iterative conflict-resolution correctness and edge cases.
  • examples/plotting.py — DataFrame melting, multi-theme saving, and filename/metadata conventions.
  • Cross-backend consistency between Frames and Mesa implementations (metric definitions, CSV schema, timestamps).


Suggested labels

docs

Suggested reviewers

  • EwoutH
  • Ben-geo

Poem

🐰
I hopped through models, plots in paw,
Tweaked backends, CSVs, and awe,
Frames and Mesa side by side,
Benchmarks logged — a joyful stride,
Carrots for code, let's run and draw! 🎨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 49.21%, which is insufficient; the required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.

✅ Passed checks (2 passed)

  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check (✅ Passed): The PR title clearly describes the main objective: introducing a unified CLI interface for examples, adding plotting functionality, and creating a benchmarking system. It accurately summarizes the primary changes across the changeset.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 10

🧹 Nitpick comments (12)
examples/__init__.py (1)

1-6: Clean and simple package initializer.

The module docstring and __all__ list are properly structured and follow Python conventions. The subpackage names use the correct snake_case naming style.

Note: The __all__ list declares these as public API surface, but without explicit imports, users will need to access them via the full path (e.g., import examples.boltzmann_wealth) rather than from examples import boltzmann_wealth. If direct importing from the examples package is desired, consider adding explicit imports:

from . import boltzmann_wealth, sugarscape_ig

This is optional based on your intended API—just ensure the usage pattern is intentional and documented.

examples/sugarscape_ig/backend_frames/agents.py (2)

298-298: Redundant .with_columns(pl.col("radius")) does nothing.

This line selects radius without any transformation, effectively a no-op.

-            .with_columns(pl.col("radius"))

311-311: Minor: Non-breaking hyphen in comment.

The comment contains a non-breaking hyphen (‑, U+2011) instead of a regular hyphen-minus (-). This is flagged by Ruff (RUF003) but is purely cosmetic.

-        # Precompute per‑agent candidate rank once so conflict resolution can
+        # Precompute per-agent candidate rank once so conflict resolution can
examples/boltzmann_wealth/backend_frames.py (1)

5-13: Non-standard import order.

Standard library imports (os, time) are interleaved with third-party imports. PEP 8 recommends grouping imports: stdlib first, then third-party.

 from datetime import datetime, timezone
+import os
 from pathlib import Path
+from time import perf_counter
 from typing import Annotated
 
 import numpy as np
-import os
 import polars as pl
 import typer
-from time import perf_counter
examples/plotting.py (3)

71-75: Consider logging the exception instead of silently ignoring it.

The try-except-pass pattern suppresses all errors during SVG export. While SVG is optional, logging the exception would aid debugging when exports fail unexpectedly.

-    try:
-        fig.savefig(output_dir / f"{stem}_{theme}.svg", bbox_inches="tight")
-    except Exception:
-        pass  # SVG is a nice-to-have
+    try:
+        fig.savefig(output_dir / f"{stem}_{theme}.svg", bbox_inches="tight")
+    except Exception as exc:  # noqa: BLE001
+        import logging
+        logging.debug("SVG export failed for %s: %s", stem, exc)

109-114: Bare Exception catch may mask unexpected errors.

The intent is to handle missing/malformed step data, but catching Exception could hide unrelated bugs. Consider catching more specific exceptions or at minimum adding a # noqa: BLE001 comment to acknowledge this is intentional.

     if steps is None:
         try:
             steps = int(metrics.select(pl.col("step").max()).item()) + 1
-        except Exception:
+        except (TypeError, ValueError, pl.exceptions.ComputeError):
             steps = None

286-290: Sort __all__ for consistency with isort conventions.

Static analysis flagged this as unsorted. Sorting maintains consistency with standard Python tooling.

 __all__ = [
+    "plot_agent_metrics",
     "plot_model_metrics",
-    "plot_agent_metrics",
     "plot_performance",
 ]
examples/sugarscape_ig/backend_mesa/model.py (2)

57-72: Docstring mismatch: function signature differs from similar functions.

The docstring for gini doesn't follow NumPy style as required by the coding guidelines. Consider adding Parameters/Returns sections for consistency with public APIs.

 def gini(values: Iterable[float]) -> float:
+    """Compute the Gini coefficient from an iterable of wealth values.
+
+    Parameters
+    ----------
+    values : Iterable[float]
+        Iterable of wealth/sugar values.
+
+    Returns
+    -------
+    float
+        Gini coefficient in [0, 1], or 0.0 for zero-total/constant values,
+        or NaN for empty input.
+    """
     array = np.fromiter(values, dtype=float)

132-134: Unused lambda parameter m – consider using _ convention.

The seed reporter lambda captures seed from the enclosing scope but declares an unused parameter m. Using _ makes the intent explicit.

-                "seed": lambda m: seed,
+                "seed": lambda _: seed,
examples/boltzmann_wealth/backend_mesa.py (2)

24-40: Consider extracting gini to a shared module to reduce duplication.

This gini function is identical to the one in examples/sugarscape_ig/backend_mesa/model.py (lines 57-72). Extracting it to examples/utils.py would improve maintainability.

You could add to examples/utils.py:

def gini(values: Iterable[float]) -> float:
    """Compute the Gini coefficient from an iterable of wealth values."""
    array = np.fromiter(values, dtype=float)
    if array.size == 0:
        return float("nan")
    if np.allclose(array, 0.0):
        return 0.0
    if np.allclose(array, array[0]):
        return 0.0
    sorted_vals = np.sort(array)
    n = sorted_vals.size
    cumulative = np.cumsum(sorted_vals)
    total = cumulative[-1]
    if total == 0:
        return 0.0
    index = np.arange(1, n + 1, dtype=float)
    return float((2.0 * np.dot(index, sorted_vals) / (n * total)) - (n + 1) / n)

Then import from both backends.
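For reference, the final line of that helper is the standard closed form of the Gini coefficient over the sorted values $x_{(1)} \le \dots \le x_{(n)}$:

$$
G = \frac{2\sum_{i=1}^{n} i\,x_{(i)}}{n\sum_{i=1}^{n} x_{(i)}} - \frac{n+1}{n}
$$

which is 0 for a perfectly equal distribution and approaches 1 as all wealth concentrates in a single agent.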


74-77: Unused lambda parameter m – use _ convention.

Same issue as in the Sugarscape model.

             model_reporters={
                 "gini": lambda m: gini(a.wealth for a in m.agent_list),
-                "seed": lambda m: seed,
+                "seed": lambda _: seed,
             }
benchmarks/cli.py (1)

97-100: Range generation logic could be simplified.

The current approach generates range(start, stop + step, step) then pops if the last value exceeds stop. This is correct but less idiomatic than using range(start, stop + 1, step) directly for inclusive ranges.

-        counts = list(range(start, stop + step, step))
-        if counts[-1] > stop:
-            counts.pop()
-        return counts
+        return list(range(start, stop + 1, step))

This produces the same result for an inclusive [start, stop] range with the given step.
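A quick check of that equivalence, assuming a positive step:

```python
# Both constructions yield the same inclusive [start, stop] range for a positive step.
start, stop, step = 1000, 5000, 1000

counts = list(range(start, stop + step, step))
if counts[-1] > stop:
    counts.pop()

assert counts == list(range(start, stop + 1, step)) == [1000, 2000, 3000, 4000, 5000]
```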

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3037456 and aadee32.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • .gitignore (1 hunks)
  • benchmarks/README.md (1 hunks)
  • benchmarks/cli.py (1 hunks)
  • examples/README.md (1 hunks)
  • examples/__init__.py (1 hunks)
  • examples/boltzmann_wealth/README.md (1 hunks)
  • examples/boltzmann_wealth/backend_frames.py (1 hunks)
  • examples/boltzmann_wealth/backend_mesa.py (1 hunks)
  • examples/plotting.py (1 hunks)
  • examples/sugarscape_ig/README.md (1 hunks)
  • examples/sugarscape_ig/backend_frames/__init__.py (1 hunks)
  • examples/sugarscape_ig/backend_frames/agents.py (1 hunks)
  • examples/sugarscape_ig/backend_frames/model.py (1 hunks)
  • examples/sugarscape_ig/backend_mesa/__init__.py (1 hunks)
  • examples/sugarscape_ig/backend_mesa/agents.py (1 hunks)
  • examples/sugarscape_ig/backend_mesa/model.py (1 hunks)
  • examples/utils.py (1 hunks)
  • pyproject.toml (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use Python 3.11 or later with 4-space indentation
Include type hints for all public APIs
Use NumPy-style docstrings (validated by Ruff/pydoclint)
Use CamelCase for class names
Use snake_case for function and attribute names
Avoid using TYPE_CHECKING guards for type annotations because the project uses beartype for runtime type checking, which requires the actual type objects to be available at runtime
Treat underscored attributes as internal/private and not part of the public API

Files:

  • examples/utils.py
  • examples/__init__.py
  • examples/sugarscape_ig/backend_mesa/agents.py
  • examples/plotting.py
  • examples/sugarscape_ig/backend_mesa/__init__.py
  • examples/boltzmann_wealth/backend_frames.py
  • examples/sugarscape_ig/backend_mesa/model.py
  • examples/sugarscape_ig/backend_frames/agents.py
  • examples/sugarscape_ig/backend_frames/model.py
  • examples/boltzmann_wealth/backend_mesa.py
  • examples/sugarscape_ig/backend_frames/__init__.py
  • benchmarks/cli.py
🧠 Learnings (3)
📚 Learning: 2025-12-08T18:41:11.772Z
Learnt from: CR
Repo: projectmesa/mesa-frames PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-08T18:41:11.772Z
Learning: Use `examples/` directory for reproducible demo models and performance scripts

Applied to files:

  • examples/README.md
  • examples/__init__.py
📚 Learning: 2025-12-08T18:41:11.772Z
Learnt from: CR
Repo: projectmesa/mesa-frames PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-08T18:41:11.772Z
Learning: Use `mesa_frames/` as the source package directory, with `abstract/` and `concrete/` subdirectories for core APIs and implementations, and key modules: `agents.py`, `agentset.py`, `space.py`, `datacollector.py`, `types_.py`

Applied to files:

  • examples/README.md
  • benchmarks/README.md
  • examples/utils.py
  • examples/sugarscape_ig/README.md
  • examples/sugarscape_ig/backend_mesa/agents.py
  • examples/sugarscape_ig/backend_mesa/__init__.py
  • examples/boltzmann_wealth/backend_frames.py
  • examples/sugarscape_ig/backend_mesa/model.py
  • examples/sugarscape_ig/backend_frames/agents.py
  • examples/sugarscape_ig/backend_frames/model.py
  • examples/sugarscape_ig/backend_frames/__init__.py
📚 Learning: 2025-04-29T09:25:34.183Z
Learnt from: adamamer20
Repo: projectmesa/mesa-frames PR: 143
File: mesa_frames/abstract/space.py:50-63
Timestamp: 2025-04-29T09:25:34.183Z
Learning: The project mesa-frames has been upgraded to Python 3.11, which provides native support for `Self` type in the standard typing module, eliminating the need for imports from typing_extensions.

Applied to files:

  • examples/utils.py
  • examples/sugarscape_ig/backend_frames/__init__.py
🧬 Code graph analysis (7)
examples/utils.py (1)
mesa_frames/concrete/datacollector.py (1)
  • DataCollector (69-640)
examples/sugarscape_ig/backend_mesa/agents.py (3)
examples/sugarscape_ig/backend_frames/model.py (2)
  • Sugarscape (174-386)
  • step (333-350)
examples/sugarscape_ig/backend_mesa/model.py (2)
  • Sugarscape (75-187)
  • step (172-181)
examples/sugarscape_ig/backend_frames/agents.py (3)
  • step (55-71)
  • move (73-80)
  • move (131-185)
examples/boltzmann_wealth/backend_frames.py (3)
mesa_frames/concrete/datacollector.py (1)
  • DataCollector (69-640)
examples/utils.py (1)
  • FramesSimulationResult (7-14)
examples/boltzmann_wealth/backend_mesa.py (1)
  • gini (24-40)
examples/sugarscape_ig/backend_frames/agents.py (3)
mesa_frames/concrete/agentset.py (1)
  • AgentSet (76-686)
mesa_frames/concrete/model.py (1)
  • Model (54-223)
examples/sugarscape_ig/backend_mesa/agents.py (1)
  • move (75-78)
examples/sugarscape_ig/backend_frames/model.py (3)
examples/utils.py (1)
  • FramesSimulationResult (7-14)
examples/plotting.py (1)
  • plot_model_metrics (81-171)
mesa_frames/abstract/datacollector.py (2)
  • collect (118-128)
  • flush (176-196)
examples/boltzmann_wealth/backend_mesa.py (4)
examples/utils.py (1)
  • MesaSimulationResult (18-25)
examples/plotting.py (1)
  • plot_model_metrics (81-171)
examples/boltzmann_wealth/backend_frames.py (3)
  • gini (24-42)
  • step (52-76)
  • step (106-108)
examples/sugarscape_ig/backend_frames/model.py (2)
  • gini (30-71)
  • step (333-350)
benchmarks/cli.py (3)
examples/plotting.py (1)
  • plot_performance (234-283)
examples/boltzmann_wealth/backend_frames.py (5)
  • simulate (115-124)
  • step (52-76)
  • step (106-108)
  • run (110-112)
  • run (131-186)
examples/boltzmann_wealth/backend_mesa.py (5)
  • simulate (92-97)
  • step (50-57)
  • step (81-85)
  • run (87-89)
  • run (104-177)
🪛 LanguageTool
examples/boltzmann_wealth/README.md

[grammar] ~57-~57: Use a hyphen to join words.
Context: ...rics - --results-dir Override the auto timestamped directory under results/ ...

(QB_NEW_EN_HYPHEN)

examples/sugarscape_ig/README.md

[grammar] ~50-~50: Use a hyphen to join words.
Context: ...alive` typically decreases until a quasi steady state (metabolism vs regrowth) or...

(QB_NEW_EN_HYPHEN)

🪛 Ruff (0.14.7)
examples/sugarscape_ig/backend_mesa/agents.py

21-21: Undefined name Sugarscape

(F821)

examples/plotting.py

73-74: try-except-pass detected, consider logging the exception

(S110)


73-73: Do not catch blind exception: Exception

(BLE001)


113-113: Do not catch blind exception: Exception

(BLE001)


273-273: Comment contains ambiguous `–` (EN DASH). Did you mean `-` (HYPHEN-MINUS)?

(RUF003)


286-290: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

examples/sugarscape_ig/backend_mesa/model.py

134-134: Unused lambda argument: m

(ARG005)


260-260: Local variable seed_val is assigned to but never used

Remove assignment to unused variable seed_val

(F841)

examples/sugarscape_ig/backend_frames/agents.py

50-52: Avoid specifying long messages outside the exception class

(TRY003)


311-311: Comment contains ambiguous `‑` (NON-BREAKING HYPHEN). Did you mean `-` (HYPHEN-MINUS)?

(RUF003)

examples/sugarscape_ig/backend_frames/model.py

227-229: Avoid specifying long messages outside the exception class

(TRY003)

examples/boltzmann_wealth/backend_mesa.py

76-76: Unused lambda argument: m

(ARG005)


138-138: Comment contains ambiguous `‑` (NON-BREAKING HYPHEN). Did you mean `-` (HYPHEN-MINUS)?

(RUF003)


143-143: Comment contains ambiguous `‑` (NON-BREAKING HYPHEN). Did you mean `-` (HYPHEN-MINUS)?

(RUF003)

benchmarks/cli.py

86-86: Avoid specifying long messages outside the exception class

(TRY003)


90-90: Avoid specifying long messages outside the exception class

(TRY003)


92-92: Avoid specifying long messages outside the exception class

(TRY003)


94-94: Avoid specifying long messages outside the exception class

(TRY003)


96-96: Avoid specifying long messages outside the exception class

(TRY003)


104-104: Avoid specifying long messages outside the exception class

(TRY003)


106-106: Avoid specifying long messages outside the exception class

(TRY003)


126-126: Avoid specifying long messages outside the exception class

(TRY003)


129-129: Avoid specifying long messages outside the exception class

(TRY003)


197-197: Do not perform function call in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable

(B008)

🔇 Additional comments (43)
.gitignore (1)

158-162: Ignore patterns updated appropriately for nested results/plots output.

The glob patterns now correctly accommodate timestamped subdirectories (benchmarks/results/TIMESTAMP/... and examples/*/results/TIMESTAMP/...) and plot outputs, aligning with the new CLI-driven output structure described in the READMEs.

examples/README.md (1)

1-106: Well-structured and comprehensive examples documentation.

The README clearly explains the dual-backend approach, CLI interface, data formats, and extension patterns. The quick start, tips, and programmatic use examples are practical and well-documented.

pyproject.toml (1)

66-85: Typer dependency added appropriately to docs group.

The placement in the docs dependency group aligns with using Typer for CLI tooling in examples and benchmarks. The version constraint >=0.9.0 is reasonable and not overly restrictive.

Please confirm that the examples and benchmarks code actually imports and uses Typer as documented in the READMEs (e.g., benchmarks/cli.py, examples/boltzmann_wealth/backend_frames.py), since those files are not visible in this review context.

examples/sugarscape_ig/backend_mesa/__init__.py (1)

1-1: Appropriate minimal package initializer.

The module docstring correctly identifies the package purpose. No additional code or exports are needed at this stage.

benchmarks/README.md (1)

1-88: Comprehensive and well-structured benchmarking documentation.

The README clearly explains purpose, CLI usage, output formats, and extension patterns. The table of CLI options and CSV schema documentation are particularly helpful for reproducibility and post-processing.

examples/sugarscape_ig/backend_frames/__init__.py (1)

1-1: Appropriate minimal package initializer.

The module docstring correctly identifies the Frames backend package. No additional code or exports are needed at this stage.

examples/sugarscape_ig/backend_mesa/agents.py (3)

34-47: LGTM!

The _visible_cells method correctly implements cardinal direction visibility with proper boundary checks.


49-73: LGTM!

The _choose_best_cell method correctly implements the Sugarscape movement rule with proper tie-breaking by sugar amount, Manhattan distance, and lexicographic coordinates.
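A minimal sketch of that tie-breaking order, using hypothetical names and plain Python data structures rather than the PR's actual classes:

```python
def choose_best_cell(
    pos: tuple[int, int],
    visible: list[tuple[int, int]],
    sugar_at: dict[tuple[int, int], int],
) -> tuple[int, int]:
    """Most sugar wins; ties go to the closer cell, then to lexicographic coordinates."""

    def rank(cell: tuple[int, int]) -> tuple[int, int, tuple[int, int]]:
        manhattan = abs(cell[0] - pos[0]) + abs(cell[1] - pos[1])
        # Higher sugar first, then smaller distance, then smaller coordinates.
        return (-sugar_at[cell], manhattan, cell)

    return min(visible, key=rank)


# Two equally sweet cells: the nearer one is picked.
assert choose_best_cell((0, 0), [(0, 3), (0, 1)], {(0, 3): 4, (0, 1): 4}) == (0, 1)
```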


75-78: LGTM!

The move method is clean and delegates appropriately to the helper methods.

examples/utils.py (1)

6-25: LGTM!

The dataclasses provide clean containers for simulation results with appropriate type hints. The docstrings could be updated to reference datacollector.data instead of metrics/agent_metrics for clarity, but this is a minor documentation nit.

examples/sugarscape_ig/backend_frames/agents.py (7)

30-53: LGTM!

Good defensive validation of required trait columns with a clear error message. Using clone() on the input DataFrame prevents unintended side effects.


55-71: LGTM!

The step sequence (shuffle → move → eat → remove starved) correctly implements the Sugarscape update order as documented.


82-116: LGTM!

The eat method correctly implements vectorised sugar harvesting with proper updates to both agent sugar stocks and cell sugar values.


118-127: LGTM!

Clean vectorised removal of starved agents.


131-185: LGTM!

The move method is well-structured with clear early exits and proper delegation to helper methods. The inline schema comments enhance readability.


187-248: LGTM!

The _build_neighborhood_frame method correctly assembles the neighborhood data with proper joins and null handling.


348-620: LGTM!

The conflict resolution algorithm is well-implemented with proper fallback handling. The iterative lottery approach with rank promotion ensures all agents are eventually assigned, either to a preferred cell or their origin. The inline schema comments significantly aid comprehension.
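For readers new to this pattern, here is a much-simplified, non-vectorised sketch of an iterative lottery with rank promotion; the names and data layout are invented for illustration and this is not the PR's Polars implementation:

```python
import random


def resolve_moves(
    preferences: dict[int, list[tuple[int, int]]],  # agent -> candidate cells, best first
    origins: dict[int, tuple[int, int]],            # agent -> current cell (fallback)
    rng: random.Random,
) -> dict[int, tuple[int, int]]:
    assignment: dict[int, tuple[int, int]] = {}
    taken: set[tuple[int, int]] = set()
    pending = {agent: list(cands) for agent, cands in preferences.items()}

    while pending:
        bids: dict[tuple[int, int], list[int]] = {}
        for agent, cands in list(pending.items()):
            cands = [c for c in cands if c not in taken]  # promote past cells already claimed
            pending[agent] = cands
            if cands:
                bids.setdefault(cands[0], []).append(agent)
            else:
                assignment[agent] = origins[agent]        # nothing left: stay at origin
                del pending[agent]
        for cell, contenders in bids.items():
            winner = rng.choice(contenders)               # lottery among agents wanting this cell
            assignment[winner] = cell
            taken.add(cell)
            del pending[winner]
            # Losers keep their remaining ranked candidates and retry next round.
    return assignment
```

Each round at least one pending agent is resolved (it either wins a cell or falls back to its origin), so the loop terminates with every agent assigned.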

examples/sugarscape_ig/backend_frames/model.py (4)

30-71: LGTM with minor redundancy.

The function is robust with multiple safety checks. Lines 60-65 contain redundant size checks (already verified at line 58), but this defensive approach is acceptable for example code.


74-140: LGTM!

Both correlation reporters are correctly implemented with appropriate null checks. The similar structure between them is acceptable for readability in example code.


143-171: LGTM!

The _safe_corr helper correctly handles edge cases (insufficient data, constant values) that would make Pearson correlation undefined.


174-386: LGTM!

The Sugarscape class is well-structured with:

  • Proper validation of agent count vs grid capacity
  • Clean separation of grid generation, agent creation, and data collection
  • Correct instant-growback sugar regrowth logic
  • Appropriate termination handling when agents die out
examples/boltzmann_wealth/backend_frames.py (4)

24-42: LGTM!

The gini function correctly handles edge cases and matches the pattern from the Mesa backend implementation.


45-76: LGTM!

The MoneyAgents class correctly implements the Boltzmann wealth exchange using vectorised Polars operations. The sampling uses the model RNG for reproducibility.
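As a rough illustration of what such a vectorised exchange can look like (the `wealth` column name and DataFrame layout are assumptions, and recipients may include the donor itself in this simplified version; this is not the PR's code):

```python
import numpy as np
import polars as pl


def exchange_step(agents: pl.DataFrame, rng: np.random.Generator) -> pl.DataFrame:
    wealth = agents["wealth"].to_numpy()
    donors = wealth > 0                                      # every positive-wealth agent gives 1 unit
    if not donors.any():
        return agents
    recipients = rng.integers(0, wealth.size, size=int(donors.sum()))
    gains = np.bincount(recipients, minlength=wealth.size)  # units received per agent
    new_wealth = wealth - donors.astype(wealth.dtype) + gains
    return agents.with_columns(pl.Series("wealth", new_wealth))


rng = np.random.default_rng(42)
frame = pl.DataFrame({"wealth": [1] * 5})
frame = exchange_step(frame, rng)
assert frame["wealth"].sum() == 5  # total wealth is conserved
```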


79-112: LGTM!

The MoneyModel class correctly integrates the agent set with data collection, with appropriate storage backend selection based on results_dir.


115-124: LGTM!

Clean simulation function that correctly wires the model and returns the result container.

examples/plotting.py (5)

17-44: LGTM! Theme configuration is well-structured.

The light/dark theme definitions with consistent styling (disabled top/right spines, appropriate colors) provide a good foundation for the plotting utilities.


47-56: LGTM! Seed shortening helper is correctly implemented.

The regex-based approach handles edge cases (None, no match) and truncates long seeds appropriately.


81-171: LGTM! plot_model_metrics implementation is comprehensive.

The function correctly handles:

  • Empty dataframes (early return)
  • Missing step column (auto-generation)
  • Single vs multiple metrics (legend removal, y-axis labeling)
  • Title composition with optional N/T metadata
  • Multi-theme output

177-228: LGTM! plot_agent_metrics handles id variable detection well.

The fallback to the first column when preferred id vars aren't present is a sensible default.


234-283: LGTM! plot_performance implementation is solid.

The function correctly uses seaborn's estimator="mean" with errorbar="sd" for aggregating repeated measurements, and applies appropriate legend styling per theme.
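For context, the aggregation described here could be expressed with a seaborn call along these lines (sketch with a tiny synthetic table; column names follow the benchmark row schema quoted elsewhere in this thread, and `errorbar=` requires seaborn >= 0.12):

```python
import pandas as pd
import seaborn as sns

# Synthetic benchmark table: two backends, two agent counts, two repeats each.
df = pd.DataFrame(
    {
        "agents": [1000, 1000, 2000, 2000] * 2,
        "backend": ["mesa"] * 4 + ["mesa-frames"] * 4,
        "runtime_seconds": [1.2, 1.3, 2.5, 2.6, 0.4, 0.5, 0.9, 1.0],
    }
)

ax = sns.lineplot(
    data=df,
    x="agents",
    y="runtime_seconds",
    hue="backend",
    estimator="mean",   # mean runtime across repeats at each agent count
    errorbar="sd",      # shaded band shows the standard deviation across repeats
    marker="o",
)
```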

examples/sugarscape_ig/backend_mesa/model.py (5)

31-42: LGTM! _safe_corr handles edge cases correctly.

The function properly handles degenerate inputs (too few elements, constant arrays) by returning NaN, matching the documented Frames helper behavior.
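A minimal sketch of that behavior (illustrative, not the PR's exact helper):

```python
import numpy as np


def safe_corr(x: np.ndarray, y: np.ndarray) -> float:
    if x.size < 2 or y.size < 2:
        return float("nan")                       # too few points for a correlation
    if np.allclose(x, x[0]) or np.allclose(y, y[0]):
        return float("nan")                       # constant series: Pearson r is undefined
    return float(np.corrcoef(x, y)[0, 1])


assert np.isnan(safe_corr(np.array([1.0]), np.array([2.0])))
assert np.isnan(safe_corr(np.array([3.0, 3.0, 3.0]), np.array([1.0, 2.0, 3.0])))
assert abs(safe_corr(np.array([1.0, 2.0, 3.0]), np.array([2.0, 4.0, 6.0])) - 1.0) < 1e-9
```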


45-54: Forward reference Sugarscape requires string annotation or import reordering.

Sugarscape is used as a type hint before the class is defined. While from __future__ import annotations (line 9) enables postponed evaluation making this work, ensure this import remains at the top.


75-144: LGTM! Sugarscape model initialization is well-structured.

The model correctly:

  • Generates a seed if not provided
  • Initializes sugar fields with NumPy RNG
  • Places agents on empty cells
  • Sets up comprehensive DataCollector reporters

148-181: LGTM! Step scheduling logic matches documented order.

The move → eat/starve → regrow → collect sequence matches the docstring and tutorial schedule. The survivor filtering in _harvest_and_survive is correctly implemented.


209-311: LGTM! CLI implementation is comprehensive and consistent.

The Typer CLI follows the same patterns as the Boltzmann example:

  • Timestamped output directories
  • Optional CSV persistence and plotting
  • Per-metric plot generation
  • Clear console output
examples/boltzmann_wealth/backend_mesa.py (3)

43-57: LGTM! MoneyAgent implementation is correct.

The agent correctly:

  • Initializes with 1 unit of wealth
  • Guards against transferring with zero/negative wealth
  • Randomly selects a recipient from the agent list

60-90: LGTM! MoneyModel implementation is clean.

The model correctly initializes agents, sets up the DataCollector, and implements the shuffled step execution.


103-177: LGTM! CLI implementation follows consistent patterns.

The run command properly:

  • Creates timestamped output directories
  • Measures and reports runtime
  • Extracts and displays metrics
  • Saves CSV and generates plots conditionally
benchmarks/cli.py (5)

27-34: LGTM! Runner protocol is well-defined.

The RunnerP protocol cleanly abstracts the runner interface for both Mesa and Frames backends.


43-78: LGTM! Model configuration with lambda wrappers is appropriate.

The lambda wrappers for Sugarscape runners correctly adapt the width/height requirements to match the simpler RunnerP signature. The grid sizing formula (max(20, ceil(sqrt(agents)) * 2)) ensures adequate space.
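A quick worked example of that sizing formula:

```python
import math

agents = 1000
side = max(20, math.ceil(math.sqrt(agents)) * 2)  # ceil(sqrt(1000)) = 32, so side = 64
assert side * side >= agents                      # 64 x 64 = 4096 cells for 1000 agents
```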


200-205: LGTM! Runtime type-checking warning is helpful.

Good practice to warn users that beartype runtime checking can significantly impact benchmark results.


213-244: LGTM! Benchmark loop implementation is correct.

The nested loop correctly iterates over models, agent counts, repeats, and backends, recording all relevant metadata for each run.


246-262: LGTM! Output handling is well-organized.

The unified timestamp directory structure with per-model CSVs and a plots subfolder is clean and maintainable.

@adamamer20 adamamer20 added the `examples` label (additions or modifications to sample models) on Dec 8, 2025
@adamamer20 adamamer20 added this to the 0.2.0 milestone Dec 8, 2025
@adamamer20 adamamer20 requested review from Ben-geo and EwoutH December 8, 2025 23:29
@EwoutH EwoutH (Member) commented Dec 9, 2025

@Ben-geo can you take the lead in reviewing this?

@Ben-geo Ben-geo (Collaborator) commented Dec 9, 2025

Yep, I’ve got it! - I’ll review all the new PRs this week.

I’ll start going through some tonight, and I’ll wrap up by the weekend.

@Ben-geo Ben-geo left a comment

LGTM!
ran it locally as well

@adamamer20 adamamer20 (Member, Author) commented:

@coderabbitai review

coderabbitai bot commented Dec 15, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (7)
examples/sugarscape_ig/backend_frames/model.py (1)

58-71: Redundant size checks in gini() function.

The function checks sugar.size == 0 at line 60-61, then immediately calls np.sort() and checks n == 0 again at line 64-65. Since n = sorted_vals.size equals sugar.size, the second check is unreachable.

     sugar = primary_set.df["sugar"].to_numpy().astype(np.float64)
-
-    if sugar.size == 0:
-        return float("nan")
-    sorted_vals = np.sort(sugar.astype(np.float64))
+    sorted_vals = np.sort(sugar)
     n = sorted_vals.size
     if n == 0:
         return float("nan")
examples/plotting.py (2)

73-77: Consider logging SVG export failures instead of silently passing.

Silently swallowing exceptions can make debugging difficult. Even for "nice-to-have" features, logging the failure helps identify environment issues.

+import logging
+
+_logger = logging.getLogger(__name__)
+
 def _finalize_and_save(fig: Figure, output_dir: Path, stem: str, theme: str) -> None:
     """Tight layout with space for suptitle, export PNG + (optional) SVG."""
     output_dir.mkdir(parents=True, exist_ok=True)
     fig.tight_layout(rect=[0, 0, 1, 0.94])
     png = output_dir / f"{stem}_{theme}.png"
     fig.savefig(png, dpi=300)
     try:
         fig.savefig(output_dir / f"{stem}_{theme}.svg", bbox_inches="tight")
-    except Exception:
-        pass  # SVG is a nice-to-have
+    except Exception as exc:
+        _logger.debug("SVG export failed for %s: %s", stem, exc)
     plt.close(fig)

302-306: Optional: Sort __all__ alphabetically.

Static analysis suggests sorting __all__ for consistency (RUF022). This is a minor style preference.

 __all__ = [
+    "plot_agent_metrics",
     "plot_model_metrics",
-    "plot_agent_metrics",
     "plot_performance",
 ]
examples/boltzmann_wealth/backend_mesa.py (2)

104-146: Variable seed shadows the function parameter.

Line 146 reassigns seed = model_pd["seed"].iloc[0], shadowing the seed parameter from line 104. While this works correctly (the extracted seed equals the passed seed or auto-generated one), the shadowing can be confusing. Consider renaming to extracted_seed or similar.

-    seed = model_pd["seed"].iloc[0]
+    extracted_seed = model_pd["seed"].iloc[0]
     model_pd = model_pd[["step", "gini"]]
 
     # Show a short tail in console for quick inspection
     ...
             subtitle=f"mesa backend; seed={seed}",
+            # or use extracted_seed if you want the value from the DataFrame

135-140: Replace non-breaking hyphens with regular hyphens in comments.

Static analysis detected ambiguous non-breaking hyphens (U+2011) in comments at lines 135 and 140. These could cause issues if copy-pasted.

-    # Run simulation (Mesa‑idiomatic): we only use DataCollector's public API
+    # Run simulation (Mesa-idiomatic): we only use DataCollector's public API
     result = simulate(agents=agents, steps=steps, seed=seed)
     typer.echo(f"Simulation completed in {perf_counter() - start_time:.3f} seconds")
     dc = result.datacollector
 
-    # ---- Extract metrics (no helper, no monkey‑patch):
+    # ---- Extract metrics (no helper, no monkey-patch):
benchmarks/cli.py (2)

27-41: Widen RunnerP return type to match actual backend simulate functions

The concrete runners (e.g. boltzmann_* .simulate, sugarscape_* .simulate) return result objects, but RunnerP is declared as returning None. Static type-checkers will flag Backend.runner=...simulate assignments against this protocol.

Since the CLI ignores the return value, you can safely widen the protocol to avoid spurious type errors:

class RunnerP(Protocol):
-    def __call__(self, agents: int, steps: int, seed: int | None = None) -> None: ...
+    def __call__(self, agents: int, steps: int, seed: int | None = None) -> object: ...

This keeps the runtime behavior identical while staying consistent with the existing backends.


81-101: Agent range parsing looks solid; optional whitespace handling improvement

The range parsing logic (validation, inclusive stop, and start = 0 support for init‑time benchmarks) looks correct and robust.

If you want to make the CLI a bit more forgiving, you could accept whitespace around colons (e.g. --agents "0: 500: 100"):

-        parts = value.split(":")
+        parts = [part.strip() for part in value.split(":")]
         if len(parts) != 3:
             raise typer.BadParameter("Ranges must use start:stop:step format")
         try:
             start, stop, step = (int(part) for part in parts)

Pure ergonomics; current behavior is already fine.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between aadee32 and 6e5bfbb.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • benchmarks/README.md (1 hunks)
  • benchmarks/cli.py (1 hunks)
  • examples/boltzmann_wealth/README.md (1 hunks)
  • examples/boltzmann_wealth/backend_frames.py (1 hunks)
  • examples/boltzmann_wealth/backend_mesa.py (1 hunks)
  • examples/plotting.py (1 hunks)
  • examples/sugarscape_ig/README.md (1 hunks)
  • examples/sugarscape_ig/backend_frames/model.py (1 hunks)
  • examples/sugarscape_ig/backend_mesa/agents.py (1 hunks)
  • examples/sugarscape_ig/backend_mesa/model.py (1 hunks)
  • pyproject.toml (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • benchmarks/README.md
🚧 Files skipped from review as they are similar to previous changes (4)
  • examples/sugarscape_ig/backend_mesa/agents.py
  • pyproject.toml
  • examples/sugarscape_ig/README.md
  • examples/boltzmann_wealth/backend_frames.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use Python 3.11 or later with 4-space indentation
Include type hints for all public APIs
Use NumPy-style docstrings (validated by Ruff/pydoclint)
Use CamelCase for class names
Use snake_case for function and attribute names
Avoid using TYPE_CHECKING guards for type annotations because the project uses beartype for runtime type checking, which requires the actual type objects to be available at runtime
Treat underscored attributes as internal/private and not part of the public API

Files:

  • examples/sugarscape_ig/backend_frames/model.py
  • examples/plotting.py
  • examples/sugarscape_ig/backend_mesa/model.py
  • examples/boltzmann_wealth/backend_mesa.py
  • benchmarks/cli.py
🧬 Code graph analysis (3)
examples/sugarscape_ig/backend_frames/model.py (3)
examples/utils.py (1)
  • FramesSimulationResult (7-14)
examples/plotting.py (1)
  • plot_model_metrics (83-182)
mesa_frames/abstract/datacollector.py (2)
  • collect (118-128)
  • flush (176-196)
examples/sugarscape_ig/backend_mesa/model.py (3)
examples/utils.py (1)
  • MesaSimulationResult (18-25)
examples/plotting.py (1)
  • plot_model_metrics (83-182)
examples/sugarscape_ig/backend_mesa/agents.py (2)
  • AntAgent (16-78)
  • move (75-78)
benchmarks/cli.py (5)
examples/plotting.py (1)
  • plot_performance (250-299)
examples/boltzmann_wealth/backend_mesa.py (5)
  • simulate (89-94)
  • step (47-54)
  • step (78-82)
  • run (84-86)
  • run (101-174)
examples/sugarscape_ig/backend_mesa/model.py (4)
  • simulate (189-202)
  • step (171-180)
  • run (182-186)
  • run (209-308)
examples/boltzmann_wealth/backend_frames.py (5)
  • simulate (115-124)
  • step (52-76)
  • step (106-108)
  • run (110-112)
  • run (131-184)
examples/sugarscape_ig/backend_frames/model.py (4)
  • simulate (389-409)
  • step (333-350)
  • run (352-365)
  • run (416-496)
🪛 Ruff (0.14.8)
examples/sugarscape_ig/backend_frames/model.py

227-229: Avoid specifying long messages outside the exception class

(TRY003)

examples/plotting.py

75-76: try-except-pass detected, consider logging the exception

(S110)


75-75: Do not catch blind exception: Exception

(BLE001)


115-115: Do not catch blind exception: Exception

(BLE001)


289-289: Comment contains ambiguous `–` (EN DASH). Did you mean `-` (HYPHEN-MINUS)?

(RUF003)


302-306: __all__ is not sorted

Apply an isort-style sorting to __all__

(RUF022)

examples/sugarscape_ig/backend_mesa/model.py

133-133: Unused lambda argument: m

(ARG005)

examples/boltzmann_wealth/backend_mesa.py

73-73: Unused lambda argument: m

(ARG005)


135-135: Comment contains ambiguous `‑` (NON-BREAKING HYPHEN). Did you mean `-` (HYPHEN-MINUS)?

(RUF003)


140-140: Comment contains ambiguous `‑` (NON-BREAKING HYPHEN). Did you mean `-` (HYPHEN-MINUS)?

(RUF003)

benchmarks/cli.py

86-86: Avoid specifying long messages outside the exception class

(TRY003)


90-90: Avoid specifying long messages outside the exception class

(TRY003)


92-92: Avoid specifying long messages outside the exception class

(TRY003)


95-95: Avoid specifying long messages outside the exception class

(TRY003)


97-97: Avoid specifying long messages outside the exception class

(TRY003)


105-105: Avoid specifying long messages outside the exception class

(TRY003)


107-107: Avoid specifying long messages outside the exception class

(TRY003)


127-127: Avoid specifying long messages outside the exception class

(TRY003)


130-130: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (9)
examples/sugarscape_ig/backend_frames/model.py (2)

463-465: Past review issue addressed: CSV confirmation message now correctly scoped.

The typer.echo(f"Saved CSV results under {results_dir}") is now inside the if save_results: block, fixing the misleading message issue flagged in the previous review.


174-279: Sugarscape model implementation looks correct.

The model properly wires the grid, agent sets, and data collector. The conditional storage backend selection (memory vs CSV based on results_dir) is a good pattern for supporting both benchmarking and CLI use cases.

examples/plotting.py (1)

82-183: Well-designed model metrics plotting with theme support.

The function handles edge cases (empty data, missing step column), automatically infers step count, and generates consistent light/dark themed outputs. The legend handling for single vs. multiple metrics is a nice touch.

examples/sugarscape_ig/backend_mesa/model.py (2)

130-134: Unused lambda argument m is intentional for DataCollector API.

The static analysis flags lambda m: seed as having an unused argument, but this is required by Mesa's DataCollector API which passes the model to all reporter lambdas. No change needed.


74-143: Sugarscape Mesa model implementation is correct.

The model properly initializes the sugar field, grid, and agents with reproducible RNG seeding. The DataCollector setup mirrors the Frames backend for CSV comparability.

examples/boltzmann_wealth/backend_mesa.py (2)

21-37: Gini coefficient implementation is robust.

The function handles edge cases well: empty arrays, all-zero values, and uniform distributions all return appropriate values (nan or 0.0). The formula implementation is correct.


40-55: MoneyAgent step logic is correct.

The agent correctly skips if wealth is 0, selects a random peer, and transfers one unit. The if other is None check at line 51-52 is defensive but random.choice on a non-empty list won't return None, so this is safe.
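A minimal sketch of that step logic, with attribute names taken from this review thread but otherwise assumed (not the PR's exact class):

```python
import mesa


class MoneyAgent(mesa.Agent):
    """Classic object-per-agent exchange: give one unit of wealth to a random peer."""

    def step(self) -> None:
        if self.wealth <= 0:
            return
        other = self.model.random.choice(self.model.agent_list)
        if other is None:  # defensive; random.choice on a non-empty list never returns None
            return
        other.wealth += 1
        self.wealth -= 1
```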

benchmarks/cli.py (2)

111-138: Model selection parsing is clear and robust

_parse_models nicely handles "all", comma-separated lists, validation against MODELS, and order-preserving de-duplication. I don’t see any changes needed here.


141-160: Consistent reuse of shared plotting utilities

The _plot_performance wrapper is minimal and does the right thing: early‑exits on empty data, selects the expected columns, and delegates to the shared examples.plotting.plot_performance with a clear stem and title. Looks good.

Comment on lines 163 to 272
```python
@app.command()
def run(
    models: Annotated[
        str | list[str],
        typer.Option(
            help="Models to benchmark: boltzmann, sugarscape, or all",
            callback=_parse_models,
        ),
    ] = "all",
    agents: Annotated[
        str | list[int],
        typer.Option(
            help="Agent count or range (start:stop:step)", callback=_parse_agents
        ),
    ] = "1000:5000:1000",
    steps: Annotated[
        int,
        typer.Option(
            min=0,
            help="Number of steps per run.",
        ),
    ] = 100,
    repeats: Annotated[int, typer.Option(help="Repeats per configuration.", min=1)] = 1,
    seed: Annotated[int, typer.Option(help="Optional RNG seed.")] = 42,
    save: Annotated[bool, typer.Option(help="Persist benchmark CSV results.")] = True,
    plot: Annotated[bool, typer.Option(help="Render performance plots.")] = True,
    results_dir: Annotated[
        Path | None,
        typer.Option(
            help=(
                "Base directory for benchmark outputs. A timestamped subdirectory "
                "(e.g. results/20250101_120000) is created with CSV files at the root "
                "and a 'plots/' subfolder for images. Defaults to the module's results directory."
            ),
        ),
    ] = None,
) -> None:
    """Run performance benchmarks for the selected models."""
    # Support both CLI (via callbacks) and direct function calls
    if isinstance(models, str):
        models = _parse_models(models)
    if isinstance(agents, str):
        agents = _parse_agents(agents)
    # Ensure module-relative default is computed at call time (avoids import-time side effects)
    if results_dir is None:
        results_dir = Path(__file__).resolve().parent / "results"

    runtime_typechecking = os.environ.get("MESA_FRAMES_RUNTIME_TYPECHECKING", "")
    if runtime_typechecking and runtime_typechecking.lower() not in {"0", "false"}:
        typer.secho(
            "Warning: MESA_FRAMES_RUNTIME_TYPECHECKING is enabled; benchmarks may run significantly slower.",
            fg=typer.colors.YELLOW,
        )
    rows: list[dict[str, object]] = []
    # Single timestamp per CLI invocation so all model results are co-located.
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    # Create unified output layout: <results_dir>/<timestamp>/{CSV files, plots/}
    base_results_dir = results_dir
    timestamp_dir = (base_results_dir / timestamp).resolve()
    plots_subdir: Path = timestamp_dir / "plots"
    for model in models:
        config = MODELS[model]
        typer.echo(f"Benchmarking {model} with agents {agents}")
        for agents_count in agents:
            for repeat_idx in range(repeats):
                run_seed = seed + repeat_idx
                for backend in config.backends:
                    start = perf_counter()
                    backend.runner(agents_count, steps, run_seed)
                    runtime = perf_counter() - start
                    rows.append(
                        {
                            "model": model,
                            "backend": backend.name,
                            "agents": agents_count,
                            "steps": steps,
                            "seed": run_seed,
                            "repeat_idx": repeat_idx,
                            "runtime_seconds": runtime,
                            "timestamp": timestamp,
                        }
                    )
                    # Report completion of this run to the CLI
                    typer.echo(
                        f"Completed {backend.name} for model={model} agents={agents_count} steps={steps} seed={run_seed} repeat={repeat_idx} in {runtime:.3f}s"
                    )
        # Finished all runs for this model
        typer.echo(f"Finished benchmarking model {model}")

    if not rows:
        typer.echo("No benchmark data collected.")
        return
    df = pl.DataFrame(rows)
    if save:
        timestamp_dir.mkdir(parents=True, exist_ok=True)
        for model in models:
            model_df = df.filter(pl.col("model") == model)
            csv_path = timestamp_dir / f"{model}_perf_{timestamp}.csv"
            model_df.write_csv(csv_path)
            typer.echo(f"Saved {model} results to {csv_path}")
    if plot:
        plots_subdir.mkdir(parents=True, exist_ok=True)
        for model in models:
            model_df = df.filter(pl.col("model") == model)
            _plot_performance(model_df, model, plots_subdir, timestamp)
            typer.echo(f"Saved {model} plots under {plots_subdir}")

    typer.echo(
        f"Unified benchmark outputs written under {timestamp_dir} (CSV files) and {plots_subdir} (plots)"
    )
```

⚠️ Potential issue | 🟡 Minor

Clarify final summary message based on save / plot flags

The core benchmarking loop and IO layout look good. One minor UX issue: the final message always claims that CSVs and plots were written under {timestamp_dir} / {plots_subdir}, even when save=False and/or plot=False.

To avoid misleading output, you could tailor the summary to what was actually produced:

-    typer.echo(
-        f"Unified benchmark outputs written under {timestamp_dir} (CSV files) and {plots_subdir} (plots)"
-    )
+    destinations: list[str] = []
+    if save:
+        destinations.append(f"CSVs under {timestamp_dir}")
+    if plot:
+        destinations.append(f"plots under {plots_subdir}")
+
+    if destinations:
+        typer.echo("Unified benchmark outputs written: " + "; ".join(destinations))
+    else:
+        typer.echo("Benchmark run completed (save=False, plot=False; no files written).")

This keeps the behavior unchanged while making the CLI output more accurate in all flag combinations.

🤖 Prompt for AI Agents
In benchmarks/cli.py around lines 163 to 272, the final summary always claims
CSVs and plots were written even when save=False and/or plot=False; change the
final message to reflect only what was actually produced by building a
conditional summary: if save True include the CSV timestamp_dir path, if plot
True include the plots_subdir path, otherwise note that CSVs or plots were not
generated; then print a single concise summary string that lists produced
artifacts (or states none were produced) instead of always mentioning both.

Comment on lines 1 to 96
# Boltzmann Wealth Exchange Model

## Overview

This example implements a simple wealth exchange ("Boltzmann money") model in two
backends:

- `backend_frames.py` (Mesa Frames / vectorised `AgentSet`)
- `backend_mesa.py` (classic Mesa / object-per-agent)

Both expose a Typer CLI with symmetric options so you can compare correctness
and performance directly.

## Concept

Each agent starts with 1 unit of wealth. At every step:

1. Frames backend: all agents with strictly positive wealth become potential donors.
Each donor gives 1 unit of wealth, and a recipient is drawn (with replacement)
for every donating agent. A single vectorised update applies donor losses and
recipient gains.
2. Mesa backend: agents are shuffled and iterate sequentially; each agent with
positive wealth transfers 1 unit to a randomly selected peer.

The stochastic exchange process leads to an emergent, increasingly unequal
wealth distribution and rising Gini coefficient, typically approaching a stable
level below 1 (due to conservation and continued mixing).

## Reported Metrics

The model records per-step population Gini (`gini`). You can extend reporters by
adding lambdas to `model_reporters` in either backend's constructor.

Notes on interpretation:

- Early steps: Gini ~ 0 (uniform initial wealth).
- Mid phase: Increasing Gini as random exchanges concentrate wealth.
- Late phase: Fluctuating plateau (a stochastic steady state) — exact level
varies with agent count and RNG seed.

## Running

Always run examples from the project root using `uv`:

```bash
uv run examples/boltzmann_wealth/backend_frames.py --agents 5000 --steps 200 --seed 123 --plot --save-results
uv run examples/boltzmann_wealth/backend_mesa.py --agents 5000 --steps 200 --seed 123 --plot --save-results
```

## CLI options

- `--agents` Number of agents (default 5000)
- `--steps` Simulation steps (default 100)
- `--seed` Optional RNG seed for reproducibility
- `--plot / --no-plot` Generate line plot(s) of Gini
- `--save-results / --no-save-results` Persist CSV metrics
- `--results-dir` Override the auto-timestamped directory under `results/`

Frames backend additionally warns if runtime type checking is enabled because it
slows vectorised operations: set `MESA_FRAMES_RUNTIME_TYPECHECKING=0` for fair
performance comparisons.

## Outputs

Each run creates (or uses) a results directory like:

```text
examples/boltzmann_wealth/results/20251016_173702/
    model.csv                   # step,gini
    gini_<timestamp>_dark.png   (and possibly other theme variants)
```

Tail metrics are printed to console for quick inspection:

```text
Metrics in the final 5 steps: shape: (5, 2)
┌──────┬───────┐
│ step ┆ gini │
│ --- ┆ --- │
│ i64 ┆ f64 │
├──────┼───────┤
│ ... ┆ ... │
└──────┴───────┘
```

## Performance & Benchmarking

Use the shared benchmarking CLI to compare scaling, checkout `benchmarks/README.md`.

## Programmatic Use

```python
from examples.boltzmann_wealth import backend_frames as bw_frames
result = bw_frames.simulate(agents=10000, steps=250, seed=42)
metrics = result.datacollector.data["model"] # Polars DataFrame
```

⚠️ Potential issue | 🟡 Minor

Documentation is well-structured and comprehensive.

The README provides clear explanations of the model concept, CLI usage, output structure, and programmatic API. One minor grammar fix:

Line 88: "checkout" should be "check out" (verb phrase).

-Use the shared benchmarking CLI to compare scaling, checkout `benchmarks/README.md`.
+Use the shared benchmarking CLI to compare scaling, check out `benchmarks/README.md`.
🤖 Prompt for AI Agents
In examples/boltzmann_wealth/README.md around lines 1–96 (specifically line 88),
the phrase "checkout `benchmarks/README.md`" is incorrect; change it to the verb
phrase "check out `benchmarks/README.md`". Update the sentence in the
Performance & Benchmarking section to use "check out" and run a quick
spell/grammar check across the README to ensure no other similar
verb/compound-word mistakes remain.

Comment on lines 256 to 258
# Keep the full model metrics (step + any model reporters)
# Keep the full model metrics (step + any model reporters)


⚠️ Potential issue | 🟡 Minor

Remove duplicate comment line.

Line 257 is a duplicate of line 256.

     # Extract metrics using DataCollector API
     model_pd = (
         dc.get_model_vars_dataframe().reset_index().rename(columns={"index": "step"})
     )
     # Keep the full model metrics (step + any model reporters)
-    # Keep the full model metrics (step + any model reporters)
🤖 Prompt for AI Agents
In examples/sugarscape_ig/backend_mesa/model.py around lines 256 to 258, there
is a duplicated comment line ("Keep the full model metrics (step + any model
reporters)") on line 257; remove the duplicate line 257 so only a single
instance of that comment remains.

Comment on lines 306 to 308
typer.echo(f"Saved plots under {plots_dir}")

typer.echo(f"Saved CSV results under {results_dir}")

⚠️ Potential issue | 🟡 Minor

Misleading "Saved CSV results" message when save_results=False.

Line 308 unconditionally prints "Saved CSV results under {results_dir}" regardless of whether save_results is True. This should be moved inside the if save_results: block (lines 268-270) to match the backend_frames fix.

     if save_results:
         csv_path = results_dir / "model.csv"
         model_pd.to_csv(csv_path, index=False)
+        typer.echo(f"Saved CSV results under {results_dir}")
 
     # Plot per-metric similar to the backend_frames example...
     ...
         typer.echo(f"Saved plots under {plots_dir}")
-
-    typer.echo(f"Saved CSV results under {results_dir}")

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In examples/sugarscape_ig/backend_mesa/model.py around lines 306-308, the
message "Saved CSV results under {results_dir}" is printed unconditionally even
when save_results is False; move this typer.echo call inside the existing if
save_results: block (the block around lines 268-270 that handles saving CSV
results) so the CSV-saved log is only emitted when results are actually written,
mirroring the backend_frames fix.

@adamamer20 adamamer20 merged commit 29fa011 into main Dec 15, 2025
14 checks passed
@adamamer20 adamamer20 deleted the split/examples-benchmarks branch December 15, 2025 20:25
