Feat/stop on modes conflict #267
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 6c16eeb.
    f"run_{required}() requires '{required}' mode. "
    f"Initialize RapidFire with `{init_cmd}`, then restart services "
    f"(`rapidfireai stop && rapidfireai start`)."
)
Missing mode file blocks fit
Medium Severity
When rf_mode.txt is absent, get_installed_mode() returns None and run_fit() is stopped, but setup/start.sh still starts the fit dispatcher by default. Fit notebooks can be blocked even though services are already in fit mode.
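One way to address this could be to fall back to the default mode when the file is missing. The following is a minimal sketch, not the code in this PR: `resolve_mode`, `DEFAULT_MODE`, and `VALID_MODES` are hypothetical names, and it assumes rf_mode.txt holds the mode name ("fit" or "evals") as plain text and that "fit" is the right default because setup/start.sh starts the fit dispatcher by default.

```python
from pathlib import Path

# Assumption: "fit" mirrors the dispatcher that setup/start.sh starts by default.
DEFAULT_MODE = "fit"
VALID_MODES = {"fit", "evals"}


def resolve_mode(mode_file: Path, default: str = DEFAULT_MODE) -> str:
    """Return the mode recorded in mode_file, or `default` if the file is absent.

    Hypothetical helper for illustration; the file format is an assumption.
    """
    try:
        text = mode_file.read_text(encoding="utf-8").strip().lower()
    except FileNotFoundError:
        # No mode file: treat the install as being in the default mode
        # instead of reporting "unknown" and blocking run_fit().
        return default
    if text not in VALID_MODES:
        raise ValueError(f"Unrecognized mode {text!r} in {mode_file}")
    return text
```

With this shape, a missing file no longer blocks fit notebooks, while a file that names the other mode still produces a mismatch for the caller to report.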
print(f"Using {cfg['num_actors']} actors, {cfg['gpus_per_actor']} GPUs per actor, {cfg['cpus_per_actor']} CPUs per actor")
print(
    f"Using {cfg['num_actors']} actors, {cfg['gpus_per_actor']} GPUs per actor, {cfg['cpus_per_actor']} CPUs per actor"
)
Mode check runs too late
Medium Severity
Installed-mode validation runs only inside run_fit() and run_evals(), after Experiment.__init__ has already run _init_fit_mode() or _init_evals_mode() (including ray.init). A mode mismatch is detected only after heavy setup, not when the conflict is already knowable.
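The ordering concern can be illustrated with a small stand-in class. This is a sketch under stated assumptions, not the real Experiment: `ExperimentSketch`, `ModeMismatchError`, and the `installed_mode` argument are hypothetical, and `_heavy_setup()` only stands in for `_init_fit_mode()` / `_init_evals_mode()` (including ray.init). How a missing mode file should be handled is deliberately left out.

```python
class ModeMismatchError(RuntimeError):
    """Raised when the requested mode differs from the installed mode."""


class ExperimentSketch:
    # Counts how many times the expensive setup ran (for demonstration only).
    heavy_setups = 0

    def __init__(self, name: str, mode: str, installed_mode: str) -> None:
        # Validate first: the conflict is knowable from the two mode strings
        # alone, so nothing expensive needs to start before this check.
        if installed_mode != mode:
            raise ModeMismatchError(
                f"Experiment {name!r} requested '{mode}' mode, "
                f"but the installed mode is '{installed_mode}'."
            )
        self.name = name
        self.mode = mode
        self._heavy_setup()

    def _heavy_setup(self) -> None:
        # Stand-in for the real per-mode initialization and ray.init.
        type(self).heavy_setups += 1
```

With the check at the top of `__init__`, a mismatch raises before `_heavy_setup()` is ever called, so no cluster resources are claimed for an experiment that cannot run.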


Note
Medium Risk
Changes entry points for training and evals to fail fast on mode mismatch; low security impact but users with cross-mode notebooks will see new hard errors until they re-init.
Overview
Adds installed-mode validation so run_fit() and run_evals() stop before starting work when RapidFire was initialized for the other mode ($RF_HOME/rf_mode.txt). Mismatches previously left experiments running while the dispatcher/IC Ops pointed at the wrong DB and the control panel stayed disabled; failures now surface via display_pretty_error() with re-init/restart instructions.

Introduces rapidfireai/utils/mode_utils.py (get_installed_mode, assert_mode_matches) and wires doctor diagnostics to the same reader instead of duplicating file logic. pytest coverage in tests/test_mode_utils.py plus a small standalone test_mode_guard.py script.

Remaining diff in experiment.py/doctor.py is mostly import ordering and formatting, not behavior.

Reviewed by Cursor Bugbot for commit 1b6fccc.
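For readers without the diff open, the two helpers named in the overview might look roughly like the sketch below. The function names and the error text come from this PR; everything else is an assumption: the signatures, the `rf_home` override, the plain-text file format, and `ModeMismatchError`. The sketch raises an exception where the PR reports through display_pretty_error(), and `init_cmd` is passed in by the caller because the exact init command is not shown here.

```python
import os
from pathlib import Path
from typing import Optional

VALID_MODES = ("fit", "evals")


class ModeMismatchError(RuntimeError):
    """Hypothetical error type; the PR surfaces failures differently."""


def get_installed_mode(rf_home: Optional[str] = None) -> Optional[str]:
    """Read the mode from <rf_home>/rf_mode.txt, or return None if unknown."""
    home = rf_home or os.environ.get("RF_HOME")
    if not home:
        return None
    mode_file = Path(home) / "rf_mode.txt"
    if not mode_file.is_file():
        return None
    mode = mode_file.read_text(encoding="utf-8").strip().lower()
    return mode if mode in VALID_MODES else None


def assert_mode_matches(
    required: str, init_cmd: str, rf_home: Optional[str] = None
) -> None:
    """Raise if the installed mode is not `required`.

    Note: a missing or unreadable mode file (None) also counts as a mismatch
    here, which is the behavior the first review comment questions.
    """
    if get_installed_mode(rf_home) != required:
        raise ModeMismatchError(
            f"run_{required}() requires '{required}' mode. "
            f"Initialize RapidFire with `{init_cmd}`, then restart services "
            f"(`rapidfireai stop && rapidfireai start`)."
        )
```

Keeping the file read in one function is what lets the doctor diagnostics and the run guards share a single reader, as the overview describes.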