Simulation mini PR#7

Open
OmerKfir19695 wants to merge 21 commits into main from simulation_mini

@OmerKfir19695 OmerKfir19695 commented Mar 18, 2026

This PR is (again) a big one, so I'd like you to focus on the main things:

1.) Check that the whole project and its tests run on your computer.
Run the following:
conda activate bees_env
python BEES.py -i projects/Glycolysis/input.yml

a. Run the input as it is.
b. Move the output somewhere else, then make two more runs with changed input file settings: in one run, change the MoveToCore number (greater than 1); in the other run, change the concentrations and the simulation time. Before each run, move the previous output outside of the repo. (DON'T FORGET: save the input file before running again, and remove the older output folder :) )
c. Each run should produce an output folder with the following files:

General

output.log: main execution log
output_errors.log: error log (only meaningful if errors happened)
reactions_summary.txt: summary of generated reactions
input.yml: validated input snapshot

Iterative mode only (when end_time is set and toleranceMoveToCore > 0)

Also inside projects//output/:
ode_equations_iterN.txt: written per iteration (N = iteration number)
simulation_profiles.csv: concentration time-series if save_simulation_profiles: true
flux_analysis.csv: per-iteration core/edge counts history
flux_analysis_reactions.csv: detailed reaction membership / iteration history (this is in code)
simulation_plot_iterN.png: per-iteration concentration plot if save_simulation_plots: true (default true in schema)
reaction_tree_iterN.png: per-iteration reaction tree plot if save_reaction_tree_plots: true

d. After this one, please pick 2 more examples from the project and repeat a-c.

e. Test check:
run make test

2.) Give me feedback on the following files (I will appreciate any comment you like; even a small review would be great):
bees/
enlarger.py
model_generator.py
core_edge_model.py
flux_calculator.py
simulator.py

I much appreciate the help so far and look forward to the review.
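To make step c quicker to check, here is a minimal sketch of a helper that reports which of the expected files are missing from an output folder. The file names come from the list in step c; the function name `missing_outputs` and the split into constants are illustrative assumptions and are not part of BEES. Files that depend on optional settings (for example `simulation_profiles.csv` and the plots) are deliberately left out.

```python
from pathlib import Path

# File names taken from the list in step c of this description.
GENERAL_FILES = ["output.log", "output_errors.log", "reactions_summary.txt", "input.yml"]
ITERATIVE_FILES = ["flux_analysis.csv", "flux_analysis_reactions.csv"]
# Written once per iteration, so only a glob pattern can be checked.
PER_ITERATION_PATTERNS = ["ode_equations_iter*.txt"]


def missing_outputs(output_dir, iterative=True):
    """Return the expected file names (or patterns) absent from output_dir."""
    out = Path(output_dir)
    expected = list(GENERAL_FILES)
    if iterative:
        expected += ITERATIVE_FILES
    missing = [name for name in expected if not (out / name).is_file()]
    if iterative:
        for pattern in PER_ITERATION_PATTERNS:
            # At least one per-iteration file should match the pattern.
            if not any(out.glob(pattern)):
                missing.append(pattern)
    return missing
```

Calling `missing_outputs("path/to/output")` after a run should return an empty list when everything was written; pass `iterative=False` for batch-mode runs.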

Implements a core/edge model framework for dynamic network expansion. This approach, inspired by RMG, integrates ODE simulation, flux analysis, and targeted reaction generation.

Reduces kinetic estimation cost by only discovering reactions for species exceeding a flux promotion threshold.

Introduces:
- `CoreEdgeModel` to manage core and edge species/reactions.
- `IterativeEnlarger` orchestrating the iterative loop, including species promotion and pruning.
- `FluxCalculator` for characteristic rate determination and significance assessment.
- `ODESimulator` to perform full model (core + edge) integrations.
Note:
The edge species promotion logic now uses the peak absolute flux over the entire simulation duration instead of the flux at the final time point.
This ensures that species with significant but transient flux are also considered for promotion, aligning with the RMG approach of considering "flux at some point."
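The difference between the two criteria in the note above can be illustrated with a small sketch. The function names, the species labels, and the single fixed threshold are assumptions made for the example; the real `FluxCalculator` may organise this differently.

```python
def peak_abs_flux(flux_trajectory):
    """Largest absolute flux seen at any time point of one trajectory."""
    return max(abs(f) for f in flux_trajectory)


def species_to_promote(edge_fluxes, threshold, use_peak=True):
    """Return edge species whose flux metric exceeds the threshold.

    edge_fluxes maps a species label to its flux at each saved time point.
    use_peak=True uses the peak over the whole run; False uses only the
    final time point (the older behaviour described in the note).
    """
    promoted = []
    for label, trajectory in edge_fluxes.items():
        metric = peak_abs_flux(trajectory) if use_peak else abs(trajectory[-1])
        if metric > threshold:
            promoted.append(label)
    return sorted(promoted)


# Hypothetical trajectories: one transient species, one that keeps growing.
edge_fluxes = {
    "transient": [0.0, 0.8, 0.3, 0.01],
    "steady": [0.0, 0.2, 0.4, 0.6],
    "negligible": [0.0, 0.01, 0.02, 0.01],
}
print(species_to_promote(edge_fluxes, threshold=0.5, use_peak=True))
print(species_to_promote(edge_fluxes, threshold=0.5, use_peak=False))
```

With the final-time criterion the `transient` species is missed, because its flux has already decayed by the end of the run; the peak criterion catches it.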


feat output:
1. simulation profiles
2. detailed flux analysis
3. reaction tree plot per iteration
4. simulation plot per iteration
@OmerKfir19695 OmerKfir19695 force-pushed the simulation_mini branch 3 times, most recently from 0feaf5a to 0a2e8a6 Compare April 28, 2026 12:09
Introduces trajectory-based characteristic rate (R_char) and edge flux metrics, including peak values over the simulation and R_char at final time.

- Add trajectory-aware characteristic-rate (R_char) and edge flux metrics to drive enlargement decisions over simulation time
- Implement interrupt-based enlargement triggers when edge significance crosses dynamic thresholds, following RMG-style flux logic
- Update promotion/pruning rules to use peak/relative flux information, with safeguards against premature removal of newly discovered species
- Refactor enlarger, simulator, and flux-calculation integration to improve convergence clarity, maintainability, and runtime efficiency
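As a rough illustration of the interrupt trigger described above, the sketch below scans a saved trajectory for the first time point where an edge flux exceeds `tolerance * R_char`. It assumes the RMG-style definition of the characteristic rate (the square root of the sum of squared core species rates); the function names and the data layout are hypothetical, and the actual simulator reacts during integration rather than scanning afterwards.

```python
import math


def characteristic_rate(core_rates):
    """R_char = sqrt(sum of squared net rates of the core species)."""
    return math.sqrt(sum(r * r for r in core_rates))


def first_interrupt(times, core_rate_traj, edge_flux_traj, tolerance):
    """Return (time, flagged_labels) for the first threshold crossing, or None.

    core_rate_traj[i] is a list of core species rates at times[i];
    edge_flux_traj[i] maps edge species labels to their flux at times[i].
    """
    for i, t in enumerate(times):
        # The threshold is dynamic: it follows R_char along the trajectory.
        threshold = tolerance * characteristic_rate(core_rate_traj[i])
        flagged = sorted(
            label for label, flux in edge_flux_traj[i].items() if abs(flux) > threshold
        )
        if flagged:
            return t, flagged
    return None


# Hypothetical data: the core slows down, so a constant-ish edge flux
# becomes significant late in the run.
times = [0.0, 1.0, 2.0]
core_rate_traj = [[3.0, 4.0], [3.0, 4.0], [0.3, 0.4]]
edge_flux_traj = [{"X": 0.1}, {"X": 0.3}, {"X": 0.2}]
print(first_interrupt(times, core_rate_traj, edge_flux_traj, tolerance=0.1))
```

The example shows why a trajectory-aware threshold matters: species `X` is below the threshold while the core is fast and crosses it only once R_char drops.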

For the CoreEdgeModel:


- Refactored docstrings in the SpeciesData and CoreEdgeModel classes for better readability and understanding of attributes and methods.
- Changed import statement from model_generator to reaction_generator for consistency.
- Introduced a new private attribute to track known reaction signatures, enhancing duplicate reaction handling.
- Added a new method, prune_edge, to remove species from the edge with negligible flux, improving edge management.
- Updated concentration reset logic to ensure all species return to their initial values after simulation interruptions.
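One way to picture the known-reaction-signature tracking mentioned above is the sketch below. The class name `SignatureRegistry`, the signature layout, and the choice to treat forward and reverse directions as different reactions are all assumptions for illustration; the label normalisation (`lower().strip()`) mirrors what the simulator code in this PR does for species labels.

```python
def reaction_signature(reactants, products):
    """Order-independent signature of a reaction built from species labels."""

    def normalise(labels):
        # Sorting keeps repeated labels, so stoichiometry like 2 A -> B
        # stays distinct from A -> B.
        return tuple(sorted(label.lower().strip() for label in labels))

    return (normalise(reactants), normalise(products))


class SignatureRegistry:
    """Remembers which reaction signatures have already been added."""

    def __init__(self):
        self._known_signatures = set()

    def add(self, reactants, products):
        """Register a reaction; return False if it is a duplicate."""
        signature = reaction_signature(reactants, products)
        if signature in self._known_signatures:
            return False
        self._known_signatures.add(signature)
        return True


registry = SignatureRegistry()
print(registry.add(["Glucose", "ATP"], ["G6P", "ADP"]))   # new reaction
print(registry.add(["atp ", "glucose"], ["ADP", "g6p"]))  # same reaction, reordered
```

A set lookup keeps the duplicate check cheap even when the edge grows large, which is the main reason to store signatures instead of comparing reaction objects pairwise.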
Improves reaction construction by integrating comprehensive species canonicalization,
heavy atom balance checks, and refined cofactor availability.

Some major changes:

- Introduces a species registry that unifies labels based on SMILES and
  ontology equivalents, ensuring consistent naming in generated reactions.
- Implements heavy atom balance validation to filter out chemically unsound
  reactions derived from database stoichiometry or ontology merges.
- Refines cofactor availability, distinguishing between "always available"
  species (e.g., H2O, H+) and high-energy carriers (e.g., ATP, NAD(P)H)
  that must be explicitly provided or generated.
- Enhances EC template-based reaction generation and cofactor product inference.
- Centralizes kinetics estimator initialization and improves setup guidance.
- Strengthens reaction deduplication by using canonicalized labels.

- Renames model_generator to reaction_generator.
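The idea behind the heavy atom balance check can be sketched in a few lines. Note the simplification: this example counts atoms from molecular formula strings using only the standard library, whereas the PR works from SMILES with RDKit. The function names are hypothetical, and charges in formulas are not handled.

```python
import re
from collections import Counter

# One element symbol (e.g. C, Cl) followed by an optional count.
_ELEMENT_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def heavy_atoms(formula):
    """Count non-hydrogen atoms in a simple formula such as 'C6H12O6'."""
    counts = Counter()
    for element, number in _ELEMENT_TOKEN.findall(formula):
        if element != "H":
            counts[element] += int(number) if number else 1
    return counts


def is_heavy_atom_balanced(reactant_formulas, product_formulas):
    """True when both sides contain the same heavy atoms in the same amounts."""
    left = sum((heavy_atoms(f) for f in reactant_formulas), Counter())
    right = sum((heavy_atoms(f) for f in product_formulas), Counter())
    return left == right


# Hexokinase step: glucose + ATP -> glucose-6-phosphate + ADP
print(is_heavy_atom_balanced(["C6H12O6", "C10H16N5O13P3"], ["C6H13O9P", "C10H15N5O10P2"]))
# A chemically unsound candidate: glucose -> a single pyruvate
print(is_heavy_atom_balanced(["C6H12O6"], ["C3H4O3"]))
```

Hydrogen is ignored on purpose, since protonation states differ between databases and would otherwise reject many valid reactions.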

For the kinetics estimator there was a lot to change:

- Introduced in-process memoization to cache results of expensive CatPred calls, improving performance for repeated estimations.
- Enhanced symlink creation logic to handle environments that restrict symlink creation, falling back to file copying when necessary.
- Added optional environment variables for specifying the conda binary and Python executable, allowing for more flexible execution configurations.
- Updated documentation within the class to clarify parameters and methods, improving code readability and usability.
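Two of the points above, in-process memoization and the symlink fallback, can be sketched as follows. This is not the code from `kinetics_estimator.py`: `_expensive_predict` is a hypothetical stand-in for the real CatPred call, and `estimate_kcat` and `link_or_copy` are illustrative names.

```python
import functools
import os
import shutil

CALLS = {"n": 0}  # counts how often the expensive stand-in actually runs


def _expensive_predict(sequence, smiles):
    """Hypothetical stand-in for a slow CatPred subprocess call."""
    CALLS["n"] += 1
    return float(len(sequence) + len(smiles))


@functools.lru_cache(maxsize=None)
def estimate_kcat(sequence, smiles):
    """Memoized wrapper: identical (sequence, smiles) pairs are computed once."""
    return _expensive_predict(sequence, smiles)


def link_or_copy(src, dst):
    """Try to symlink src to dst; fall back to copying the file.

    Some environments (for example restricted Windows accounts) refuse
    symlink creation, which surfaces as OSError or NotImplementedError.
    """
    try:
        os.symlink(src, dst)
        return "symlink"
    except (OSError, NotImplementedError):
        shutil.copy2(src, dst)
        return "copy"
```

`lru_cache` needs hashable arguments, so the sketch passes plain strings. The cache lives only for the current process, which matches the "in-process" wording above.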
Expands/adjusts ontology and DB utilities needed by reaction generation and availability checks.

The most important update is that matching no longer relies on strings alone: I added more than 16K SMILES based on the PubChem general database.
Comment thread bees/simulator.py
"""Full-span solve_ivp without flux-interrupt events."""
species_labels = self.model.get_all_species_labels()
n_species = len(species_labels)
label_to_idx = {lab.lower().strip(): i for i, lab in enumerate(species_labels)}
Comment thread bees/simulator.py
np.array(self.model.get_all_concentration_vector(), dtype=float)
)
reactions = self.model.core_reactions + self.model.edge_reactions
edge_labels_lc = {sp.label.lower().strip() for sp in self.model.edge_species}
Integrates the iterative enlargement pipeline into the main BEES execution, enabling rate-based reaction network expansion and simulation. The system now dynamically selects between a new 'iterative' mode and a 'batch' reaction network generation mode.

Updates the settings schema to expose granular controls for iterative refinement parameters, simulation profile saving, and various plotting functionalities. This includes adding `rdkit` based utilities for robust SMILES handling.

fix: Removes deprecated schema fields, placeholder comments, and redundant validation logic, streamlining the input configuration and improving code clarity.
Adds simulation interruption and pruning options

Introduces new configuration parameters to provide more granular control over simulation behavior.

Allows defining a tolerance-based condition to interrupt the simulation, offering more flexible termination criteria. Additionally, provides configurable thresholds for minimum edge iterations and core species to fine-tune the pruning logic, ensuring it occurs under desired conditions.

refactor: update input file paths and enhance environment checks

- Changed the input file path in the BEES.py script from `~/BEES/examples/minimal/input.yml` to `~/BEES/projects/minimal/input.yml`.
- Moved the environment variable loading to occur before importing BEES modules to ensure proper resolution of class-level environment variables.
- Added a warning function to notify users if they are not running within the 'bees_env' environment, improving user experience.
- Refactored the main execution function to include the new warning check and simplified output messages for clarity.
- Updated imports in the `__init__.py` file to reflect changes in module structure, replacing `model_generator` with `reaction_generator`.
- Cleaned up the `common.py` file by removing unused cofactor substitution definitions and enhancing the ontology equivalent retrieval logic.
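The environment warning mentioned above could look roughly like the sketch below. It assumes the check reads `CONDA_DEFAULT_ENV`, the variable conda sets to the name of the active environment; the function name and the message text are illustrative and may differ from what BEES.py does.

```python
import os
import warnings

EXPECTED_ENV = "bees_env"


def warn_if_wrong_env(environ=None):
    """Warn (without aborting) when the active conda env is not bees_env.

    Returns True when the expected environment is active.
    """
    environ = os.environ if environ is None else environ
    active = environ.get("CONDA_DEFAULT_ENV")
    if active != EXPECTED_ENV:
        warnings.warn(
            f"Expected conda environment '{EXPECTED_ENV}' but found "
            f"'{active}'. Run 'conda activate {EXPECTED_ENV}' first."
        )
        return False
    return True
```

Passing the environment as an argument keeps the function easy to test; calling it with no argument checks the real process environment.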
…eadme, and project outputs

- Introduces a comprehensive Glycolysis example, which is the base model that I worked on and improved. All the other input files generate and run, but they are less explored than this one.

- Updates project-level input templates and example output artifacts to match the current iterative enlargement behaviour and generated files. This also includes the tolerances, the SBML file exporter, the graphs, and more features.

- README: rewrites the main README to better explain installation, execution, CatPred setup, and troubleshooting

- Refreshes the project documentation structure (project examples and descriptions) for easier onboarding

- .gitignore: cleans up tracked/ignored generated artifacts so repository docs and examples stay reproducible and consistent
Introduces a new project configuration for E. coli Type II Fatty Acid Synthase (FAS II), including detailed specifications for enzymes and species based on experimental data.

Renames the existing generic `FattyAcidSynthesisDemo` project to `FattyAcidSynthesis_genral_mammalian` for improved clarity and differentiation from specific models.
- Introduces a comprehensive E. coli (K12) specific reaction database (ecoli.csv) containing over 2,300 curated reactions.
* Proteome data were taken from UniProt (https://www.uniprot.org/proteomes/UP000000625) and converted from TSV to CSV format.
* The metabolome dataset was taken from the ECMDB (https://ecmdb.ca/).
Some of the missing SMILES were filled in using SMILES from PubChem or ChEBI.

The old database will now be defined as the general database.
important for the exporter.py
Implements functionality to export various results from the iterative enlarger.

- Exports detailed flux analysis, including core/edge growth summaries and reaction history.
- Generates CSVs for core and edge reactions and their associated species.
- Provides time-series concentration data from all simulation iterations in CSV format.
- Creates visual reaction tree plots, highlighting species promoted to the core in each iteration, with Graphviz layout support.
- Generates simulation concentration plots, allowing for exclusion of enzymes and cofactors, to visualize concentration dynamics over time.

It's important to say that this module is not perfect, but it is good enough for a POC and for visualization of the basic results.
Introduces fully configured E. coli projects for Glycolysis, leveraging the new curated ecoli database.
Introduces comprehensive unit tests for key components of the iterative simulation and network enlargement process.

New test suites validate the `CoreEdgeModel`, `IterativeEnlarger`, `ODESimulator`, `flux_calculator` functions, `kinetics_estimator`, and various `model_generator` and `reaction_utils` functions. This ensures the correctness and robustness of the new simulation and reaction network expansion logic.

Also refactors the main.py and schema.py tests to align with the streamlined input parsing and configuration for this new workflow and its features.

Introduces a `toleranceMoveEdgeReactionToCore` setting and simulation logic. This enables dynamic core expansion based on a dlnaccum criterion, aligning with established methods for efficient mechanism generation.

- Add new unit test suites covering CoreEdgeModel, IterativeEnlarger, ODESimulator, flux_calculator, kinetics_estimator, and key model_generator / reaction_utils utilities
- Refactor main.py and schema.py tests to match streamlined input parsing and the new iterative configuration surface
- Validate enlargement fallbacks and settings, including toleranceMoveEdgeReactionToCore (dlnaccum-based edge→core promotion) and new ODE solver controls / iteration limits
- Remove deprecated configuration paths and tests (e.g. filter_reactions) and simplify schema expectations (Species/Enzyme fields, multi-EC support)
- Add regression coverage for ontology alias handling via get_ontology_equivalents (including label permutations)
- Improve robustness tests for interrupt-driven promotion, model resets, pruning/significance handling, and exporter decoupling (EnlargerExporter)
enlarger.py: when an ODE interrupt fires but no edge species can be promoted
(e.g. all dlnaccum-flagged reaction participants are already core), resume
the simulation from the interrupt point rather than terminating. This prevents
high-dlnaccum reactions involving only core species (such as the PlsB acyl
transfer) from prematurely ending an enlargement iteration before genuine
edge-species flux has had time to accumulate.
Also adds projects/fattyAcidSynthesis/fattyAcidSynthesis_ecoli_branched/input.yml:
FAS II + PlsB/PlsC (phospholipid branch), AccAB/AccBC (Acetyl-CoA carboxylase
feedback), FadB/FadA (beta-oxidation competition), and hexadecanoyl-CoA as an
initial species to seed the correct PlsB forward reaction from iteration 1.

The main purpose is to see how the core/edge approach works for non-linear pathways.
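The resume-on-interrupt behaviour in enlarger.py described above can be sketched as a small control loop. Everything here is a simplified assumption: `simulate_segment` stands in for one interruptible ODE solve and returns the stop time plus the species flagged at the interrupt, `is_edge` tells whether a species is still on the edge, and the scripted simulator and species labels exist only to make the example runnable.

```python
def run_until_promotion(simulate_segment, is_edge, t_end, max_resumes=100):
    """Integrate until an interrupt yields promotable edge species or t_end.

    Returns (stop_time, promotable_species). An interrupt whose flagged
    species are all core already does not end the iteration; the solve
    is resumed from the interrupt point instead.
    """
    t = 0.0
    for _ in range(max_resumes + 1):
        t_stop, flagged = simulate_segment(t, t_end)
        if t_stop >= t_end:
            return t_stop, []  # reached the end without a useful interrupt
        promotable = [label for label in flagged if is_edge(label)]
        if promotable:
            return t_stop, promotable
        t = t_stop  # nothing to promote: resume from the interrupt point
    raise RuntimeError("too many interrupts without a promotable species")


def make_scripted_simulator(script):
    """Hypothetical stand-in: replays a fixed list of (t_stop, flagged) events."""
    events = iter(script)

    def simulate_segment(t_start, t_end):
        # Once the script is exhausted, pretend the solve ran to t_end.
        return next(events, (t_end, []))

    return simulate_segment


simulate = make_scripted_simulator([
    (2.0, ["core_a", "core_b"]),   # interrupt involving only core species
    (5.0, ["core_a", "edge_x"]),   # later interrupt with a real edge species
])
print(run_until_promotion(simulate, lambda s: s.startswith("edge_"), t_end=10.0))
```

A real solver also has to avoid re-triggering immediately on the same core-only reaction after resuming; the `max_resumes` guard in the sketch only prevents an endless loop and does not solve that problem.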
@OmerKfir19695 OmerKfir19695 force-pushed the simulation_mini branch 3 times, most recently from d9c0f6b to e8e4f42 Compare May 10, 2026 07:35