Fix extra duplicate reactions from same template being added to mechanism#2898
Fix extra duplicate reactions from same template being added to mechanism#2898
Conversation
…actions in model Co-authored-by: rwest <93807+rwest@users.noreply.github.com>
|
This almost but doesn't quite fix the problem... This doesn't consider the case where the reaction is being considered from the reverse direction. When I trace this through the debugger, the possible reaction templates in shortlist are but the new reaction being considering is from the reverse direction, with potential templates I think this line is supposed to be handling forward versus reverse-estimated reactions, Line 496 in cf9e90e but maybe it's getting tripped up by the fact that intra_H_migration is its own reverse?
I haven't convinced myself this is just a problem for the own reverse families, but I'll have to do some searching to come up with an example. Maybe the solution is within |
|
I wrote this test case (tied to the current intra_H_migration tree in the database, but serves today's purpose): @copilot can you add this test case to the TestCoreEdgeReactionModel class inside RMG-Py/test/rmgpy/rmg/modelTest.py? |
Co-authored-by: sevyharris <30695172+sevyharris@users.noreply.github.com>
... Added the test case in commit 2731eda with the family corrected to The fix in
Fixed in commit 2731eda. The frozenset template comparison in the reverse-direction branch was indeed useless — forward and reverse templates carry different label names ( |
| A method that is run before each unit test in this class. | ||
| """ | ||
| test_family = "H_Abstraction" | ||
| test_families = ["H_Abstraction", "intra_H_migration"] |
There was a problem hiding this comment.
the templates below don't match the intra_H_migration family in the test database, and I doubt the test_database kinetics tree is large enough for the reaction's different transition states to end up at different nodes.
But the code it's testing doesn't depend on the specific templates anyways... so I don't know what to suggest here. I'd probably leave intra_H_migration out of the test_families because we don't want it checking templates from the test database.
There was a problem hiding this comment.
The test now uses H_Abstraction throughout with a comment explaining the template labels are intentionally fictitious. Fixed in commit 64d18e5.
rwest
left a comment
There was a problem hiding this comment.
Looks good to me, from reading the diff and having had a brief chat with @sevyharris about it.
|
@copilot the continuous integration tests are showing this error. Please explain the likely problem and try to fix. |
... The test was using |
|
This feels worse, labeling the test reactions "H_Abstraction" when they're not. I propose we load the intra_H_migration family in the test database and add the word "fictitious" to the labels to it's obvious that the specific content of the test reaction templates isn't being used. |
|
I agree, and was just about to make a comment. Please revert (or drop) the last copilot commit, and make a better fix. thanks
|
64d18e5 to
9fbb842
Compare
sevyharris
left a comment
There was a problem hiding this comment.
I successfully ran modelTest.py on my own computer, so hopefully it should pass this time. 🙏
Regression Testing Results
Detailed regression test results.Regression test aromatics:Reference: Execution time (DD:HH:MM:SS): 00:00:00:52 aromatics Passed Core Comparison ✅Original model has 15 species. aromatics Failed Edge Comparison ❌Original model has 106 species. Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)CsHH) + group(Cds-CdsCsCs) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + polycyclic(s3_4_5_ene_3) + polycyclic(s2_4_5_diene_1_5) + polycyclic(s2_5_5_diene_1_5) - ring(Cyclobutene) - ring(Cyclopentene) - ring(Cyclopentene) + radical(cyclopentene-allyl) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-CsCsHH) + group(Cds-CdsCsCs) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + Estimated bicyclic component: polycyclic(s2_3_5_ane) - ring(Cyclopropane) - ring(Cyclopentane) + ring(Cyclopentene) + ring(Cyclopropane) + polycyclic(s2_3_6_ene_1) + polycyclic(s3_5_6_diene_1_5) - ring(Cyclopropane) - ring(Cyclopentene) - ring(Cyclohexene) + radical(cyclopentene-4) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsCs) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-CsCsHH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + polycyclic(s2_4_4_ene_1) + polycyclic(s2_4_5_diene_1_5) + polycyclic(s3_4_5_ene_1) - ring(Cyclobutene) - ring(Cyclobutane) - ring(Cyclopentene) + radical(bicyclo[2.1.1]hex-2-ene-C5) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-CsCsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsCsH) + group(Cs-(Cds-Cds)CsHH) + group(Cds- Cds(Cds-Cds)Cs) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + polycyclic(s2_3_5_ene_1) + polycyclic(s2_3_6_ene_1) + Estimated bicyclic component: polycyclic(s3_5_6_ane) - ring(Cyclopentane) - ring(Cyclohexane) + ring(Cyclopentene) + ring(Cyclohexene) - ring(Cyclopropane) - ring(Cyclopentene) - ring(Cyclohexene) + radical(cyclopentene-allyl) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + group(Cds-CdsCsH) + group(Cdd-CdsCds) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(124cyclohexatriene) + ring(1,4-Cyclohexadiene) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)CsHH) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(1,3-Cyclohexadiene) + ring(1,3-Cyclohexadiene) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds- CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + group(Cds-CdsHH) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(1,3-Cyclohexadiene) + ring(1,3-Cyclohexadiene) + radical(Cds_P) Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cs-(Cds-Cds)(Cds-Cds)CsH) + group(Cds-Cds(Cds-Cds)(Cds-Cds)) + group(Cds- CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-CdsCsH) + group(Cds-Cds(Cds-Cds)H) + group(Cds-Cds(Cds-Cds)H) + group(Cds-CdsHH) + Estimated bicyclic component: polycyclic(s4_6_6_ane) - ring(Cyclohexane) - ring(Cyclohexane) + ring(1,3-Cyclohexadiene) + ring(1,3-Cyclohexadiene) Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: Non-identical kinetics! ❌
kinetics: DetailsObservables Test Case: Aromatics Comparison✅ All Observables varied by less than 0.500 on average between old model and new model in all conditions! aromatics Passed Observable Testing ✅Regression test liquid_oxidation:Reference: Execution time (DD:HH:MM:SS): 00:00:01:50 liquid_oxidation Failed Core Comparison ❌Original model has 37 species. liquid_oxidation Failed Edge Comparison ❌Original model has 214 species. DetailsObservables Test Case: liquid_oxidation Comparison✅ All Observables varied by less than 0.100 on average between old model and new model in all conditions! liquid_oxidation Passed Observable Testing ✅Regression test nitrogen:Reference: Execution time (DD:HH:MM:SS): 00:00:00:59 nitrogen Failed Core Comparison ❌Original model has 41 species. nitrogen Failed Edge Comparison ❌Original model has 133 species. Non-identical thermo! ❌
thermo: Thermo group additivity estimation: group(O2s-CdN3d) + group(N3d-OCd) + group(Cd-HN3dO) + ring(Cyclopropene) + radical(CdJ-NdO) Non-identical kinetics! ❌
kinetics: DetailsObservables Test Case: NC Comparison✅ All Observables varied by less than 0.200 on average between old model and new model in all conditions! nitrogen Passed Observable Testing ✅Regression test oxidation:Reference: Execution time (DD:HH:MM:SS): 00:00:01:38 oxidation Passed Core Comparison ✅Original model has 59 species. oxidation Passed Edge Comparison ✅Original model has 230 species. DetailsObservables Test Case: Oxidation Comparison✅ All Observables varied by less than 0.500 on average between old model and new model in all conditions! oxidation Passed Observable Testing ✅Errors occurred during observable testing
WARNING:root:Initial mole fractions do not sum to one; normalizing.
|


rmgpy/rmg/model.py: reverse-direction branch now returnsTruefor anyduplicate=Truepair, no template comparison neededsetup_classto load onlyH_Abstraction(removedintra_H_migrationper reviewer suggestion)test_check_for_existing_reaction_eliminates_same_reaction_from_reverse_multiple_templatesfamily="intra_H_migration"→family="H_Abstraction"in the new test (the family must be loaded insetup_class; the template labels are fictitious and only exercise deduplication logic)Original prompt
This section details on the original issue you should resolve
<issue_title>RMG including extra duplicate reactions</issue_title>
<issue_description>
Bug Description
RMG is putting more copies of a reaction into my mechanism than I expect:
Specifically, it's giving me 4 copies of this
intra_H_migrationreaction, when there should only be 2.How To Reproduce
Run RMG for 5 minutes using this input file to get this mechanism file/dictionary.
input.txt
chem_annotated.txt
species_dictionary.txt
Summary Code
Output
Expected Behavior
I would expect RMG to only include two versions of this reaction because there are only 2 possible methyl groups from which the O radical can steal a Hydrogen. This leads to path degeneracy of 3 for each reaction, but there are still only 2 distinct ways to do this. You can see there are only 2 distinct kinetics estimates.
Installation Information
Describe your installation method and system information.
Additional Context
I believe there may be more examples of this in my mechanism
I'm not the best at doing symmetry/molecule rotations in my head, so maybe I missed something and there are 4 distinct ways to do this reaction that can't be covered by the path degeneracy?</issue_description>
<agent_instructions>As well as the issue, read the comments.</agent_instructions>
Comments on the Issue (you are @copilot in this section)
@mjohnson541 The degeneracy changes from the training reactions used by the rules to the two kinds of rate coefficients look correct to me. So the duplicate reactions should not be there. I'm confused why this occurred for one and not both of them. It's possible it has something to do with how the template structures match this reaction its also possible that for some reason at some point degeneracy was not accounted for in that particular reaction and these duplicates were desirable. @rwest My suspicion is that `find_degenerate_reactions` (in `rmgpy/data/kinetics/common.py`) is failing to collapse together the three equivalent rates. i.e. we generate 6...🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.