Skip to content

Conversation

@kdraeder
Copy link
Collaborator

@kdraeder kdraeder commented Nov 18, 2025

Several changes are needed in cam, which will have a PR with the same title.
I'll provide the PR number after it's assigned.

This was tested using
ERI_D_C3_CG_Ld8.ne30pg3_g17.F1850C_LTso.derecho_intel.cam-outfrq9s_leapday

CIME/SystemTests/eri.py

  • Changed the year increment from 2 to 4; all runs will be in leap years,
  • which have Feb 29, which is the start date for outfrq9s_leapday.

CIME/test_scheduler.py

  • Distinguish carefully between the _C# and _CG test modifiers so that _CG does not confuse it.

Description

The ERI test cannot use both the _CG (gregorian calendar) _C# (multi-instance) modifiers
because some conditionals refer only to _C. This PR makes the scripts be more careful
in using the modifiers, and other changes to make the test work for leap days.
It should require no changes to other uses of the ERI test.
It requires a few changes in CAM, which will be handled in a CAM PR and referenced here.

  • Closes testmodifier overdefined #4811

  • Will lead to baseline failures for ERI tests due to a year difference (so will get failures due to "no original counterpart" / "no compare counterpart")

Checklist

  • [y] My code follows the style guidlines of this proejct (black formatting)
  • [y] I have performed a self-review of my own code
  • [?] My changes generate no new warnings
  • [y] I have added tests that excerise my feature/fix and existing tests continue to pass
    I have not tried existing tests with this modified code. I don't know yet which to run.
  • [y] I have commented my code, particularly in hard-to-understand areas
  • [y] I have made corresponding additions and changes to the documentation
    I plan to submit documentation changes in a separate PR. They are somewhat broader than this PR.
    I'll add it to this PR if that's preferable.

Several changes are needed in cam,
which will have a commit with the same title.

This was tested using
ERI_D_C3_CG_Ld8.ne30pg3_g17.F1850C_LTso.derecho_intel.cam-outfrq9s_leapday

CIME/SystemTests/eri.py
   Changed the year increment from 2 to 4; all runs will be in leap years,
   which have Feb 29, which is the start date for outfrq9s_leapday.
CIME/test_scheduler.py
   Distinguish carefully between the _C# and _CG test modifiers
   so that _CG does not confuse it.
@codecov
Copy link

codecov bot commented Nov 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 31.44%. Comparing base (d2ddac6) to head (82493b1).
⚠️ Report is 12 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #4899   +/-   ##
=======================================
  Coverage   31.44%   31.44%           
=======================================
  Files         264      264           
  Lines       38703    38703           
  Branches     8390     8390           
=======================================
+ Hits        12169    12172    +3     
+ Misses      25292    25291    -1     
+ Partials     1242     1240    -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kdraeder
Copy link
Collaborator Author

The CAM PR is #1437

Copy link
Member

@billsacks billsacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for these fixes, @kdraeder !

I have a couple of comments below - A requested change to the comment in the eri test, and a thought about further disambiguating the _C modifier (but I don't feel strongly about the latter).

I see that there are some formatting issues picked up by the pre-commit checks. The easiest way to fix these yourself is in the instructions here:

https://github.com/esmci/cime/wiki/Testing#install-pre-commit

but if you don't want to deal with figuring that out, I can run this for you and update your branch.

Comment on lines 681 to 682
elif case_opt.startswith("C") and not case_opt.startswith("CG"):
expect(ninst == 1, "Cannot combine _C# and _N options")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this fix, @kdraeder !

@kdraeder and @jedwards4b - What would you think about changing the CG modifier to instead be cG (lowercase G) to further remove some of this ambiguity? It doesn't look like we use any CG test modifiers in any CESM test list, so this shouldn't break anything... and it would be easier to change this now before we start getting tests with this modifier. This would be a departure from the current use of all uppercase as the first letter for a test modifier, so I don't love this inconsistency, but in my mind it's still better to have this disambiguation even if it means for some inconsistency in this respect. What do you both think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay by me.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would you think about changing the CG modifier to instead be cG (lowercase G)

I'm guessing that you meant "lower case 'C'" not 'G'.
I wondered about that possibility, but didn't know how to see whether it would be disruptive,
so thanks for looking into it!
I agree that it would be clearer to not use _C for 2 modifiers.
How about using just _G (currently unused) for gregorian and _C# for the ensemble size?
That would preserve the use of only capitol letters for modifiers too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry, I meant lower case 'C', not 'G' - sorry for the confusion.

I also thought about using _G. I think that would be a reasonable option, though with these downsides / caveats:

  • I wondered if there might be a future desire to use some other calendar besides Gregorian; if so, that argues for keeping the "c" syntax rather than having completely separate options for each calendar. This was my main rationale that led me to the thought of keeping "_cG" rather than just "_G".
  • This line implies that "G" is a currently-accepted option, though from a quick look I can't see where it's used so maybe this is some historical code that isn't true anymore??? In any case, a little more digging should be done if we want to go with "_G":

or opt.startswith("G") # handled in create_newcase

In all, I still lean towards "_cG", but don't feel strongly about it. So, @kdraeder, @jedwards4b or anyone else - if you have even moderately strong opinions about it, I'm happy to go with your feelings.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh, I missed that one.
I can see some value in being able to provide other calendars.
I don't have a deep background in what testing is or needs to be done,
so I don't have a strong opinion keeping the capital letter convention or adding modifiers.
A more thorough search leads me to believe that F, H, J, K, O, Q, S, T, U, W-Z are not being used.
_T for time? _K for kalender?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still vote for lowercase "_c": I'd prefer something more mnemonic even if it breaks with the current uppercase convention.

@jedwards4b , @jgfouca @jasonb5 - any opinions on using a lowercase _c as a test modifier to specify calendar, replacing the current uppercase _C, which is currently used for two different purposes? It would be the first use of a lowercase test option, though I personally feel okay about this precedent as we get more and more test modifiers and start running out of letters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine with me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kdraeder is it okay with you if I push a change to your branch to change to using _c as the test modifier for calendar?

Copy link
Collaborator Author

@kdraeder kdraeder Nov 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kdraeder is it okay with you if I push a change to your branch to change to using _c as the test modifier for calendar?

Yes, that should work well.
In which case the line that tests for _C and not _CG can be reverted.

Comment on lines 93 to 94
# KDR The default start date is a leap day, so 2-29 2 years later doesn't exist; run failure.
# start_2_year = start_1_year + 2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this fix, but the comment is confusing, since I think it only applies to a specific case, not a general issue with the ERI test. What about changing it to something like:

Suggested change
# KDR The default start date is a leap day, so 2-29 2 years later doesn't exist; run failure.
# start_2_year = start_1_year + 2
# Change the year for the hybrid case to make sure the system can handle this
# change in year.
#
# Note: When using a Gregorian calendar, it can be important to have consistency
# between whether the two years are leap years or not because of how this test is
# set up; so change year by 4 so that these will generally be consistent.

(Note that I have also removed your initials and removed the commented-out old code.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that comment is mostly left over from debugging.
Your suggestion looks good.
We could consider being more specific about "because of how this test is set up".
"because a testmod which tests the gregorian calendar starts the runs on leap day,
which must exist in all of the start years."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your updated wording - my ambiguity came from my not fully understanding the situation.

Can you change this wording to your suggested version on top of mine?

@jedwards4b
Copy link
Contributor

@kdraeder you can fix the remaining test issue by running
black CIME/test_scheduler.py
from the cime directory and commiting the result then push it back to this branch.
If you don't have black you can use my copy at /glade/u/home/jedwards/.local/bin/black

@kdraeder
Copy link
Collaborator Author

It looks like I'll need another piece or 2.

/glade/u/home/jedwards/.local/bin/black CIME/test_scheduler.py
Traceback (most recent call last):
File "/glade/u/home/jedwards/.local/bin/black", line 5, in
from black import patched_main
ModuleNotFoundError: No module named 'black'

module spider black
That's probably the wrong kind of module. Do I need a python module instead?
Lmod has detected the following error: Unable to find: "black".

@jedwards4b
Copy link
Contributor

I think that probably the best way to solve this is to go ahead and install pre-commit. Use pip or conda to install the pre-commit tool, then in your toplevel cime directory run pre-commit install. That should cause black to run automatically when you run git commit.

@kdraeder
Copy link
Collaborator Author

https://github.com/esmci/cime/wiki/Testing#install-pre-commit

> pip install pre_commit
installs several pieces, then gives me


Installing collected packages: distlib, platformdirs, nodeenv, identify, filelock, cfgv, virtualenv, pre_commit
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: 
'/glade/u/apps/derecho/24.12/opt/._view/secv6rfp5i74illomkwzkkct42yomdmp/lib/python3.11/site-packages/distlib'
Check the permissions.

[notice] A new release of pip is available: 23.1.2 -> 25.3
[notice] To update, run: python3 -m pip install --upgrade pip

1[1522] derecho6:/<4>cesm3.0_testing_PRs/cime/CIME > python3 -m pip install --upgrade pip
Requirement already satisfied: pip in 
/glade/u/apps/derecho/24.12/opt/view/lib/python3.11/site-packages (23.1.2)
Collecting pip
  Downloading pip-25.3-py3-none-any.whl (1.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/1.8 MB 42.3 MB/s eta 0:00:00
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.1.2
    Uninstalling pip-23.1.2:
ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: 
'/glade/u/apps/derecho/24.12/opt/._view/secv6rfp5i74illomkwzkkct42yomdmp/bin/pip'
Check the permissions.

@jedwards4b
Copy link
Contributor

Oh yea, the git commit will fail the first time but it should show that it black reformats the file.
Then run git commit again and it should take. Then push the new commit to your branch.

@jedwards4b
Copy link
Contributor

You need to use: pip install --user

--user Install to the Python user install directory for your platform. Typically ~/.local/, or %APPDATA%\Python on Windows. (See the Python documentation
for site.USER_BASE for full details.)

@kdraeder
Copy link
Collaborator Author

Use pip or conda to install the pre-commit tool,
Pip didn't work, and neither does conda.
> conda install -c conda-forge pre_commit

EnvironmentNotWritableError: The current user does not have write permissions to the target environment.
  environment location: /glade/u/apps/opt/conda
  uid: 4241
  gid: 69284

@kdraeder
Copy link
Collaborator Author

You need to use: pip install --user

--user Install to the Python user install directory for your platform. Typically ~/.local/, or %APPDATA%\Python on Windows. (See the Python documentation for site.USER_BASE for full details.)

I'm sorry that I don't have more competence with python.
site.USER_BASE seems to be a path name variable in a python module.
The documentation seems to tell me that I can set it by environment variable PYTHONUSERBASE.

> setenv PYTHONUSERBASE ~/.local
> ls $PYTHONUSERBASE
etc lib share
> pip install --user pre_commit
ERROR: Can not perform a '--user' install. User site-packages are not visible in this virtualenv.

Since my changes are so limited, maybe I should skip using a tool to fix the formatting
and just look at a document that tells me how it should look.
But I suppose it will be useful to be able to use black in the future . . .

@billsacks
Copy link
Member

just look at a document that tells me how it should look.

I don't think such a document exists... the guidance at this point for CIME, as for many python projects, is "just run black and use whatever formatting it provides". But if you don't want to go down the rabbit hole of getting it set up this time, I can run it for you once you are done making other changes.

@kdraeder
Copy link
Collaborator Author

just look at a document that tells me how it should look.

the guidance at this point for CIME, as for many python projects, is "just run black and use whatever formatting it provides".

@jedwards4b has approved this PR without me running black, but I should still be able to use it for future PRs.
Should I open an issue with the help desk, or do you have a path for me to follow to set it up?

@jedwards4b
Copy link
Contributor

I think that your attempt yesterday to install pre-commit was affected by the github outage. Please try again today.

@kdraeder
Copy link
Collaborator Author

I think that your attempt yesterday to install pre-commit was affected by the github outage. Please try again today.

I get the same errors today. Do I need to set up more python environment before these commands?

> setenv PYTHONUSERBASE ~/.local
> pip install --user pre_commit
ERROR: Can not perform a '--user' install. User site-packages are not visible in this virtualenv.
[notice] A new release of pip is available: 23.1.2 -> 25.3
[notice] To update, run: python3 -m pip install --upgrade pip

> conda install -c conda-forge pre_commit

EnvironmentNotWritableError: The current user does not have write permissions to the target environment.
environment location: /glade/u/apps/opt/conda
uid: 4241
gid: 69284

@jedwards4b
Copy link
Contributor

It appears that you are already in one of the system conda environments - you have two choices:

  1. Clone that environment and use conda install to add pre-commit to your clone
  2. get out of conda altogether and then run pip install --user

@kdraeder
Copy link
Collaborator Author

2. get out of conda altogether and then run pip install --user

I ran conda deactivate several times, but pip install --user {pre_commit} fails with the same message.
> conda env list
shows a dozen environments
Is there some other kind of deactivation I should use?

@jedwards4b
Copy link
Contributor

I suspect that the problem is that you load a conda environment in your dot files so that everytime you remove it it just reloads again.

@kdraeder
Copy link
Collaborator Author

I removed the conda parts of my .tcshrc file, and I don't see it in any other dot files.
I opened a new xterm on derecho, which is not aware of a conda command, so that seems good.
But the pip command, with and without --user, fails in the same ways.
You probably have much more pressing things to do, so if you'd like me to get help from the consultants,
I'll do that.

@billsacks
Copy link
Member

@kdraeder (and others) - I have pushed the change we discussed, using lowercase _c for calendar.

I tested by creating the following tests and checking the relevant xml variables (I didn't build or run the tests):

  • SMS_Ld3_D_P8x1_C3.f10_f10_mg37.X.green_gnu
  • SMS_Ld3_D_P8x1_cG.f10_f10_mg37.X.green_gnu
  • SMS_Ld3_D_P8x1_cG_C3.f10_f10_mg37.X.green_gnu

Note that I reverted some of the error message changes that had changed "_C" to "_C#": I felt it was inconsistent to refer to this as "_C#" when we just refer to "_N" without the "#", and now it's unnecessary with the change from _C to _c for calendar. I'd be fine returning to referring to this as "_C#", but if so, feel we should also refer to "_N#" for consistency.

@kdraeder - whenever you get a chance, let me know if you feel this is ready to go; if so, I can merge it.

@kdraeder
Copy link
Collaborator Author

@billsacks
Thanks for that context! The main goal is to develop a test of capabilities that DART needs.
For a while it looked like existing tests could do that, but now it's definitely feeling complicated and awkward.

@jedwards4b and I have both suggested the option of a new testmod or test, so I'll work on that.
Doing that would also make it possible to test transient case(s), as you and Erik developed for CTSM,
so it would be its own tidy package to test multiple things that DART needs.
In fact, dartcambigens does a lot of what's needed, so thanks for reminding me about that.
I'll do it in a new PR.

@kdraeder
Copy link
Collaborator Author

Note that I reverted some of the error message changes that had changed "_C" to "_C#":

This is good; it minimizes changes but now has a clear meaning.

@kdraeder
Copy link
Collaborator Author

but now it's definitely feeling complicated and awkward.

The most recent problem I was trying to solve is that if _cG is specified with an 1850 compset,
then the 1850_cam_lt.xml use case will be used, which fetches some CAM external forcing files
that can't handle leap day when asked to provide 1850 data.
A new use_case would solve that, but it seems too hard to implement, so we still need a solution.
Should it be in a testmod, which would override the 1850_cam_lt use_case
and request leap years (1852) from some files?
A testmod would be in CAM, but a note or restriction in cime would be helpful.

@jedwards4b
Copy link
Contributor

I guess I don't understand why you ever need to use a Gregorian calendar with an 1850 cyclic compset?

@kdraeder
Copy link
Collaborator Author

Maybe only people who are coming at the testing from the direction of
"I want to test all the run-starting modes of my CESM, ERI looks like one that does that efficiently,
and I want to use the gregorian calendar, which seems to be an option."
If they should really be using a different test, then there should be some error or warning
during the create_test build (or at least the documentation) about not using _cG with spinup compsets
andor not using transient compsets in ERI.
If that's the case, I'll work on a new test, not just a testmod.

@jedwards4b
Copy link
Contributor

I think that the ERI test is fine to use. But the 1850 compset is by definition a cyclic compset that runs the year 1850 over and over again - it never advances to another year. If you use the BHIST compset or another transient then you can use a Gregorian calendar.

@kdraeder
Copy link
Collaborator Author

I've been working under the assumption that the ERI should not handle transient compsets,
although @billsacks and @brian-eaton showed that can work with the right testmod.
I'll switch gears and try using a HIST compset with their fix and see whether the associated use_case
is sufficient.
Thanks for the clarification!

@kdraeder
Copy link
Collaborator Author

kdraeder commented Dec 1, 2025

  • I imported @billsacks changes to use a _cG modifier instead of _CG.
  • I added a check for incompatibility between _cG and the chosen compset; no YYYY... compsets to handle all spinup compsets.
  • I added variables to the cam testmod outfrq9s_leapday/user_nl_clm to prevent CLM from stopping due to year inconsistencies (from @billsacks and @brian-eaton ). Hopefully this won't break any other tests in ways they shouldn't be broken.
  • This test uses the default use_case for the chosen HIST compset, so I've deleted the use_case I added earlier (leapyears_cam_lt.xml).

The test result mostly looks like it succeeded:
with GREG_TEST = ERI.ne3.FHISTC_LT.outfrq9s_leapday/ERI_D_C3_cG_Ld8.ne3_ne3_mg37.FHISTC_LTso.derecho_intel.cam-outfrq9s_leapday

GREG_TEST (Overall: FAIL) details:
PASS GREG_TEST CREATE_NEWCASE
PASS GREG_TEST XML
PASS GREG_TEST SETUP
PASS GREG_TEST SHAREDLIB_BUILD time=138
PASS GREG_TEST MODEL_BUILD time=131
PASS GREG_TEST SUBMIT
PASS GREG_TEST RUN time=3847
> FAIL GREG_TEST COMPARE_base_hybrid
PASS GREG_TEST COMPARE_base_rest
PASS GREG_TEST MEMLEAK insufficient data for memleak test
PASS GREG_TEST SHORT_TERM_ARCHIVER

Does this mean the test or code is flawed?
Is the _base_hybrid case missing, so I should make one, or try a different resolution, or ...?

@billsacks
Copy link
Member

@kdraeder - from looking at /glade/work/raeder/Exp/CESM+DART_testing/ERI.ne3.FHISTC_LT.outfrq9s_leapday/ERI_D_C3_cG_Ld8.ne3_ne3_mg37.FHISTC_LTso.derecho_intel.cam-outfrq9s_leapday.20251126_150633_nbnwy9/TestStatus.log (look for "had no original counterpart"), it looks like the issue is that cice and mosart both have history files that are present in one case but not another. This can cause test failures. I'm guessing this is coming because of the combination of start date and length of the test, which leads to a cross of the month boundary for one run but not another. I think the best solution is to ensure that all components in this test have at least daily output by adding user_nl_mosart and user_nl_cice files for this testmod that give daily or more frequent output for those components.

I added variables to the cam testmod outfrq9s_leapday/user_nl_clm to prevent CLM from stopping due to year inconsistencies (from @billsacks and @brian-eaton ). Hopefully this won't break any other tests in ways they shouldn't be broken.

I'm not concerned about these changes (check_finidat_year_consistency = .false. and for_testing_allow_non_annual_changes = .true.) breaking other tests: all these do is to bypass some error checks that we typically want enabled. My main concern would be bypassing these error checks in cases where they don't actually need to be bypassed, if this testmod is used for more general purposes. That wouldn't be a huge deal, though as a general rule I prefer to make the fewest changes needed from the out-of-the-box setup, if it's not too much trouble.

I don't have strong feelings about what should be done in this particular case, though in case it's helpful, I'll point out the capability to have multiple testmods so you can add some settings in a second testmod without altering the original: e.g.,:

ERI_D_C3_cG_Ld8.ne3_ne3_mg37.FHISTC_LTso.derecho_intel.cam-outfrq9s_leapday--cam-MY_NEW_TESTMOD

(note the -- syntax for combining multiple testmods).

@kdraeder
Copy link
Collaborator Author

kdraeder commented Dec 2, 2025

Thanks to @cacraigucar for setting me up with the best CESM and CAM to use for a new test
and to @billsacks for identifying the problem with the cice and mosart history frequencies.

/glade/work/raeder/Exp/CESM+DART_testing/ERI.ne3.FHISTC_LT.outfrq9s_leapday+cice+mos-daily2
seems to have passed all the tests, so I'll push the cime changes to this PR and the cam changes to
CAM PR #1437

Let me know if I should check anything besides the output from cs.status.

@billsacks
Copy link
Member

@kdraeder - great, thanks for all of your work on this! I think that if the test passes according to cs.status, that's sufficient testing for the CIME changes (the CAM folks may want more). Please let us know once you have pushed all of your CIME changes.

@billsacks billsacks requested a review from jgfouca December 2, 2025 23:27
@cacraigucar
Copy link
Contributor

@kdraeder - I believe the CAM changes are minimal (once you do your push of your changes and remove the unused use_case). I would suggest you go ahead and push your changes to CAM as well.

@billsacks
Copy link
Member

@jgfouca - I'm requesting a formal approval from you on this before merging since it changes behavior of the ERI test and will lead to baseline comparison failures for the ERI test due to "no original counterpart" / "no compare counterpart", because now we add 4 years instead of 2. Let us know if that's a problem from the E3SM side.

Copy link
Contributor

@jgfouca jgfouca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We (E3SM) don't have any ERI tests in any of our suites, so I'm fine with a behavior change if you guys need it.

CIME/test_scheduler.py
  Uses _CG -> _cG change from Sacks
  Checks for _cG incompatibility with the compset (spinup not allowed)

Also needs changes to cam:
  outfrq9s_leapday/user_nl_clm (from Sacks and Eaton)
     Prevented CLM failure due to year check
  outfrq9s_leapday/user_nl_{cice,mosart}
     Added to make history output daily, which makes the ERI output
     comparisons work.
  outfrq9s_leapday/shell_scripts
     Kept the change of start year from 2012 to 2008

The ERI_D_C3_cG_Ld8.ne3_ne3_mg37.FHISTC_LTso.derecho_intel.cam-outfrq9s_leapday
test passes.  It has not been tested for standard grids andor B compsets yet.

Related changes to system_testing.rst will be suggested in a later PR.
Comment on lines 694 to 702
elif case_opt.startswith("cG"):
# Need to examine the long name.
# spinup cases are 1850, 2000, 2010
# not HIST, SSP###
spinup_po = re.compile(r'([\d]{1,4})')
spinup = spinup_po.match(compset)
expect(spinup == None,
"Gregorian calendar (_cG) cannot be used with spinup compset \n {}".format(compset)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer that this check not be here, for a few reasons:

First, I don't believe that there is a fundamental limitation of the system that you can't use the Gregorian calendar with spinup compsets. My impression from the discussion here is that there are problems in some components (e.g., CAM) when you try to use a Gregorian calendar in a spinup compset because of the way they have set up streams, but this is not a fundamental limitation of the system, and in the future we may want to fix that component behavior or have Gregorian spinup tests of a compset with components that don't have that issue – for example, I would expect a Gregorian X compset (coupler-test compset) test to pass currently; I haven't tried that, but if it fails I would like to be able to see that failure in whatever way it shows up rather than having CIME prevent me from running it at all.

An additional concern I have is that the mechanism for checking whether it's a spinup compset, by checking whether the compset name starts with a number, is - I believe - model-specific, not something that's prescribed by CIME, and so it doesn't feel like this logic belongs in CIME (though I could be wrong about that).

Also, if people feel it's important to have this check, then it should probably be done in a way that catches (probably more common) alternative ways of setting CALENDAR – i.e., by setting the CALENDAR XML variable after creating a case. And if this check is kept, I would prefer to see it moved to the component(s) where it is actually an issue. So, for example, this could be moved to the CAM build-namelist.

@kdraeder and/or @jedwards4b - if, after all this, you still feel it's best to have this check in CIME, I can accept it, but I then have a few specific questions / requests about the implementation of this check:

  • Can it be put in the block that already parses the cG option (further down in this file) rather than having the cG option checked in two places?
  • Is this regex really what's intended... in particular the inclusion of 1 in the {1,4}? It looks like this will match any compset that starts with a single digit... and if that's the intent then a simpler regex of just '\d' would be sufficient. I think the intent may be to only match a compset that starts with 4 digits, in which case I think the regex you want is r'\d{4}'
  • If this is kept, I'd like a more detailed comment explaining what the problem is with having a Gregorian test of a spinup case, with a goal of explaining whether / when this check could be removed in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the assumption that if the compset name starts with a digit it's a spinup compset is misguided and this check should be removed. However I do not understand the utility or need to support a Gregorian calendar in a spinup compset.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe that there is a fundamental limitation of the system that you can't use the Gregorian calendar with spinup compsets.

@billsacks Thanks for your insights!

Maybe there should be no gregorian modifier and it should be handled in the place(s) Bill suggests.
If it is handled in create_test, and if there's an actual limitation, there should be a check somewhere.
I agree that it should be clear, focused, and easy to remove.

I agree that there's not a fundamental limitation, but in addition to @JEdwards point, we discussed (above) the difficulty of specifying the correct (external forcing) namelist entries via the use_cases. It's beyond me to make the use_case aware of the calendar, so that it could specify suitable year(s) to extract from the forcing files (like it does with the grid). It could probably be done through a new testmod(s?). I haven't figured out why both use_cases and test_mods can alter the namelists, but one is user-selectable and the other isn't. That might answer the question of whether and how ERI for spinup compsets should handle gregorian.
If this were moved to the components' build-namelist, would the user have control over it in a way that's clear to people who run these tests?

If this limitation will stay for the time being, then:
I wasn't completely comfortable relying on the compset starting with 4 digits to flag spinup cases,
so if there are better solutions we should use them. But that seemed to be the convention.
Yes, the {1,4} should be {4} or {4,4}.

I tried to put this check in a place where _cG was already being handled, but it seemed that the compset was not available there. If it can be made available, then moving the check is good. Moving it into the component(s) would be fine too. It will take me a while to figure out how to do that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However I do not understand the utility or need to support a Gregorian calendar in a spinup compset.

I also don't see enough scientific need for this to justify spending time on getting it working right now, but I don't think it should be fundamentally dis-allowed. Two things pop to mind: First, the use of a 365-day calendar by default is a model-specific thing, not a CIME thing; in principle, a model using CIME could decide that it wants to use a Gregorian calendar for all of its cases by default. (So I guess this is just a further argument that, if there is any logic about this, it should be in a component, not in CIME.) Second, I can at least see some theoretical reasons why someone might want to use a Gregorian calendar for a spinup run in CESM. One big one is for consistency: If you know you're going to use a Gregorian calendar for the later transient run, then you may want your spinup run to be consistent with that. I could see there being other reasons, too, that I / we aren't thinking of now.

I am generally in favor of adding error checking like this along with meaningful error messages to catch things that we know can't work together. I just want to see this localized to the place where the problem actually occurs so we don't end up preventing things that should actually work.

Maybe there should be no gregorian modifier and it should be handled in the place(s) Bill suggests.

I don't have strong opinions one way or another on this. But I'd love to see this PR move towards resolution, so unless anyone else has strong-ish feelings, I'd vote for keeping this modifier in place as it is in this PR now.

we discussed (above) the difficulty of specifying the correct (external forcing) namelist entries via the use_cases

Yes, agreed... but I feel this is a CAM limitation, not a CIME limitation, so if there is a check, it belongs in CAM's build-namelist.

I haven't figured out why both use_cases and test_mods can alter the namelists, but one is user-selectable and the other isn't.

Don't get me started on use_cases... the short answer is that I'm not a fan of them, but CAM uses them extensively, so... 🤷🏻‍♂️

If this were moved to the components' build-namelist, would the user have control over it in a way that's clear to people who run these tests?

I'm not sure what you're asking here; can you clarify?

I wasn't completely comfortable relying on the compset starting with 4 digits to flag spinup cases,
so if there are better solutions we should use them.

There may be some additional options opened up if this check is moved into CAM's build-namelist, which I think is where it belongs. At the very least, it would at least be somewhat better to have this assumption about compset name be there rather than in CIME.

I tried to put this check in a place where _cG was already being handled, but it seemed that the compset was not available there. If it can be made available, then moving the check is good. Moving it into the component(s) would be fine too. It will take me a while to figure out how to do that.

Ah, okay, thanks for clarifying.

I have appreciated your initiative in trying to get all of the needed changes in place yourself. That said, my suggestion at this point would be to take this check out of CIME and open an issue in CAM requesting a more clear error message when someone tries to use a GREGORIAN calendar in a non-transient case, and thus starting a discussion with the CAM developers about the best way to implement the check there.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this were moved to the components' build-namelist, would the user have control over it in a way that's clear to people who run these tests?

I'm not sure what you're asking here; can you clarify?

I confused myself here. I was thinking that the _cG might be moved from create_case to build-namelist, but we're talking about where the check should be, so _cG would stay in create_case
and be passed to build-namelist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a check of compset+calendar would need to be in several build-namelists

It's possible that other components don't have this issue and can safely be run with a GREGORIAN calendar. Actually, now I'm confused about why you ran into issues with this in CAM: digging into the CDEPS code, it looks like streams should be set up to handle discrepancies between the model calendar and the streams calendar:

https://github.com/ESCOMP/CDEPS/blob/95e0bac4a723496b0da57d9d290c86e449c97238/streams/dshr_strdata_mod.F90#L807-L854

And indeed, I just successfully ran this CLM-only test: ERI_D_Ld9.f10_f10_mg37.I1850Clm60Bgc.derecho_gnu.clm-gregorian (where I introduced a new test mod that simply set CALENDAR=GREGORIAN to avoid depending on the changes in this cime PR)

This is a further argument to me that this error check should only exist in the component(s) that actually have an issue running this case.

Maybe the issue is that CAM doesn't use CDEPS's streams code in many places, but instead implements its own stream reading? From one spot-check, it looks like attempts were made even there to handle this discrepancy as well... though maybe this doesn't handle all cases, and maybe not all of CAM's streams have similar handling?:

https://github.com/kdraeder/CAM/blob/fc28fce1ff3f2a33840536e7128cb1eb8337b277/src/chemistry/utils/mo_flbc.F90#L130-L137

Should this be a separate PR? If it is, then we'll be closing this PR with code that still has
one of the problems that led me to open the issue. I guess we could close it with a note about that.
And would the second PR be another partial fix of issue #4811, or a new issue?

It would need to be a PR in CAM - you could ask over in your existing CAM PR whether they want you to include it in your existing CAM PR or open a new CAM PR. It's fine to merge this CIME PR without everything from #4811 being fully resolved, especially because some of the issues in #4811 are really outside of CIME. The CAM PR(s) would be a fix of a new issue, in CAM.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just following up from my last comment: I meant to add:

Based on what I see in code and my being able to run an 1850 Gregorian case in CLM, I think you should only worry about putting an error check about Gregorian+spinup compset into CAM, unless you have run into issues running this combination in other components. And even in CAM, it may be that this is a bug and that in fact this combination should work... I'd suggest discussing this with the CAM developers.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed this inconsistency check from my clone by resetting back to @billsacks latest commit (0afcb5f).
> git reset --hard 0afcb5fb2
My clone (and branch) is now 8 commits behind my github kdraeder fork. Git suggests that I update my clone, but I want to change my fork instead, to make 0afcb5f be the HEAD(?). Can I do that by pushing my clone to my fork? Or do I need to do something directly in my fork?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the issue is that CAM doesn't use CDEPS's streams code in many places, but instead implements its own stream reading?

My early 1850 test failed with:

PET00 ESMCI_Calendar.C:1059 ESMCI::Calendar::convertToTime()
Input argument out of range - ;
Gregorian: for February 1850, dd=29 > 28 days in the month.

Which led me to

&chem_surfvals_nl
flbc_cycle_yr = 1850

It looked like I should just change these years to leap years (in the 1850_cam_lt use_case).
I didn't track down any potential coding causes of the failure; it seemed more likely that the namelist values were set incorrectly than that the code is insufficient.
I'll describe this more in the CAM issue that I open.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First: I realized that my CTSM test wasn't a good one because it started Jan 1. I'm retrying with a test that starts in late Feb of 1852 and will post back here when I see those test results.

I've removed this inconsistency check from my clone by resetting back to @billsacks latest commit (0afcb5f).
> git reset --hard 0afcb5fb2
My clone (and branch) is now 8 commits behind my github kdraeder fork. Git suggests that I update my clone, but I want to change my fork instead, to make 0afcb5f be the HEAD(?). Can I do that by pushing my clone to my fork? Or do I need to do something directly in my fork?

It can be very problematic to use git reset --hard on a branch that has already been pushed, since pushing after that would be a destructive operation that rewrites history. As a general rule, you should only use that on your own work before pushing it to a place shared with other people, and even then it should be used with extreme caution because you can lose work with that command.

Here's what you should do at this point:

(1) Update to the latest version: git pull

(2) Manually delete your recently-added check, commit and re-push. There are ways you could do this with git, but in this case, since there are multiple commits touching it and it's just a single self-contained block of code, it will be easier to just remove this manually and do an additional commit.

It looked like I should just change these years to leap years (in the 1850_cam_lt use_case).
I didn't track down any potential coding causes of the failure; it seemed more likely that the namelist values were set incorrectly than that the code is insufficient.
I'll describe this more in the CAM issue that I open.

Sounds good. So it looks like the failure is in the module where I did find some handling of leap-year discrepancies, but maybe some other handling is needed... I haven't dug into this code very far.

@billsacks
Copy link
Member

billsacks commented Dec 4, 2025

Okay, this test basically passed:

ERI_D_Ld9.f10_f10_mg37.I1850Clm60Bgc.derecho_gnu.clm-gregorian_feb27

where the test mod contains:

./xmlchange CALENDAR=GREGORIAN
./xmlchange RUN_STARTDATE=1852-02-27

It failed the comparison of base & hybrid, but that's just because I forgot to turn on daily output so there were some missing history files. But the runs themselves were successful. So, unless I'm forgetting / missing something else, it seems like it is possible to run a Gregorian case for a spinup compset. Note that CTSM is using some streams with fixed-1850 years in this case, so some correction must be happening for the leap year.

I have a Feb 29 start date one in the queue.

  • Update: ERI_D_Ld9.f10_f10_mg37.I1850Clm60Bgc.derecho_gnu.clm-gregorian_feb29 also runs successfully (though fails the COMPARE_base_hybrid, as above).

Copy link
Member

@billsacks billsacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now. Thank you for all of your iterations on this, @kdraeder !

I see there is a failing test that seems unrelated to this PR. I'm going to try updating to latest master and rerunning the tests.

@billsacks billsacks merged commit d5e1e03 into ESMCI:master Dec 5, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

testmodifier overdefined

5 participants