fix restart issues that arise with time averaged fields #27
Open
mvertens wants to merge 2 commits into
Summary
Restores bit-for-bit reproducibility across restarts by reverting the
`esm_history_vars` default in `cime_config/namelist_definition_cism.xml`
from `_tavg` variants back to instantaneous fields for both the `gris`
and `ais` ice sheets.
Root cause
In `cismwrap v8` the default `esm_history_vars` list was changed from
instantaneous fields (e.g., `acab_applied`, `calving_rate`,
`total_smb_flux`) to their time-averaged variants (`acab_applied_tavg`,
`calving_rate_tavg`, `total_smb_flux_tavg`, etc.). With these `_tavg`
variants in the history list, the model no longer reproduces bit-for-bit
across a restart.
The `_tavg` fields are auto-generated by `utils/build/generate_ncvars.py`
in CISM. The codegen explicitly sets `load: 0` on every generated variant,
so the `_tavg` accumulator arrays are never written to or read from the
CISM restart file. The per-output-file scalar (`outfile%total_time`) that
normalizes the accumulator at write time is also reset whenever a new
output file is opened post-restart. As a result, a continuous run and a
restart run accumulate over different windows and divide by different
`total_time` values, producing different history-file values at the same
model time.
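A toy model may make the mechanism concrete. The sketch below is not CISM code: the `TavgOutput` class, the flux series, and the step counts are hypothetical. It only assumes what is stated above, namely that the accumulator and its `total_time` normalizer are absent from the restart file and therefore start from zero after a restart.

```python
class TavgOutput:
    """Toy stand-in for one time-averaged history field (hypothetical, not CISM's API)."""

    def __init__(self):
        self.accum = 0.0       # running sum of value * dt
        self.total_time = 0.0  # elapsed time covered by the sum

    def accumulate(self, value, dt):
        self.accum += value * dt
        self.total_time += dt

    def write(self):
        """Return the time average and reset both pieces of state."""
        avg = self.accum / self.total_time
        self.accum = 0.0
        self.total_time = 0.0
        return avg


def run(values, dt, restart_at=None):
    """Step through `values` and write one time average at the end.

    If `restart_at` is given, the accumulator state is dropped at that step,
    mimicking a restart file that does not carry it.
    """
    out = TavgOutput()
    for step, value in enumerate(values):
        if step == restart_at:
            out = TavgOutput()  # accumulator state lost across the restart
        out.accumulate(value, dt)
    return out.write()


flux = [1.0, 2.0, 3.0, 4.0]  # made-up per-step values of some flux
continuous = run(flux, dt=1.0)
restarted = run(flux, dt=1.0, restart_at=2)
print(continuous, restarted)  # 2.5 3.5
```

The continuous run averages over all four steps (2.5), while the restarted run averages only the two steps after the restart (3.5), even though the underlying values are identical in both runs.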
The prognostic state is still bit-for-bit across restart; only the
diagnostic `_tavg` outputs differ. Even so, any BFB check that compares
history files flags the run as a failure.
This was masked in earlier wrapper releases because `esm_history_vars`
listed only instantaneous fields, which are deterministic at the output
moment and do not depend on accumulator state.
What this PR changes
`cime_config/namelist_definition_cism.xml`: for the two
`cism_evolve_this_icesheet=".true."` entries (`gris` and `ais`), drop
the `_tavg` suffix from all fields. `frontal_melt_rate_tavg` is mapped
to the corresponding instantaneous variable `melt_rate`. All other v8
additions (per-icesheet split, the extra prognostic fields like
`velnorm`, `btemp`, `tempstag`, `powerlaw_c`, `base`) are kept.
Trade-off
The history files lose time-averaging in favour of snapshot values at the
output moment. Time-averaged diagnostics can be reconstructed post-hoc, or
reintroduced after the upstream CISM `_tavg` restart-save issue is fixed.
Upstream issue
The proper fix lives in CISM: both the `<var>_tavg` arrays
(`generate_ncvars.py` line 126) and the `outfile%total_time` scalar
(`libglimmer/ncdf_template.F90.in`) need to be saved in the restart file
together. That work is tracked separately.
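To illustrate why both pieces have to travel together, here is a minimal Python sketch. It is a toy rather than CISM code: the `accumulate` and `average` helpers and the dictionary layout are assumptions made purely for illustration. It suggests that restoring both the accumulated sum and the elapsed time reproduces the continuous average, whereas restoring only the sum does not.

```python
def accumulate(state, value, dt):
    """Add one step to a (sum, elapsed time) accumulator held in a dict."""
    state["accum"] += value * dt
    state["total_time"] += dt


def average(state):
    """Normalize the accumulated sum by the elapsed time."""
    return state["accum"] / state["total_time"]


flux = [1.0, 2.0, 3.0, 4.0]  # made-up per-step values
dt = 1.0

# Continuous run: all four steps in one go.
cont = {"accum": 0.0, "total_time": 0.0}
for v in flux:
    accumulate(cont, v, dt)

# First leg: two steps, then a hypothetical restart write.
leg1 = {"accum": 0.0, "total_time": 0.0}
for v in flux[:2]:
    accumulate(leg1, v, dt)

# Second leg, restarted two ways.
both = dict(leg1)                                        # sum and elapsed time restored
sum_only = {"accum": leg1["accum"], "total_time": 0.0}   # elapsed time lost
for v in flux[2:]:
    accumulate(both, v, dt)
    accumulate(sum_only, v, dt)

print(average(cont), average(both), average(sum_only))  # 2.5 2.5 5.0
```

Restoring only the sum divides four steps' worth of accumulation by two steps of elapsed time, so the result is wrong in a different way than dropping both, which is why the two quantities need to be saved and restored as a pair.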
Test plan
- […] (`./preview_namelists`) on an existing case and confirm
  `esm_history_vars` in `cism.<icesheet>.config` no longer contains any
  `_tavg` entries.
- […] year-3 history files are bit-for-bit identical between the
  continuous and restart runs.
- […] run that didn't trigger the BFB-restart check (i.e., the model
  trajectory itself is unaffected; only what gets written to history
  changes).
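The suffix handling described under "What this PR changes" and the `_tavg` check in the test plan could be scripted roughly as follows. This is a hedged sketch, not part of the PR: the helper names `to_instantaneous` and `find_tavg_entries` are hypothetical, the only special case encoded is the `frontal_melt_rate_tavg` to `melt_rate` mapping stated earlier, and the check simply scans text for tokens ending in `_tavg` without assuming anything else about the layout of `cism.<icesheet>.config`.

```python
import re

# Special case named in the PR description; every other field just loses the suffix.
SPECIAL_CASES = {"frontal_melt_rate_tavg": "melt_rate"}


def to_instantaneous(name):
    """Map a history variable name to its instantaneous counterpart."""
    if name in SPECIAL_CASES:
        return SPECIAL_CASES[name]
    if name.endswith("_tavg"):
        return name[: -len("_tavg")]
    return name


def find_tavg_entries(text):
    """Return every whole token ending in _tavg found in the given text."""
    return re.findall(r"\b\w+_tavg\b", text)


# Made-up variable list mixing averaged and instantaneous names.
before = "acab_applied_tavg calving_rate_tavg frontal_melt_rate_tavg velnorm"
after = " ".join(to_instantaneous(n) for n in before.split())

print(after)                      # acab_applied calving_rate melt_rate velnorm
print(find_tavg_entries(before))  # three _tavg names
print(find_tavg_entries(after))   # []
```

In practice the text passed to `find_tavg_entries` would be the contents of the generated config file, and an empty result would correspond to the first test-plan item passing.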