SIPNET workflow for restarting with events #3919
ashiklom wants to merge 77 commits into PecanProject:develop
...but we reset to initial conditions every time.
Based on the actual JSON contents.
c40f98a to f061e09
Start Jan 1 of first planting year, end Dec 31 of last harvest year
OK, I think I've resolved all of @divine7022's and @infotroph's comments. I also resolved a significant bug with [...]
Note that this PR sits on top of #3836 and #3828; if we merge this, we can probably just close those.
All of this already works in this PR. The new [...]
settings <- PEcAn.workflow::runModule.run.write.configs(
  settings_raw,
  input_design = sens_design$X
)
source("workflows/sipnet-restart-workflow/utils.R")
jobfiles <- write_segmented_configs.SIPNET(settings, sens_design$X, force_rerun = TRUE)
PEcAn.workflow::runModule_start_model_runs(settings)
Whatever method the user has specified for running [...]
infotroph left a comment
This looks great! I was able to run through 3 ensembled sites in a lightly modified MAGiC ensemble workflow using just utils.R with the changes noted below. I think this is ready to merge. Next iteration can be to decide which package(s) to drop the functions into.
    cls == "F" ~ "annual_crop",
    cls == "G" ~ "grass",
    cls == "P" ~ "grass",
    cls == "R" ~ "grass",
Yes, this table is still temporary, but needed this one for the sites I grabbed
-   cls == "R" ~ "grass",
+   cls == "R" ~ "grass",
+   cls == "T" ~ "annual_crop",
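For reference, the class-to-PFT mapping shown in this thread can also be written as a plain lookup table, which may be easier to consolidate later. This is only a sketch: the codes and PFT names are the ones visible in the diff, while `class_to_pft`, `lookup_pft`, and the `NA` fallback for unknown codes are hypothetical and not part of the PR.

```r
# Class codes and PFT names copied from the diff in this thread.
# The object and function names are hypothetical.
class_to_pft <- c(
  F = "annual_crop",
  G = "grass",
  P = "grass",
  R = "grass",
  T = "annual_crop"
)

lookup_pft <- function(cls) {
  # Indexing a named vector by an unknown name yields NA,
  # so unmapped classes are easy to detect downstream.
  unname(class_to_pft[as.character(cls)])
}

pfts <- lookup_pft(c("R", "T", "X"))
unmapped <- is.na(pfts)
```

Returning `NA` rather than failing makes it possible to report every unmapped class for a site at once.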
  if (!file.exists(manifest_file)) {
    PEcAn.logger::logger.severe("Could not find manifest file: ", manifest_file)
  }
  inputs_runs <- read.csv(manifest_file)
- inputs_runs <- read.csv(manifest_file)
+ inputs_runs <- read.csv(manifest_file) |>
+   dplyr::filter(.data$site_id == settings$run$site$id) |>
+   # TODO the manifest should probably report these already...
+   dplyr::mutate(
+     ens_num = .data$run_id |>
+       stringr::str_extract("ENS-(\\d+)", group = 1) |>
+       as.integer()
+   )
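To illustrate what the suggestion above computes, here is a base R sketch of filtering a manifest to one site and pulling the ensemble number out of the run ID. The manifest rows and the `ENS-<digits>-<site>` layout of `run_id` are assumptions for illustration, and `extract_ens_num` is a hypothetical helper standing in for the `stringr::str_extract()` call.

```r
# Hypothetical manifest; the run_id layout is an assumption.
manifest <- data.frame(
  site_id = c("site-A", "site-A", "site-B"),
  run_id = c("ENS-00001-site-A", "ENS-00002-site-A", "ENS-00001-site-B"),
  stringsAsFactors = FALSE
)

extract_ens_num <- function(run_id) {
  # Capture the digits that follow "ENS-"; IDs without a match become NA.
  matches <- regmatches(run_id, regexec("ENS-([0-9]+)", run_id))
  vapply(
    matches,
    function(m) if (length(m) >= 2) as.integer(m[2]) else NA_integer_,
    integer(1)
  )
}

# Keep only the rows for the site being configured, then add the key column.
site_runs <- manifest[manifest$site_id == "site-A", ]
site_runs$ens_num <- extract_ens_num(site_runs$run_id)
```

Converting to integer drops the zero padding, so the result can be matched against row positions in the input design.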
  }
  inputs_runs <- read.csv(manifest_file)
  if (!is.null(input_design)) {
    inputs_runs <- cbind.data.frame(inputs_runs, input_design)
In a multi-site run the cbind winds up becoming a cross join. Need to align explicitly by ensemble number
-   inputs_runs <- cbind.data.frame(inputs_runs, input_design)
+   inputs_runs <- inputs_runs |>
+     dplyr::left_join(
+       input_design |> tibble::rowid_to_column("ens_num"),
+       by = "ens_num",
+       relationship = "many-to-one")
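A small sketch of the alignment problem, using base R `match()` as a stand-in for the `dplyr::left_join()` in the suggestion. The two-site manifest and the design columns (`met`, `param`) are made up for illustration; the only thing taken from the thread is the idea of keying the design by its row position as `ens_num`.

```r
# Hypothetical manifest: two sites, two ensemble members each.
inputs_runs <- data.frame(
  site_id = c("A", "A", "B", "B"),
  ens_num = c(1L, 2L, 1L, 2L)
)

# Hypothetical input design: one row per ensemble member, in ensemble order.
input_design <- data.frame(met = c(3L, 7L), param = c(10L, 20L))

# Make the implicit row order an explicit key, then look up each run's
# design row by ensemble number instead of relying on row position.
design_keyed <- cbind(ens_num = seq_len(nrow(input_design)), input_design)
idx <- match(inputs_runs$ens_num, design_keyed$ens_num)
aligned <- cbind(inputs_runs, design_keyed[idx, c("met", "param"), drop = FALSE])
rownames(aligned) <- NULL
```

Each run receives the design row for its own ensemble member regardless of how many sites the manifest contains, and the row count stays equal to the number of runs.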
  stopifnot(file.exists(events_json))

  crop_cycles <- PEcAn.data.land::events_to_crop_cycle_starts(events_json) |>
    dplyr::ungroup()
-   dplyr::ungroup()
+   dplyr::filter(.data$site_id == run_settings$run$site$id) |>
+   dplyr::ungroup()
      pft = crop2pft(.data$crop_code),
      segment_dir = file.path(segment_rootdir, sprintf("segment_%s", .data$segment_id))
    )
Seems useful to retain this information for diagnostics. I didn't think carefully about format or location, though -- counterproposals welcome
+ write.csv(segments, file = file.path(run_dir, "segments.csv"), row.names = FALSE)
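As a rough illustration of what the diagnostic file would hold, the sketch below builds a toy segments table, writes it the same way as the suggested line, and reads it back. The table contents and the temporary `run_dir` are hypothetical; only the `segment_dir` naming pattern and the `write.csv()` call come from the diff.

```r
# Hypothetical run directory under tempdir(), for illustration only.
run_dir <- file.path(tempdir(), "run-example")
dir.create(run_dir, recursive = TRUE, showWarnings = FALSE)
segment_rootdir <- file.path(run_dir, "segments")

# Toy segments table; the real one is derived from the events JSON.
segments <- data.frame(
  segment_id = 1:3,
  pft = c("annual_crop", "grass", "annual_crop"),
  stringsAsFactors = FALSE
)
segments$segment_dir <- file.path(
  segment_rootdir,
  sprintf("segment_%s", segments$segment_id)
)

# Persist the table next to the run so it can be inspected afterwards.
segments_file <- file.path(run_dir, "segments.csv")
write.csv(segments, file = segments_file, row.names = FALSE)
segments_check <- read.csv(segments_file, stringsAsFactors = FALSE)
```

With one row per segment, a failed run can be traced back to the segment directory and PFT it was configured with.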
  )

source("workflows/sipnet-restart-workflow/utils.R")
jobfiles <- write_segmented_configs.SIPNET(settings, sens_design$X)
Just leaving a breadcrumb hint for the next person
- jobfiles <- write_segmented_configs.SIPNET(settings, sens_design$X)
+ jobfiles <- write_segmented_configs.SIPNET(settings, sens_design$X)
+ # Note: If running a multi-site workflow, use:
+ # jobfiles <- papply(settings, \(s) write_segmented_configs.SIPNET(s, sens_design$X))
  settings
}

# TODO: We need a better, consistent implementation of this. However, this is
I thought this table was with the landiq code. If not, I think @infotroph, @sarahkanee, and I have each implemented one or more versions of this mapping.
Prototype of running SIPNET with event files that include changes in crops. A few implementation notes:
- This supports PEcAn ensemble inputs (including for events).
- event.json files for multiple ensemble members are stored in run$inputs$events$source. SIPNET-specific event.in files are stored in run$inputs$events$path (like other inputs). This is because the current functionality for finding segments to split the runs uses the JSON files. @infotroph's subset_paths function from an earlier draft has been adapted and modified to subset both path and source paths (as long as they have the same lengths).
- This circumvents runModule.start.model.runs and uses a direct execution loop instead. However, the output is PEcAn standard and follows PEcAn configuration conventions, so downstream analyses/workflows should work out of the box.
It's a bit hacky, but I think it does what it's supposed to and should be enough to unblock other CCMMF modeling tasks (@dlebauer).
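For readers new to the segmenting idea, here is a minimal sketch of how crop-cycle start dates could be turned into run segments. It is not the PR's implementation: `make_segments`, the example dates, and the rule that each segment ends the day before the next cycle starts are all assumptions. The overall run window (Jan 1 of the first planting year through Dec 31 of the last harvest year) follows the commit note in this thread.

```r
# Hypothetical crop-cycle start dates and last harvest date.
cycle_starts <- as.Date(c("2016-04-15", "2017-05-01", "2018-10-20"))
last_harvest <- as.Date("2019-06-30")

make_segments <- function(cycle_starts, last_harvest) {
  starts <- sort(unique(cycle_starts))
  # Run window: Jan 1 of the first planting year to Dec 31 of the last harvest year.
  run_start <- as.Date(format(starts[1], "%Y-01-01"))
  run_end <- as.Date(format(last_harvest, "%Y-12-31"))
  # Assumption: the first segment absorbs the lead-in before the first planting,
  # and every segment ends the day before the next crop cycle begins.
  seg_starts <- c(run_start, starts[-1])
  seg_ends <- c(starts[-1] - 1, run_end)
  data.frame(
    segment_id = seq_along(seg_starts),
    start_date = seg_starts,
    end_date = seg_ends
  )
}

segments <- make_segments(cycle_starts, last_harvest)
```

Each row would then correspond to one model run that restarts from the end state of the previous segment, with the PFT switched to match the new crop.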