Single stream processing #49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: klendathu2k wants to merge 82 commits into main from single-stream-rebase (base: main).
Changes from all commits (82 commits):
All commits are by klendathu2k unless noted otherwise:

- 3ba7196 Define a 'streamname' and 'streamfile' argument
- 77225b5 Parameterize the output filename
- b360159 Single streaming event builder workflow.
- ddd4b30 Run hit unpacked on the single stream outputs.
- bacc9eb Pull in the clustering step. Still need to rework the input query.
- 30ce362 Extend the match with stream name and stream file IF its defined in t…
- 9e22241 The output of each job will be different based on the stream name
- ade1181 For optimizing the lookup of existing outputs we will use the dsttype…
- 8799f60 ...
- fbbb536 ... cleanup ...
- a18d1e6 Runnumber, segment number and output file are not used in mapping the…
- f9bc6d6 ... cleanup ...
- d895928 A question about how to handle the naming of the production setup.
- bdbd4f9 ... cleanup ...
- 7d14a24 fetch_production_status ignored the dstname and should never reach th…
- 9301fc2 Even if we reached this code, the table already exists. So ...
- eefce5f We don't need to try to create the table here.
- 5d4c1fd The table exists or the system has not been created properly. The na…
- 6ef62a3 Should be better optimized this way.
- df96354 And we can rid ourselves of the unused arguments.
- cc02a85 ... cleanup ...
- e4b4e30 file_basename is not used here. Rather we are mapping the dstfile (o…
- 71b9266 ... simplify ...
- 2674679 ... cleanup ...
- 511d9e5 ... cleanup ...
- 526fdb8 Cleanup and impose the equal length condition on fc_result and outputs.
- 2fe8d39 Not sure why this is showing up as a difference, but commit it separa…
- 0223b20 Any job submitted with a stream name will be mapped to the same produ…
- 0e831fb dstname is no longer an argument to this function.
- 22355b7 When streamname is provided we expect to need condor substitution for…
- 68ad775 Remove unused query.
- bef830f We no longer have a fixed name... each match varies at the level of s…
- 4a92736 ... cleanup ...
- a33f04c Should be better optimized.
- ae3d708 When a streamname is specified, the triplet (run,segment,streamname) …
- d53cf81 Update of the production status depends on the stream name.
- 2558f3b First crack at a 'closeout' dataset.
- 49eddf1 Moving the database connections down into the update and fetch functi…
- 0e7fa90 At the risk of creating a subroutine, make all status updates follow …
- 63a51eb Ugly as an early return is... this is the logic if I want to log fail…
- b90b732 Define a 'streamname' and 'streamfile' argument
- 93a776a Parameterize the output filename
- 6b0112e Single streaming event builder workflow.
- 17f9a27 Run hit unpacked on the single stream outputs.
- 3d5dcc5 Pull in the clustering step. Still need to rework the input query.
- 202602b Extend the match with stream name and stream file IF its defined in t…
- a598bcf The output of each job will be different based on the stream name
- a023725 For optimizing the lookup of existing outputs we will use the dsttype…
- 875c18c ...
- f4c3da9 ... cleanup ...
- 0e6fc63 Runnumber, segment number and output file are not used in mapping the…
- 04590f3 ... cleanup ...
- a4ed824 A question about how to handle the naming of the production setup.
- 52115a2 ... cleanup ...
- 87818e1 fetch_production_status ignored the dstname and should never reach th…
- ea3d828 Even if we reached this code, the table already exists. So ...
- 325a457 We don't need to try to create the table here.
- d1777c6 The table exists or the system has not been created properly. The na…
- 487121f Should be better optimized this way.
- 3965df6 And we can rid ourselves of the unused arguments.
- 59540de ... cleanup ...
- 679d255 file_basename is not used here. Rather we are mapping the dstfile (o…
- 3a149c8 ... simplify ...
- 1b2947a ... cleanup ...
- 15d2fb7 ... cleanup ...
- 615c863 Cleanup and impose the equal length condition on fc_result and outputs.
- 2be62bc Not sure why this is showing up as a difference, but commit it separa…
- 86d0c3d Any job submitted with a stream name will be mapped to the same produ…
- d6c9504 dstname is no longer an argument to this function.
- 7e642a2 When streamname is provided we expect to need condor substitution for…
- c6d4442 Remove unused query.
- 4da8226 We no longer have a fixed name... each match varies at the level of s…
- 933b2fa ... cleanup ...
- 8272067 Should be better optimized.
- a9328ec When a streamname is specified, the triplet (run,segment,streamname) …
- 45f14dd Update of the production status depends on the stream name.
- 14a68b9 First crack at a 'closeout' dataset.
- 51f5bcc Merge branch 'single-stream-rebase' of https://github.com/klendathu2k…
- 0f86b3e Modify python path for alma linux
- d8c8e4f add in cups statistics
- 844f165 Saving changes on the local working branch that may not be present in… (pinkenburg)
- 25796cd Merge branch 'main' into single-stream-rebase
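Several commit messages above (e.g. "When a streamname is specified, the triplet (run,segment,streamname) …" and "Update of the production status depends on the stream name.") point at one design change: production-status rows become keyed by the (run, segment, streamname) triplet instead of (run, segment) alone. The sketch below only models that idea; the table name, columns, and use of sqlite3 are invented for illustration and are not the actual slurp schema or code.

```python
# Hypothetical sketch only: models the (run, segment, streamname) keying that
# the commit messages describe. Table/column names and the use of sqlite3 are
# invented; this is not the actual slurp code or schema.
import sqlite3

def make_db():
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE production_status (
            run        INTEGER NOT NULL,
            segment    INTEGER NOT NULL,
            streamname TEXT    NOT NULL DEFAULT '',
            status     TEXT    NOT NULL,
            UNIQUE (run, segment, streamname)
        )""")
    return db

def update_status(db, run, segment, status, streamname=""):
    # Upsert keyed on the full triplet: two streams of the same (run, segment)
    # keep independent production states.
    db.execute("""
        INSERT INTO production_status (run, segment, streamname, status)
        VALUES (?, ?, ?, ?)
        ON CONFLICT (run, segment, streamname) DO UPDATE SET status = excluded.status
        """, (run, segment, streamname, status))

def fetch_status(db, run, segment, streamname=""):
    row = db.execute(
        "SELECT status FROM production_status "
        "WHERE run=? AND segment=? AND streamname=?",
        (run, segment, streamname)).fetchone()
    return row[0] if row else None

db = make_db()
update_status(db, 53000, 0, "submitted", streamname="TPC0")
update_status(db, 53000, 0, "running",   streamname="TPC1")
update_status(db, 53000, 0, "finished",  streamname="TPC0")
```

Without the streamname column in the unique key, the two streams above would collide on (run, segment) and overwrite each other's state.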
production-rules/DST_STREAMING_EVENT_run2pp_ana435_2024p007.yaml (new file, 202 additions, 0 deletions):
```yaml
PHYS_DST_SINGLE_STREAMING_EVENT_run2pp:

  params:
    name: DST_STREAMING_EVENT_{streamname}_run2pp
    build: ana.435
    build_name: ana435
    dbtag: 2024p007
    logbase: $(name)_$(build)_$(tag)-$INT(run,{RUNFMT})-$INT(seg,{SEGFMT})
    outbase: $(name)_$(build)_$(tag)
    script: run_cosmics.sh
    payload: ./slurp-examples/sPHENIX/cosmics/
    mem: 20480MB
    neventsper: 1000
    comment: "---"
    rsync: "./slurp-examples/sPHENIX/cosmics/*,cups.py,bachi.py,odbc.ini"

  input:
    db: daqdb
    direct_path: /sphenix/lustre01/sphnxpro/{mode}/*/physics/
    query: |-
      with partialrun as (
        select 'daqdb/filelist' as source,
               runnumber,
               0 as segment,
               string_agg( distinct split_part(filename,'/',-1), ' ' ) as files,
               string_agg( distinct split_part(filename,'/',-1) || ':' || firstevent || ':' || lastevent, ' ' ) as fileranges
        from filelist
        where
          (
            (filename like '/bbox%/{streamfile}%-0000.evt' and lastevent>2 ) or
            (filename like '/bbox%/GL1_physics%-0000.evt' and lastevent>2 )
          )
          {run_condition}
        group by runnumber
        having
          every(transferred_to_sdcc) and
          max(lastevent)>1000 and
          sum( case when filename like '/bbox%/GL1_physics%' then 1 else 0 end )>0 and
          sum( case when filename like '/bbox%/{streamfile}%' then 1 else 0 end )>0
        order by runnumber
      ),
      fullrun as (
        select 'daqdb/filelist' as source,
               runnumber,
               0 as segment,
               string_agg( distinct split_part(filename,'/',-1), ' ' ) as files,
               string_agg( distinct split_part(filename,'/',-1) || ':' || firstevent || ':' || lastevent, ' ' ) as fileranges
        from filelist
        where
          (
            (filename like '/bbox%/{streamfile}%.evt' and lastevent>2 ) or
            (filename like '/bbox%/GL1_physics%.evt' and lastevent>2 )
          )
          {run_condition}
        group by runnumber
        having
          every(transferred_to_sdcc) and
          max(lastevent)>1000 and
          sum( case when filename like '/bbox%/GL1_physics%' then 1 else 0 end )>0 and
          sum( case when filename like '/bbox%/{streamfile}%' then 1 else 0 end )>0
        order by runnumber
      )
      select *, 'partial run' as runtype from partialrun where runnumber not in ( select runnumber from fullrun )
      union all
      select *, 'full run' as runtype from fullrun where true
      ;

  # TODO: Need to add error checking to make sure that outdir, logdir, etc... are
  # quoted properly. Else, this will cause problems with argument substitution.
  filesystem:
    outdir:  "/sphenix/lustre01/sphnxpro/physics/slurp/streaming/physics/$(build)_$(tag)/run_$(rungroup)"
    logdir:  "file:///sphenix/data/data02/sphnxpro/streaminglogs/$(build)_$(tag)/run_$(rungroup)"
    histdir: "/sphenix/data/data02/sphnxpro/streamhist/$(build)_$(tag)/run_$(rungroup)"
    condor:  "/tmp/testlogs/$(build)_$(tag)/run_$(rungroup)"

  # Again I note the need to ensure that the arguments are properly specified
  # given the definition of the payload script.
  job:
    executable:            "{payload}/run_cosmics.sh"
    arguments:             "$(nevents) {outbase} {logbase} $(run) $(seg) {outdir} $(build) $(tag) $(inputs) $(ranges) {neventsper} {logdir} {comment} {histdir} {PWD} {rsync}"
    output_destination:    '{logdir}'
    log:                   '{condor}/{logbase}.condor'
    accounting_group:      "group_sphenix.mdc2"
    accounting_group_user: "sphnxpro"
    priority:              '4000'
    request_xferslots:     '0'

#_____________________________________________________________________________________________________________________________

PHYS_DST_SINGLE_TRKR_HIT_SET_physics_2024p007:
  # DST_EVENT works from a pre-built set of run lists.
  params:
    name: DST_TRKR_HIT_{streamname}_run2pp
    build: new
    build_name: new
    dbtag: 2024p007
    logbase: $(name)_$(build)_$(tag)-$INT(run,{RUNFMT})-$INT(seg,{SEGFMT})
    outbase: $(name)_$(build)_$(tag)
    script: run.sh
    payload: ./slurp-examples/sPHENIX/TrackingProduction/
    mem: 2048MB
    rsync: "./slurp-examples/sPHENIX/TrackingProduction/*,cups.py,bachi.py,odbc.ini"

  input:
    db: fc
    query: |-
      select 'filecatalog/datasets' as source,
             runnumber,
             segment,
             filename as files,
             'X' as fileranges
      from datasets
      where
        filename like 'DST_STREAMING_EVENT_{streamname}_run2pp_ana435_2024p007%'
        {run_condition}
        and runnumber>=49700
      order by runnumber
      {limit_condition}
      ;

  filesystem:
    outdir:  "/sphenix/lustre01/sphnxpro/physics/slurp/tracking/$(build)_$(tag)/run_$(rungroup)"
    logdir:  "file:///sphenix/data/data02/sphnxpro/trackinglogs/$(build)_$(tag)/run_$(rungroup)"
    histdir: "/sphenix/data/data02/sphnxpro/hitsethist/$(build)_$(tag)/run_$(rungroup)"
    condor:  "/tmp/trkrogs/$(build)_$(tag)/run_$(rungroup)"

  job:
    executable:            "{payload}/run.sh"
    arguments:             "$(nevents) {outbase} {logbase} $(run) $(seg) {outdir} $(build) $(tag) $(inputs) $(ranges) {logdir} {histdir} {PWD} {rsync}"
    output_destination:    '{logdir}'
    log:                   '{condor}/{logbase}.condor'
    accounting_group:      "group_sphenix.mdc2"
    accounting_group_user: "sphnxpro"
    priority:              '3800'

#_____________________________________________________________________________________________________________________________

DST_TRKR_CLUSTER_SET_run2pp_2024p007:
  # DST_EVENT works from a pre-built set of run lists.
  params:
    name: DST_TRKR_CLUSTER_run2pp
    build: new
    build_name: new
    dbtag: 2024p007
    logbase: $(name)_$(build)_$(tag)-$INT(run,{RUNFMT})-$INT(seg,{SEGFMT})
    outbase: $(name)_$(build)_$(tag)
    script: run_job0.sh
    payload: ./slurp-examples/sPHENIX/TrackingProduction/
    mem: 2048MB
    nevents: 0
    rsync: "./slurp-examples/sPHENIX/TrackingProduction/*,cups.py,bachi.py,odbc.ini"

  input:
    db: fc
    query: |-
      select 'filecatalog/datasets' as source,
             runnumber,
             segment,
             filename as files,
             'X' as fileranges
      from datasets
      where
        filename like 'DST_TRKR_HIT_run2pp_new_2024p007%'
        {run_condition}
        and runnumber>=49700
      order by runnumber
      {limit_condition}
      ;

  filesystem:
    outdir:  "/sphenix/lustre01/sphnxpro/physics/slurp/tracking/$(build)_$(tag)/run_$(rungroup)"
    logdir:  "file:///sphenix/data/data02/sphnxpro/trackinglogs/$(build)_$(tag)/run_$(rungroup)"
    histdir: "/sphenix/data/data02/sphnxpro/clusterhist/$(build)_$(tag)/run_$(rungroup)"
    condor:  "/tmp/trkrlogs/$(build)_$(tag)/run_$(rungroup)"

  job:
    executable:            "{payload}/run_job0.sh"
    arguments:             "{nevents} {outbase} {logbase} $(run) $(seg) {outdir} $(build) $(tag) $(inputs) $(ranges) {logdir} {histdir} {PWD} {rsync}"
    output_destination:    '{logdir}'
    log:                   '{condor}/{logbase}.condor'
    accounting_group:      "group_sphenix.mdc2"
    accounting_group_user: "sphnxpro"
    priority:              '3800'
```
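The rule file leaves fields such as {streamname}, {streamfile}, and {run_condition} as template placeholders. One plausible expansion step, if slurp uses str.format-style substitution (an assumption; the stream and run values below are invented for illustration):

```python
# Hypothetical placeholder expansion for the rule above. Only the templates are
# taken from the YAML file; the substitution values are invented.
name_tpl  = "DST_STREAMING_EVENT_{streamname}_run2pp"
query_tpl = ("filename like '/bbox%/{streamfile}%-0000.evt' "
             "and lastevent>2 {run_condition}")

subs = {
    "streamname": "TPC00",                    # hypothetical stream name
    "streamfile": "TPC_ebdc00",               # hypothetical file prefix
    "run_condition": "and runnumber>=49700",  # run cut used elsewhere in the file
}

# str.format only touches {}-placeholders, so $(...)-style condor macros in
# other fields would pass through untouched.
name  = name_tpl.format_map(subs)
query = query_tpl.format_map(subs)
```

This also illustrates why each stream gets a distinct output name: the stream name is baked into the dataset name before submission.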
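The partialrun/fullrun CTE pair in the streaming-event query reduces to a simple set rule: a run is reported as a 'partial run' only if it does not also qualify as a 'full run', and full runs are always kept. A pure-Python model of that selection (run numbers are made up):

```python
# Models the final SELECT of the streaming-event query:
#   partialrun WHERE runnumber NOT IN (SELECT runnumber FROM fullrun)
#   UNION ALL fullrun
# using plain sets instead of SQL.
def classify_runs(partial_runs: set, full_runs: set) -> list:
    out = [(run, "partial run") for run in sorted(partial_runs - full_runs)]
    out += [(run, "full run") for run in sorted(full_runs)]
    return out

runs = classify_runs({49701, 49702, 49703}, {49702})
```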
Review comment: This looks like a zombie method: removed in previous PRs, but this PR is bringing it back from the dead.