[DestinE] Communication between two or more different workflows
DestinE phase 2
Cc: @mcastril
This issue collects ideas and implementations for communication between different workflows.
- Dependencies? Complex (like a workflow graph) or simple (run workflow B when workflow A emits the signal)?
- Which form should the launch command take?
    autosubmit launch launch_suite.yml
  or
    autosubmit launch a001,a002,a003,a004
- Signal should be file-based? How do you generate the signal?
- How to set and read configured signals
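On the launch-command question above, both forms could be accepted by one entry point. A minimal sketch, assuming the argument is either a path ending in .yml/.yaml or a comma-separated list of expids; parse_launch_target and the expid pattern are hypothetical and not part of Autosubmit:

```python
import re
from pathlib import Path

# Assumption: expids look like the ones in this issue (a000, a001, ...).
EXPID_RE = re.compile(r"[a-z][a-z0-9]{3}")


def parse_launch_target(target: str) -> dict:
    """Classify the argument of the proposed `autosubmit launch` command."""
    # A YAML suffix means the user passed a suite file.
    if target.lower().endswith((".yml", ".yaml")):
        return {"kind": "suite_file", "path": Path(target)}
    # Otherwise treat it as a comma-separated list of expids.
    expids = [item.strip() for item in target.split(",")]
    invalid = [e for e in expids if not EXPID_RE.fullmatch(e)]
    if invalid:
        raise ValueError(f"Not valid expids: {invalid}")
    return {"kind": "expid_list", "expids": expids}


suite = parse_launch_target("launch_suite.yml")
listed = parse_launch_target("a001,a002,a003,a004")
```

With the list form there is no place to declare dependencies, so it would probably only fit the simple, signal-based approach.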
Simple
a000
JOBS:
  SECTION_A:
    FILE:
    ...
    SUITE:
      METHOD: "ON_COMPLETED"
Setting the signal would work as follows:
- Similar to the "checkpoint" function, we add the function generate_workflow_signal to all cmds
- Users add %WORKFLOW_SIGNAL% to the templates they want and code the logic themselves
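If the signal ends up being file-based, the two bullets above could look roughly like this. It is only a sketch: the marker-file naming (expid_section_status) and the signal_received helper are assumptions, and the real generate_workflow_signal would most likely live in the shell header next to the checkpoint function rather than in Python.

```python
import tempfile
from pathlib import Path


def generate_workflow_signal(signal_dir, expid, section, status="COMPLETED"):
    """Create an empty marker file, e.g. a000_SECTION_A_COMPLETED (assumed naming)."""
    directory = Path(signal_dir)
    directory.mkdir(parents=True, exist_ok=True)
    marker = directory / f"{expid}_{section}_{status}"
    marker.touch()
    return marker


def signal_received(signal_dir, expid, section, status="COMPLETED"):
    """Hypothetical reader side: workflow B checks whether A left the marker."""
    return (Path(signal_dir) / f"{expid}_{section}_{status}").is_file()


# Usage: no signal before the producer runs, one signal afterwards.
with tempfile.TemporaryDirectory() as tmp:
    before = signal_received(tmp, "a000", "SECTION_A")
    generate_workflow_signal(tmp, "a000", "SECTION_A")
    after = signal_received(tmp, "a000", "SECTION_A")
```

A marker file keeps the reader side trivial (a stat call), but both workflows need access to the same filesystem.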
Complex
launch_suite.yml is located somewhere outside the experiments.
Using ASconfigparser, read as_conf.experiment_data["JOBS"] and add it as as_conf.experiment_data["JOBS_%EXPID%"]; afterwards, read launch_suite.yml
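The namespacing step could be sketched with plain dictionaries as below. This does not use the real ASconfigparser API: build_suite_view is a hypothetical helper, experiment_data is assumed to be a nested dict per expid, and "a.sh" is a placeholder file name.

```python
import copy


def build_suite_view(experiments: dict, suite: dict) -> dict:
    """Copy each experiment's JOBS to JOBS_<EXPID>, then overlay the suite file."""
    view = {}
    for expid, experiment_data in experiments.items():
        # Deep copy so the per-experiment configuration is left untouched.
        view[f"JOBS_{expid.upper()}"] = copy.deepcopy(experiment_data.get("JOBS", {}))
    for key, value in suite.items():
        # An empty block in the suite file loads as None; treat it as {}.
        view.setdefault(key, {}).update(copy.deepcopy(value or {}))
    return view


experiments = {"a000": {"JOBS": {"SECTION_A": {"FILE": "a.sh"}}}}
suite = {"JOBS_A000": {"DEPENDENCIES": {"jobs_a001.section_a": None}}}
view = build_suite_view(experiments, suite)
```

Here view["JOBS_A000"] holds both the original SECTION_A and the cross-workflow DEPENDENCIES read from the suite file.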
JOBS_A000:
  DEPENDENCIES:
    jobs_a000.section_a:
      job_names: (list)...
      # or
      DATE: ... [n:m], any, all
      MEMBER: ...[n:m], any, all
      CHUNK: ...[n:m], any, all
      SPLIT: ...[n:m], any, all
      FROM_STATUS: "COMPLETED" or "RUNNING"
    jobs_a000.section_b: # equivalent to setting everything to ALL
    jobs_a001.section_a:
    jobs_a002.section_a:
JOBS_A001:
  ...
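To turn the structure above into workflow-level dependencies, the jobs_<expid>.<section> keys have to be split. A minimal sketch, assuming the suite file has already been loaded into a dict; parse_dependency_key and workflow_edges are hypothetical names, and the DATE/MEMBER/CHUNK/SPLIT selectors are ignored here:

```python
def parse_dependency_key(key: str) -> tuple:
    """Split 'jobs_a000.section_a' into ('a000', 'SECTION_A')."""
    prefix, _, section = key.partition(".")
    if not prefix.lower().startswith("jobs_") or not section:
        raise ValueError(f"Unexpected dependency key: {key!r}")
    return prefix[len("jobs_"):].lower(), section.upper()


def workflow_edges(suite: dict) -> dict:
    """Map each expid to the set of other expids it depends on."""
    edges = {}
    for block, content in suite.items():
        if not block.upper().startswith("JOBS_"):
            continue
        expid = block[len("JOBS_"):].lower()
        dependencies = (content or {}).get("DEPENDENCIES") or {}
        # Dependencies on the workflow's own sections are not cross-workflow edges.
        edges[expid] = {parse_dependency_key(k)[0] for k in dependencies} - {expid}
    return edges


suite = {
    "JOBS_A000": {
        "DEPENDENCIES": {
            "jobs_a000.section_a": {"FROM_STATUS": "COMPLETED"},
            "jobs_a000.section_b": None,
            "jobs_a001.section_a": None,
            "jobs_a002.section_a": None,
        }
    },
    "JOBS_A001": None,
}
edges = workflow_edges(suite)
```

For this example edges says that a000 waits on a001 and a002, while a001 has no cross-workflow dependencies.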
autosubmit launch needs:
- A way of detecting which workflows can be created and run (by reading the YAML)
- A way of setting the dependencies between jobs of different workflows (by reading the YAML)
- A way of detecting that some workflow has failed jobs.
  - What to do? Stop all related experiments?
- A way of stopping the launch and resuming it from the previous status.
- A way of detecting finished workflows so they don't run again.
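Most of the points above reduce to keeping one status per workflow and re-evaluating it on every pass. A rough sketch under these assumptions: statuses are plain strings (PENDING, RUNNING, COMPLETED, FAILED), edges maps each expid to the expids it depends on, and the state is persisted as JSON so a stopped launch can be resumed. All function names and the launch_state.json file are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def ready_workflows(edges: dict, status: dict) -> list:
    """Pending workflows whose dependencies have all completed."""
    return sorted(
        expid for expid, deps in edges.items()
        if status.get(expid, "PENDING") == "PENDING"
        and all(status.get(dep, "PENDING") == "COMPLETED" for dep in deps)
    )


def blocked_by_failure(edges: dict, status: dict) -> set:
    """Workflows that depend, directly or transitively, on a failed one."""
    blocked = set()
    changed = True
    while changed:
        changed = False
        for expid, deps in edges.items():
            if expid not in blocked and any(
                status.get(dep) == "FAILED" or dep in blocked for dep in deps
            ):
                blocked.add(expid)
                changed = True
    return blocked


def save_state(path: Path, status: dict) -> None:
    path.write_text(json.dumps(status, indent=2))


def load_state(path: Path) -> dict:
    return json.loads(path.read_text()) if path.exists() else {}


edges = {"a000": {"a001", "a002"}, "a001": set(), "a002": set(), "a003": {"a000"}}
ready = ready_workflows(edges, {"a001": "COMPLETED"})
blocked = blocked_by_failure(edges, {"a001": "COMPLETED", "a002": "FAILED"})

# Resuming: the status written before stopping is what the next launch reads.
with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "launch_state.json"
    save_state(state_file, {"a001": "COMPLETED"})
    resumed = load_state(state_file)
```

Finished workflows are never returned by ready_workflows because only PENDING ones qualify, and blocked_by_failure gives the set of related experiments that could be stopped when something fails.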
I am not sure if I missed something.