Closes #929
Add ro-crate-py as a dependency, and produce an initial RO-Crate ZIP file containing the metadata and the original workflow configuration.
For more about RO-Crate: https://www.researchobject.org/ro-crate/
To test it:
- Install Autosubmit using this branch (`rocrate`), with `pip install -e .` (in a venv? conda/mamba env?) https://autosubmit.readthedocs.io/en/master/installation/index.html
- Create an experiment for mHM following the installation instructions from the README here: https://github.com/kinow/auto-mhm-test-domains/blob/fb57531b52b1e2a2dbd6a1c53a64c954fcca4e5d/README.md (use the branch from this PR instead of main: https://github.com/kinow/auto-mhm-test-domains/pull/12/files)
- The command above from the README (i.e. `autosubmit expid ....`) will give you the experiment ID (hereafter `<expid>`)
- You may also need to build the container used by the workflow, see the README https://github.com/kinow/auto-mhm-test-domains/blob/fb57531b52b1e2a2dbd6a1c53a64c954fcca4e5d/README.md
- Prepare the Autosubmit experiment (it will git-clone the remote repo) with `autosubmit create <expid>`
- Run the workflow with `autosubmit run <expid>`
- Now create the RO-Crate: `autosubmit -lc DEBUG archive --rocrate <expid>`
- Unzip and check the JSON to inspect the metadata: `unzip -p ~/autosubmit/<expid>/rocrate.zip ro-crate-metadata.json > ~/autosubmit//ro-crate-metadata.json`
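As an alternative to unzipping by hand, the metadata can be read straight from the archive. The following is a minimal sketch using only the Python standard library. The function names `load_crate_metadata` and `count_entity_types` are hypothetical helpers for illustration and are not part of Autosubmit or ro-crate-py; the only assumption about the archive is that `ro-crate-metadata.json` sits at its root, as in the `unzip -p` command above.

```python
import json
import zipfile
from collections import Counter


def load_crate_metadata(zip_path):
    """Read and parse ro-crate-metadata.json directly from the ZIP archive."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open("ro-crate-metadata.json") as handle:
            return json.load(handle)


def count_entity_types(metadata):
    """Count how many entities of each @type appear in the @graph."""
    counts = Counter()
    for entity in metadata.get("@graph", []):
        types = entity.get("@type", [])
        # In JSON-LD, @type may be a single string or a list of strings.
        if isinstance(types, str):
            types = [types]
        counts.update(types)
    return counts
```

Calling `count_entity_types(load_crate_metadata(path))` on the generated `rocrate.zip` gives a quick overview of what ended up in the crate, which may help when reviewing the items in the progress list.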
Progress:
- Use ro-crate-py Python library
- Unify the YAML configuration of Autosubmit and produce a single `workflow.yml` (prospective provenance)
- Produce an RO-Crate zip file with the JSON metadata, and the `workflow.yml` (correctly linked)
- Add authors, license, keywords (from an external YAML like COMPSs)
- Fetch the exp description from the DB
- Add logs as traces (retrospective provenance, all other items added below)
- Add plot if present
- Add databases
- Add pickle files
- Add inputs and outputs (use a convention, see https://github.com/ResearchObject/ro-crate-py/issues/148#issuecomment-1460482169)
- Have a look if we can add system usage (energy, memory, nodes, etc.) as per this comment https://github.com/ResearchObject/workflow-run-crate/issues/10#issuecomment-1456168053 (might be easier to do that in a follow-up, as that discussion is not closed yet)
- Find a good public workflow to produce an RO-Crate and validate with the RO-Crate community
- Validate the RO-Crate
- Write tests
- Write docs
- Test archive & unarchive
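For the "Validate the RO-Crate" item, a quick structural sanity check could look like the sketch below. It does not replace a real validator or feedback from the RO-Crate community. It only checks the basics described in the RO-Crate 1.1 specification: a `@context`, a metadata descriptor with `@id` `ro-crate-metadata.json`, and a root data entity of type `Dataset` referenced by the descriptor's `about`. The function name `check_crate_structure` is hypothetical, and its input is assumed to be the already-parsed content of `ro-crate-metadata.json`.

```python
def check_crate_structure(metadata):
    """Return a list of problems found; an empty list means the basic checks pass."""
    problems = []
    if "@context" not in metadata:
        problems.append("missing @context")

    # Index the entities of the flattened @graph by their @id.
    entities = {
        entity.get("@id"): entity
        for entity in metadata.get("@graph", [])
        if isinstance(entity, dict)
    }

    descriptor = entities.get("ro-crate-metadata.json")
    if descriptor is None:
        problems.append("missing metadata descriptor 'ro-crate-metadata.json'")
        return problems

    # The descriptor's 'about' property points at the root data entity.
    about = descriptor.get("about")
    root_id = about.get("@id") if isinstance(about, dict) else None
    if root_id is None:
        problems.append("metadata descriptor has no 'about' reference")
        return problems

    root = entities.get(root_id)
    if root is None:
        problems.append(f"root data entity '{root_id}' not found in @graph")
        return problems

    types = root.get("@type", [])
    if isinstance(types, str):
        types = [types]
    if "Dataset" not in types:
        problems.append("root data entity is not of type 'Dataset'")
    return problems
```

A check like this could serve as a cheap first assertion in the tests, before any community-level validation of the crate contents.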