Earth Sciences / autosubmit / Merge requests / !317

Produce RO-Crate archive with Autosubmit

Merged. Bruno de Paula Kinoshita requested to merge rocrate into master on Feb 16, 2023.

Closes #929 (closed)

Add ro-crate-py as dependency, and produce an initial RO-Crate ZIP file with the metadata and original workflow configuration.

For more about RO-Crate: https://www.researchobject.org/ro-crate/
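
To make the target layout concrete, here is a minimal sketch of what an RO-Crate ZIP containing metadata plus a workflow configuration file looks like. It deliberately uses only the Python standard library rather than ro-crate-py, so it is not the code in this merge request; the function names `build_metadata` and `write_crate` are hypothetical, and the entity properties are the bare minimum suggested by the RO-Crate 1.1 specification.

```python
import json
import zipfile
from pathlib import Path


def build_metadata(workflow_name: str) -> dict:
    """Return a minimal RO-Crate 1.1 metadata document for one workflow file."""
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {
                # Metadata descriptor: points at the root dataset.
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {
                # Root dataset: links the workflow file as its main entity.
                "@id": "./",
                "@type": "Dataset",
                "hasPart": [{"@id": workflow_name}],
                "mainEntity": {"@id": workflow_name},
            },
            {
                # The workflow configuration itself (prospective provenance).
                "@id": workflow_name,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": workflow_name,
            },
        ],
    }


def write_crate(workflow_file: Path, zip_path: Path) -> None:
    """Write the metadata JSON and the workflow file into a single ZIP."""
    metadata = build_metadata(workflow_file.name)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("ro-crate-metadata.json", json.dumps(metadata, indent=2))
        zf.write(workflow_file, arcname=workflow_file.name)
```

Calling `write_crate(Path("workflow.yml"), Path("rocrate.zip"))` would produce a two-file archive. The real implementation is expected to add many more entities (authors, license, logs, and so on), as listed under Progress.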

To test it:

  1. Install Autosubmit from this branch (rocrate) in editable mode, ideally inside a venv or a conda/mamba environment: pip install -e . (see https://autosubmit.readthedocs.io/en/master/installation/index.html)
  2. Create an experiment for mHM following installation instructions from the README here: https://github.com/kinow/auto-mhm-test-domains/blob/fb57531b52b1e2a2dbd6a1c53a64c954fcca4e5d/README.md (Use the branch from this PR instead of main: https://github.com/kinow/auto-mhm-test-domains/pull/12/files)
  3. The autosubmit expid .... command from the README prints the experiment ID (hereafter <expid>)
  4. You may also need to build the container used by the workflow, see README https://github.com/kinow/auto-mhm-test-domains/blob/fb57531b52b1e2a2dbd6a1c53a64c954fcca4e5d/README.md
  5. Prepare the Autosubmit experiment (this git-clones the remote repo): autosubmit create <expid>
  6. Run the workflow with autosubmit run <expid>
  7. Now create the RO-Crate: autosubmit -lc DEBUG archive --rocrate <expid>
  8. Extract the JSON from the ZIP to inspect the metadata: unzip -p ~/autosubmit/<expid>/rocrate.zip ro-crate-metadata.json > ~/autosubmit//ro-crate-metadata.json
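
As an alternative to the unzip command in the last step, the metadata can be inspected from Python without extracting anything. This is a small sketch using only the standard library; the helper name `read_crate_entities` is hypothetical and the path to the archive is passed on the command line.

```python
import json
import sys
import zipfile


def read_crate_entities(zip_path):
    """Return {entity @id: @type} read from ro-crate-metadata.json in the ZIP."""
    with zipfile.ZipFile(zip_path) as zf:
        metadata = json.loads(zf.read("ro-crate-metadata.json"))
    # Every RO-Crate entity lives in the flat "@graph" list.
    return {e["@id"]: e.get("@type") for e in metadata.get("@graph", [])}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python inspect_crate.py /path/to/rocrate.zip
    for entity_id, entity_type in read_crate_entities(sys.argv[1]).items():
        print(f"{entity_id}: {entity_type}")
```

Listing the `@id` and `@type` of each entity is a quick way to check that the workflow file, logs, and other items were linked into the crate.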

Progress:

  • Use ro-crate-py Python library
  • Unify the YAML configuration of Autosubmit and produce a single workflow.yml (prospective provenance)
  • Produce an RO-Crate zip file with the JSON metadata, and the workflow.yml (correctly linked)
    • Add authors, license, keywords (from an external YAML like COMPSs)
    • Fetch the exp description from the DB
    • Add logs as traces (retrospective provenance, all other items added below)
    • Add plot if present
    • Add databases
    • Add pickle files
    • Add inputs and outputs (use a convention, see https://github.com/ResearchObject/ro-crate-py/issues/148#issuecomment-1460482169)
    • Check whether we can add system usage (energy, memory, nodes, etc.) as per this comment https://github.com/ResearchObject/workflow-run-crate/issues/10#issuecomment-1456168053 (might be easier to do in a follow-up, as that discussion is not closed yet)
  • Find a good public workflow to produce an RO-Crate and validate with the RO-Crate community
  • Validate the RO-Crate
  • Write tests
  • Write docs
  • Test archive & unarchive
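
The "unify the YAML configuration" item above can be illustrated with a recursive dictionary merge. This is only a sketch under assumptions: the merge request does not describe Autosubmit's actual merge rules, so the "later file wins, nested mappings are merged" policy, the function names `deep_merge` and `unify`, and the example keys are all hypothetical. Loading the YAML files and dumping the result to workflow.yml (for example with a YAML library) is left out.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Merge override into a copy of base; nested dicts are merged recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and type mismatches: the later value wins.
            merged[key] = deepcopy(value)
    return merged


def unify(configs):
    """Fold a sequence of already-parsed config dicts into a single dict."""
    unified = {}
    for config in configs:
        unified = deep_merge(unified, config)
    return unified


# Hypothetical example: two parsed configuration files for one experiment.
jobs = {"JOBS": {"SIM": {"WALLCLOCK": "00:10"}}}
platforms = {"JOBS": {"SIM": {"PLATFORM": "local"}}, "DEFAULT": {"EXPID": "a000"}}
print(unify([jobs, platforms]))
# {'JOBS': {'SIM': {'WALLCLOCK': '00:10', 'PLATFORM': 'local'}}, 'DEFAULT': {'EXPID': 'a000'}}
```

Copying values with `deepcopy` keeps the inputs untouched, which matters if the original per-file configurations are also stored in the crate.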
Edited Aug 22, 2023 by Bruno de Paula Kinoshita
Source branch: rocrate