Closes #939
This merge request demonstrates the idea of automatically generating documentation for experiments. The idea is not new, as you can achieve almost the same with Cylc and ecFlow: Cylc has DOC and TITLE too, and ecFlow uses MANUAL.
I am extending the scope of the issue (created 11 months ago) to add something that was discussed more recently, about the documentation for models, metadata, ...
The idea is documented in the docgen/__init__.py docstrings. Pasting it here:
Goes over experiment configuration, gathering each job's
``TITLE`` and ``DOC`` to create the documentation for the
jobs.
Goes over the ``METADATA`` configuration entry, looking
for sections with ``key=value`` pairs, or links to
configuration keys, e.g.:

.. code-block:: yaml

    METADATA:
      MODEL:
        # This is a static value that will produce name = IFS
        - name: name
        # This is a dynamic value that will produce resolution = abc...
        - resolution: %RUN.IFS.RESOLUTION%
        # This resolves the key and value automatically and will produce RUN.IFS.INIPATH=/path/path/...
        - %RUN.IFS.INIPATH%
        # This is the extended format
        - name: source
          value: https://git@...
          documentation: |
            The source code is hosted at the private repository...
      DATA:
        - PROVENANCE: ...

Each property of the ``METADATA`` YAML object is rendered as
a separate section. These sections, in YAML, are arrays that
contain metadata references in each line. Extra documentation
about each key can be entered using the extended format.
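
To make the three entry formats concrete, here is a minimal sketch of how each item in a ``METADATA`` section could be normalized into one record. It is not the code in this merge request: ``normalize_entry`` and the record fields ``key``, ``value`` and ``documentation`` are hypothetical names, and the entries are given as already-parsed Python objects so that no YAML library is assumed.

```python
from typing import Any


def normalize_entry(entry: Any) -> dict:
    """Turn one METADATA list item into a key/value/documentation record.

    Hypothetical helper for illustration only.
    """
    if isinstance(entry, str):
        # Bare reference such as "%RUN.IFS.INIPATH%": the key is the
        # referenced configuration path, the value is still the placeholder.
        return {"key": entry.strip("%"), "value": entry, "documentation": None}
    if isinstance(entry, dict):
        if "name" in entry and "value" in entry:
            # Extended format: name, value and optional documentation.
            return {
                "key": entry["name"],
                "value": entry["value"],
                "documentation": entry.get("documentation"),
            }
        if len(entry) == 1:
            # Plain key/value pair, static or containing a placeholder.
            key, value = next(iter(entry.items()))
            return {"key": key, "value": value, "documentation": None}
    raise ValueError(f"Unsupported METADATA entry: {entry!r}")


# The MODEL section from the example above, as parsed Python objects.
metadata = {
    "MODEL": [
        {"name": "name"},
        {"resolution": "%RUN.IFS.RESOLUTION%"},
        "%RUN.IFS.INIPATH%",
        {
            "name": "source",
            "value": "https://git@...",
            "documentation": "The source code is hosted at the private repository...",
        },
    ],
}

# One rendered section per METADATA property.
sections = {
    name: [normalize_entry(entry) for entry in entries]
    for name, entries in metadata.items()
}
```

Normalizing first keeps the rendering step simple: every section becomes a flat list of records, whichever format the maintainer used. Placeholders are left unresolved here on purpose, so resolution can happen in a separate step.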
The gist is that the maintainers of the experiment can select which configuration values from the workflow must be exported as workflow metadata (in the METADATA YAML object/entry above).
The new command in Autosubmit goes over the experiment configuration, resolves variables like %EXPID% and %RUN.IFS.RESOLUTION.ETC%, and produces the metadata list with the current values used in the experiment.
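
As a rough illustration of the resolution step, the sketch below substitutes ``%A.B.C%`` placeholders by walking a nested configuration dictionary. It is an assumption-laden stand-in, not Autosubmit's own substitution logic: ``lookup``, ``resolve``, the placeholder pattern and the sample configuration values are all hypothetical.

```python
import re
from typing import Any, Mapping

# Assumed placeholder syntax: %SECTION.SUBSECTION.KEY%
PLACEHOLDER = re.compile(r"%([A-Za-z0-9_.]+)%")


def lookup(config: Mapping[str, Any], dotted_key: str) -> Any:
    """Follow a dotted key such as RUN.IFS.RESOLUTION through nested dicts."""
    node: Any = config
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(text: str, config: Mapping[str, Any]) -> str:
    """Replace every known placeholder in text with its current value."""

    def substitute(match: "re.Match[str]") -> str:
        try:
            return str(lookup(config, match.group(1)))
        except (KeyError, TypeError):
            # Unknown keys are left untouched so they stay visible in the output.
            return match.group(0)

    return PLACEHOLDER.sub(substitute, text)


# Made-up experiment configuration, for illustration only.
config = {"EXPID": "a000", "RUN": {"IFS": {"RESOLUTION": "T255L91"}}}

resolved_resolution = resolve("%RUN.IFS.RESOLUTION%", config)
resolved_sentence = resolve("Experiment %EXPID%", config)
unresolved = resolve("%RUN.IFS.MISSING%", config)
```

Leaving unknown placeholders in place, rather than raising, is a design choice made for this sketch: a generated document that still shows ``%RUN.IFS.MISSING%`` points directly at the configuration key that needs fixing.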
I think this is key here, to have the documentation generated with the values used in the workflow -- i.e. the workflow run configuration is the single source of truth, to avoid having a PDF or Wiki page manually written that might be out of sync with what's actually happening in the workflow.