Earth Sciences / autosubmit / Merge requests / !500

Gl 1421+1423 splits

Merged dbeltran requested to merge gl-1421-Splits-chunksize into master Oct 23, 2024

Hello @bdepaula, FYI @froura

This merge request fixes issues #1421 (closed) and #1423 (closed).

General:

  • Added a recipes folder to the tests. These are complete yml files, kept in case we need them later and/or to write how-to recipe docs.

For #1421 (closed):

I'm not sure what the issue was. I believe I also hit it in the tests I ran that day, but it no longer happens with the rebased master or with what I tested today.

So basically, if you uncomment SPLITSIZE: 1 (marked with the arrow) in this configuration:

EXPERIMENT:
  DATELIST: 20221101
  MEMBERS: fc0
  CHUNKSIZEUNIT: month
  #SPLITSIZEUNIT: day
  CHUNKSIZE: 2
  NUMCHUNKS: 1
  #SPLITSIZE: 1 # <----
  SPLITPOLICY: flexible
  CHUNKINI: ''
  CALENDAR: standard
PROJECT:
  PROJECT_TYPE: 'none'
PLATFORMS:
  debug:
    type: ps
    host: localhost
JOBS:
  APP:
    SCRIPT: |
        echo "Chunk start date: %CHUNK_START_DATE%"
        echo "Chunk end date: %CHUNK_END_DATE%"
        echo "Split start date: %SPLIT_START_DATE%"
        echo "Split end date: %SPLIT_END_DATE%"
        echo "Split size: %SPLIT_SIZE%"
        echo "Split number: %SPLIT_NUMBER%"
        echo "Chunk size: %CHUNK_SIZE%"
    DEPENDENCIES:
      APP:
        SPLITS_FROM:
          ALL:
            SPLITS_TO: previous
    running: chunk
    SPLITS: auto
  DN:
    SCRIPT: |
      echo "Chunk start date: %CHUNK_START_DATE%"
      echo "Chunk end date: %CHUNK_END_DATE%"
      echo "Split start date: %SPLIT_START_DATE%"
      echo "Split end date: %SPLIT_END_DATE%"
      echo "Split size: %SPLIT_SIZE%"
      echo "Split number: %SPLIT_NUMBER%"
      echo "Chunk size: %CHUNK_SIZE%"
    CHECK: True
    DEPENDENCIES:
      DN:
        SPLITS_FROM:
          ALL:
            SPLITS_TO: previous
      APP:
        SPLITS_FROM:
          ALL: # You want all the DN jobs to depend on the last APP split; otherwise, the DN would need to be tuned one by one.
            SPLITS_TO: "%JOBS.APP.SPLITS%"
    SPLITS: auto
    running: chunk

The result is as expected:

[Image: Seminar_001_14_34_27]

The only problem I found is that the default split size value was wrong. It is now always 1 if not specified, instead of taking the CHUNKSIZE value.
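
As a rough illustration of what the default implies for the configuration above, here is a minimal sketch of how many splits SPLITS: auto would produce. It is not Autosubmit code: the helper names are hypothetical, and it assumes the split unit is a day, that the split size defaults to 1, and that the flexible policy lets the last split be shorter than the others.

```python
import math
from datetime import date


def add_months(start: date, months: int) -> date:
    """Shift a date forward by whole months (hypothetical helper; safe for days 1-28)."""
    total = start.year * 12 + (start.month - 1) + months
    return date(total // 12, total % 12 + 1, start.day)


def auto_split_count(chunk_start: date, chunk_months: int, split_size_days: int = 1) -> int:
    """Number of splits needed to cover one chunk.

    Assumptions: split unit is a day, split size defaults to 1, and a
    shorter final split is allowed (hence the ceiling division).
    """
    chunk_end = add_months(chunk_start, chunk_months)
    chunk_days = (chunk_end - chunk_start).days
    return math.ceil(chunk_days / split_size_days)


# DATELIST: 20221101, CHUNKSIZEUNIT: month, CHUNKSIZE: 2, SPLITSIZE not set
print(auto_split_count(date(2022, 11, 1), 2))  # 61 (30 days of November + 31 of December)
```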

For #1423 (closed):

This is a funny one.

The configuration you posted does not work as written. Autosubmit does not understand the SPLITS_FROM: 0 filter, so it falls back to the natural dependencies code instead. On top of that, the dependency you should have added does not work either (without this fix).

Currently, you have this:

  DN:
    DEPENDENCIES:
      APP-1:
        SPLITS_FROM:
          0: 
            SPLITS_TO: "auto"
      DN:
        SPLITS_FROM:
          ALL:
            SPLITS_TO: previous
      REMOTE_SETUP:
      SIM:
        STATUS: RUNNING

And you reported that this doesn't work:

  DN:
    DEPENDENCIES:
      APP-1:
        SPLITS_FROM:
          1: 
            SPLITS_TO: "auto"
      DN:
        SPLITS_FROM:
          ALL:
            SPLITS_TO: previous
      REMOTE_SETUP:
      SIM:
        STATUS: RUNNING

But in fact, it does work. However, it is not what you wanted to achieve: with this, you are not setting the dependencies for the rest of the DN splits, which can lead to weird graphs.

Also, there was another issue: the "auto" keyword was always resolved against the "auto" value of the current chunk. That is not what you want in this case; you want the value for APP-1, which can differ from that of the current APP. To fix it, I wrote a new function.
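
The idea behind the fix can be sketched as follows. This is not the function added in this merge request; the name resolve_splits_to, its parameters, and the split counts are all hypothetical and only illustrate resolving "auto" against the parent job's chunk rather than the current one.

```python
def resolve_splits_to(splits_to, parent_section, parent_chunk, split_counts):
    """Resolve a SPLITS_TO value, expanding "auto" for the parent job.

    split_counts maps (section, chunk) to the number of splits computed
    for that job (hypothetical structure). "auto" becomes the last split
    of the *parent* section in the *parent* chunk, not the current one.
    """
    if str(splits_to).strip().lower() != "auto":
        return str(splits_to)  # explicit values are passed through unchanged
    return str(split_counts[(parent_section, parent_chunk)])


# Illustrative counts: chunks of different lengths yield different split counts.
split_counts = {("APP", 1): 61, ("APP", 2): 59}
current_chunk = 2

# DN in chunk 2 depends on APP-1, that is, APP in chunk 1.
print(resolve_splits_to("auto", "APP", current_chunk - 1, split_counts))  # 61
```

Resolving against the current chunk instead would give 59 here, pointing DN at a split that is not the last one of the previous APP.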

So why did your example work?

  DN:
    DEPENDENCIES:
      APP-1:
        SPLITS_FROM:
          0: 
            SPLITS_TO: "auto"
      DN:
        SPLITS_FROM:
          ALL:
            SPLITS_TO: previous
      REMOTE_SETUP:
      SIM:
      

This, which seems to work in that specific workflow, is the same as writing:

  DN:
    DEPENDENCIES:
      APP-1:
      DN:
        SPLITS_FROM:
          ALL:
            SPLITS_TO: previous
      REMOTE_SETUP:
      SIM:
        STATUS: RUNNING

Because Autosubmit does not understand filter 0, the Autosubmit code jumps to the "natural" dependencies function.
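
A simplified sketch of that behaviour, assuming splits are numbered from 1 and that a filter key is either ALL or a single split number (the real filter syntax is richer). The function name select_filter is hypothetical and this is not the Autosubmit implementation.

```python
def select_filter(splits_from, child_split):
    """Return the rule that applies to child_split, or None if no filter matches.

    A None result stands for "fall back to the natural dependencies".
    Assumption: splits are 1-based, so a key of 0 can never match.
    """
    for key, rule in splits_from.items():
        key = str(key).strip().upper()
        if key == "ALL":
            return rule
        if key.isdigit() and int(key) >= 1 and int(key) == child_split:
            return rule
    return None


rule = {"SPLITS_TO": "auto"}

# Key 0 matches no split, so every DN split falls back to natural dependencies.
print(select_filter({0: rule}, 1))      # None
# Key 1 only covers the first DN split; the others are left without this rule.
print(select_filter({1: rule}, 2))      # None
# ALL covers every DN split.
print(select_filter({"ALL": rule}, 2))  # {'SPLITS_TO': 'auto'}
```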

But what you actually wanted to do is:

  DN:
    DEPENDENCIES:
      APP-1:
        SPLITS_FROM:
          ALL: # You want to set all the DN jobs to depend on the last APP split. Otherwise, the DN would need to be tuned one by one.
            SPLITS_TO: "auto"
      DN:
        SPLITS_FROM:
          ALL:
            SPLITS_TO: previous
      REMOTE_SETUP:
      SIM:
        STATUS: RUNNING
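
To make the intent of this configuration concrete, here is a small sketch of the split-level parents each DN split would end up with. It only covers the APP-1 and DN entries (REMOTE_SETUP and SIM are left out), the function name dn_parents is hypothetical, and it assumes "previous" means the split immediately before and "auto" means the last split of the previous chunk's APP.

```python
def dn_parents(dn_split, last_app_split):
    """List the split-level parents of one DN split (illustrative only)."""
    # SPLITS_FROM ALL -> SPLITS_TO auto: every DN split waits for the last APP-1 split.
    parents = ["APP(chunk-1) split {}".format(last_app_split)]
    # SPLITS_FROM ALL -> SPLITS_TO previous: each DN split waits for the one before it.
    if dn_split > 1:
        parents.append("DN split {}".format(dn_split - 1))
    return parents


for split in (1, 2, 3):
    print(split, dn_parents(split, last_app_split=3))
```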

Simplified workflow:

[Image: Seminar_001_14_47_56]

Additionally, autosubmit inspect now shows the template paths in its output.

Source branch: gl-1421-Splits-chunksize