  • #1426
Closed
Open
Issue created Sep 26, 2024 by Aina Gaya (@agayayav, Maintainer)

How to handle data downloads every 5 minutes

Hi @dbeltran @bdepaula,

As I think Francesca explained to you today, for auto-PHENOMENA we will have to download data from a webpage every 5 minutes. We have been thinking about how to set that up with Autosubmit.

My first idea would be to use RETRY_DELAY_TIME + soft dependency (FAILED). That would mean:

  1. We start the experiment at, say, 00:00. The data is downloaded, the job succeeds, and the next chunk is submitted.
  2. The next chunk is submitted at 00:00 plus the run time of the previous chunk. If more than 5 minutes have passed, the task fails (the data was already downloaded). Autosubmit waits about 3 minutes (RETRY_DELAY_TIME) before submitting the retry. In the meantime, it submits the next chunk, which will fail, wait 3 minutes, and try again.

I see a lot of gaps in my approach. For example, if the first chunk takes more than 5 minutes to run, we would lose data. But if the soft dependency is on RUNNING, I think we would have too many attempts running at the same time.
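
To make the first gap concrete, here is a rough Python sketch (not Autosubmit code) that simulates chunks running one after another and reports which 5-minute windows would never get a download. It rests on assumptions that are mine, not confirmed anywhere: the data for a window can be fetched at any time during that window, a chunk that finishes early waits for the next window, and the function name `missed_slots` and its parameters are made up for illustration.

```python
def missed_slots(run_times_s, interval_s=300):
    """Return the indices of 5-minute slots that no chunk covered.

    Hypothetical model: chunks run strictly one after another from t = 0.
    A chunk covers the slot that is current when it starts. The next chunk
    starts when the previous one finishes, or at the next slot boundary,
    whichever is later.
    """
    covered = []
    start = 0
    for run_time in run_times_s:
        slot = start // interval_s          # slot current at chunk start
        covered.append(slot)
        finish = start + run_time
        next_boundary = (slot + 1) * interval_s
        start = max(finish, next_boundary)  # wait for new data if early
    if not covered:
        return []
    return [s for s in range(covered[-1] + 1) if s not in covered]


if __name__ == "__main__":
    # Second chunk runs for 700 s, so the slot covering 600-900 s is skipped.
    print(missed_slots([120, 700, 100]))
```

Under these assumptions, a single chunk that overruns past a whole window is enough to lose that window, which is why a purely sequential chain seems fragile here.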

I believe that @bdepaula had some ideas!

cc @lherrero @fmacchia @ctena @cpinero @avinas (am I missing someone?)
