Earth Sciences / autosubmit, issue #657
Issue created Feb 02, 2021 by Gilbert (@gmontane, Developer)

Add support for Nimbus platform

Hi @dbeltran @wuruchi

I'm getting an unexpected error in experiment a3g0 with Autosubmit 3.13 (it works fine in 3.12):

[ERROR] Trace: local variable 'status' referenced before assignment
 [CRITICAL] Unhandled error: If you see this message, please report it in Autosubmit's GitLab project

The experiment's HPCARCH is set to Nimbus (AEMET's cluster), but the first jobs run on the LOCAL platform. Looking at the logs, it seems that the error happens after some of the LOCAL jobs change status from RUNNING to FAILED/COMPLETED. However, some of the remote jobs (REMOTE_COMPILE and COMPILE_DA) do run successfully on the remote platform, apparently without Autosubmit noticing (there are no log entries for these jobs). So I'm not sure whether the error happens when the local jobs finish or when the remote jobs start.
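
For reference, this kind of Python error usually means that a function assigns a variable only in some branches and then reads it on a path where none of them ran. The sketch below is purely illustrative and is not Autosubmit's actual code: `parse_job_status`, `try_parse`, and the platform type strings are hypothetical names, assuming a parser that only handles the platform types it already knows about.

```python
def parse_job_status(platform_type, scheduler_output):
    """Hypothetical status parser; NOT Autosubmit's real implementation."""
    if platform_type == "slurm":
        status = "COMPLETED" if "COMPLETED" in scheduler_output else "RUNNING"
    elif platform_type == "local":
        status = "COMPLETED" if scheduler_output == "" else "RUNNING"
    # No branch for any other platform type: `status` is never bound,
    # so the return below raises UnboundLocalError for unknown platforms.
    return status


def try_parse(platform_type, scheduler_output):
    """Call the parser and report the failure instead of crashing."""
    try:
        return parse_job_status(platform_type, scheduler_output)
    except UnboundLocalError as exc:
        # The exact message wording depends on the Python version.
        return "error: {}".format(exc)


print(try_parse("slurm", "job 42 COMPLETED"))
print(try_parse("nimbus", "job 42 COMPLETED"))
```

If something along these lines is happening, it would fit with the error only showing up for this platform, but that is a guess from my side.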

If I try to continue the experiment after it has failed (with or without changing the status of the failed jobs), Autosubmit seems to recover the status of the jobs, but the same error happens again. Running create and restarting the experiment does not solve the problem either; it crashes at the same point. I've also created some copies of a3g0 (a3g8 and a3ga), but the error still happens.

It turns out this is the same experiment in which I found the issue reported in #653 (closed), so I don't know whether the two are related (maybe a corruption caused by copying the experiment with 3.12 and then migrating it to 3.13?).

I'm able to run other experiments on other platforms, so it must be something specific to this experiment or to the platform.

Let me know if you need more info from my side to investigate it.

Edited Feb 11, 2021 by dbeltran