1. 02 May, 2019 2 commits
    • Fix resubmit to sibling default on fed requeue · 822fe77e
      Broderick Gardner authored
      On requeue, the origin cluster's job record is copied and submitted
      to sibling clusters. If the job was originally submitted
      to accept the cluster's default account, partition, etc., those fields
      are now filled in on the origin. Here we add flags to indicate
      that those fields need to be cleared on resubmission to siblings.
      Bug 6064
    • Fix clearing federation cluster lock on requeue · 47909f8e
      Broderick Gardner authored
      This is a holdover from when the fed job_info list was added.
      The cluster lock has to be cleared from both the job_ptr and
      the job_info.
      Bug 6064
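The default-clearing idea described in 822fe77e can be illustrated with a minimal sketch. It is written in Python for brevity and is not Slurm's C implementation; the dictionary layout, the `defaulted_fields` key, and the function name `build_sibling_submission` are all hypothetical. It assumes the origin records which fields were filled in from cluster defaults, so that a copy sent to a sibling can have those fields cleared and the sibling can apply its own defaults.

```python
from copy import deepcopy


def build_sibling_submission(origin_job):
    """Copy the origin job record and clear fields that came from defaults.

    `origin_job` is a plain dict standing in for a job record (hypothetical
    layout). `defaulted_fields` names the fields the origin cluster filled
    in from its own defaults at submit time.
    """
    sibling = deepcopy(origin_job)
    for field in origin_job.get("defaulted_fields", ()):
        # Clear the value so the sibling cluster can apply its own default.
        sibling[field] = None
    return sibling


if __name__ == "__main__":
    origin = {
        "name": "example",
        "account": "origin_default_acct",  # filled in by the origin cluster
        "partition": "debug",              # explicitly requested by the user
        "defaulted_fields": {"account"},
    }
    print(build_sibling_submission(origin))
```

Only the flagged field is cleared; values the user requested explicitly are carried over unchanged, and the origin record itself is not modified.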
  2. 30 Apr, 2019 1 commit
  3. 29 Apr, 2019 10 commits
    • Update test7.20 to catch passing/failing het jobs · 8c4fdffe
      Brian Christiansen authored
      when one offset passes and the other fails.
      
      Bug 6892
    • Add test7.20 · 1460a6b5
      Nate Rini authored
      Bug 6513.
    • Add NEWS for previous two commits · 00a8e724
      Brian Christiansen authored
      Bug 6513
    • Fix bad sbatch het offset output · 4657ab94
      Brian Christiansen authored
      Bug 6513
      
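      Test case: the first het offset is accepted, but the second is rejected
      because it didn't request a task count. The first sbatch run below shows
      the output before this fix (messages for offset 1 lack the offset
      prefix); the second run shows the output after.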
      
      $ cat etc/job_submit.lua
      function slurm_job_submit(job_desc, part_list, submit_uid)
          slurm.log_user("submit1\nstuff")
          slurm.log_user("submit2")
          slurm.log_user("submit3")

          -- slurm.log_user("case 0")
          if job_desc.num_tasks == slurm.NO_VAL or job_desc.num_tasks == nil then
              slurm.log_user("Batch submit error:  Must specify either number of nodes or number of tasks!")
              -- reject the job
              return slurm.ERROR
          end

          return slurm.SUCCESS
      end

      function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
          slurm.log_user("modify1")
          slurm.log_user("modify2")
          slurm.log_user("modify3")
          return slurm.SUCCESS
      end
      
      slurm.log_user("initialized")
      return slurm.SUCCESS
      
      $ sbatch -Ablah2 -n1 --wrap="hostname" : -J asdfl
      sbatch: error: 0: initialized
      sbatch: error: 0: submit1
      sbatch: error: 0: stuff
      sbatch: error: 0: submit2
      sbatch: error: 0: submit3
      sbatch: error: submit1
      sbatch: error: stuff
      sbatch: error: submit2
      sbatch: error: submit3
      sbatch: error: Batch submit error:  Must specify either number of nodes or number of tasks!
      sbatch: error: Batch job submission failed: Unspecified error
      
      $ sbatch -Ablah2 -n1 --wrap="hostname" : -J asdfl
      sbatch: error: 0: initialized
      sbatch: error: 0: submit1
      sbatch: error: 0: stuff
      sbatch: error: 0: submit2
      sbatch: error: 0: submit3
      sbatch: error: 1: submit1
      sbatch: error: 1: stuff
      sbatch: error: 1: submit2
      sbatch: error: 1: submit3
      sbatch: error: 1: Batch submit error:  Must specify either number of nodes or number of tasks!
      sbatch: error: Batch job submission failed: Unspecified error
      
      srun already handles this
    • Break up packed job user messages to prepend index. · a415b8f6
      Nate Rini authored
      Was dumping this:
      $ srun -A test7.21-account.1 --qos test7.21-qos.1 -n5 : -n3 : -n1 /bin/true
      srun: error: 0: submit1
      srun: error: submit2
      srun: error: submit3
      srun: error: Unable to allocate resources: Invalid account or account/partition combination specified
      
      Will now dump this:
      $ srun -A test7.21-account.1 --qos test7.21-qos.1 -n5 : -n3 : -n1 /bin/true
      srun: error: 0: initialized
      srun: error: 0: submit1
      srun: error: 0: submit2
      srun: error: 0: submit3
      srun: error: Unable to allocate resources: Invalid account or account/partition combination specified
      
      Bug 6513.
    • Fix printing duplicate error messages of lua rejected jobs · 297a6880
      Nate Rini authored
      Regression from 70b4e06d.
      
      Bug 6892.
    • Nate Rini's avatar
      8920863a
    • Brian Christiansen's avatar
    • Fix unnecessary reloading of submit plugins · b50ac244
      Brian Christiansen authored
      Bug 6895
    • Run autogen.sh with new automake · 7469e9c7
      Danny Auble authored
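The output change described in a415b8f6 and 4657ab94 (each line of a plugin's user message gets its own het offset prefix) can be sketched as follows. This is an illustrative Python sketch, not the C code from those commits; the function name `prefix_user_messages` is hypothetical, and it assumes the message arrives as one newline-separated string per het component.

```python
def prefix_user_messages(message, offset=None):
    """Split a newline-separated user message into individual lines.

    When `offset` (the het component index) is given, each line is
    prefixed with "<offset>: ". Empty lines are dropped.
    """
    lines = [line for line in message.split("\n") if line]
    if offset is None:
        return lines
    return [f"{offset}: {line}" for line in lines]


if __name__ == "__main__":
    # Messages as the example job_submit.lua would emit them for offset 1.
    message = "submit1\nstuff\nsubmit2\nsubmit3"
    for line in prefix_user_messages(message, offset=1):
        print(f"sbatch: error: {line}")
```

Splitting before prefixing is what keeps a multi-line message such as "submit1\nstuff" from carrying the index only on its first line.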
  4. 26 Apr, 2019 9 commits
  5. 24 Apr, 2019 4 commits
  6. 23 Apr, 2019 5 commits
  7. 22 Apr, 2019 2 commits
  8. 21 Apr, 2019 1 commit
  9. 18 Apr, 2019 5 commits
  10. 17 Apr, 2019 1 commit
    • Continuation of 4c48a84a. · 37bb5897
      Danny Auble authored
      Wrong variable was used.  This works fine for 18.08, but in 19.05 this
      code doesn't work correctly.
      
      Bug 6739