1. 14 Dec, 2015 2 commits
    • Decrease scancel parallelism · 53c0078c
      Morris Jette authored
      Decrease parallelism in job cancel request to prevent denial of service
          when cancelling huge numbers of jobs.
      bug 2256
    • Prevent gang scheduling with preemption configured · 44f491b8
      Morris Jette authored
      Prevent triggering gang scheduling within a partition if configured with
          PreemptType=partition_prio and PreemptMode=suspend,gang.
      The essence of this fix is to change a "<=" to "<" in cons_res/job_test.c:
      -               if ((p_ptr->part_ptr->priority <= jp_ptr->part_ptr->priority) &&
      +               if ((p_ptr->part_ptr->priority < jp_ptr->part_ptr->priority) &&
      but logic was also added to ensure that a partition configuration with
      PreemptMode did not override PreemptType != partition_prio.
      bug 2232
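The effect of the "<=" to "<" change quoted above can be illustrated with a minimal sketch. The struct and function names here are hypothetical stand-ins, not the types used in cons_res/job_test.c, and the rest of the original condition (after the "&&") is not reproduced. The two tests differ only when both partitions have the same priority, which includes the case where both pointers refer to the same partition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a partition record. */
struct part_sketch {
    uint16_t priority;
};

/* Comparison as it was before the fix: equal priorities also match. */
static bool prio_test_old(const struct part_sketch *p,
                          const struct part_sketch *jp)
{
    return p->priority <= jp->priority;
}

/* Comparison after the fix: only strictly lower priority matches. */
static bool prio_test_new(const struct part_sketch *p,
                          const struct part_sketch *jp)
{
    return p->priority < jp->priority;
}
```

With two partitions of equal priority, `prio_test_old` returns true and `prio_test_new` returns false; for any strictly lower or higher priority the two agree.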
  2. 11 Dec, 2015 4 commits
    • Rework messages when falling back to older directory format for job environment · 6d9a752c
      Tim Wickberg authored
      Previously an error() would be logged when the attempt to open the job
      script using the new directory format failed but the successive fallback to the
      old directory structure was successful, leading to confusion when troubleshooting.
      
      Move the emitted warnings to debug(), and only error() after failing to open in both
      directory structures. Add a note about backwards compatibility to both functions:
      we cannot remove these fallbacks because the directory structure for pending jobs does
      not change on a Slurm version update. Sites may need to chain multiple version updates
      together to reach a current Slurm version, which would correctly update the slurmctld
      state files but leave pending jobs in the old directory structure.
      
      Bug #2244.
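The logging pattern described in this commit (quiet on the first failure, loud only when both layouts fail) might look roughly like the sketch below. This is not the Slurm code: `open_with_fallback` is a hypothetical helper, both paths are supplied by the caller because the actual directory layouts are not given here, and `fprintf(stderr, ...)` stands in for Slurm's debug() and error() calls.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Try the new-format path first, then fall back to the old-format path.
 * Returns an open file descriptor, or -1 if neither path can be opened.
 */
static int open_with_fallback(const char *new_path, const char *old_path)
{
    int fd = open(new_path, O_RDONLY);
    if (fd >= 0)
        return fd;

    /* First failure is expected for jobs submitted before an upgrade,
     * so it is reported at debug level only (stand-in for debug()). */
    fprintf(stderr, "debug: could not open %s: %s; trying old format\n",
            new_path, strerror(errno));

    fd = open(old_path, O_RDONLY);
    if (fd >= 0)
        return fd;

    /* Both layouts failed: this is a real error (stand-in for error()). */
    fprintf(stderr, "error: could not open %s or %s: %s\n",
            new_path, old_path, strerror(errno));
    return -1;
}
```

The caller owns the returned descriptor and is expected to close it.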
    • slurmd job clean up if requeued during launch · 58c17d4f
      Morris Jette authored
      If a job is requeued while in the process of being launched, remove its
          job ID from slurmd's record of active jobs in order to avoid generating a
          duplicate job ID error when launched for the second time (which would
          drain the node).
      bug 2240
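To show why a stale entry causes trouble, here is a minimal sketch of an active-job record with duplicate detection. The data structure and function names are hypothetical and much simpler than whatever slurmd actually uses; the point is only that a second launch of the same job ID is rejected unless the ID was removed when the job was requeued.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ACTIVE 64

/* Hypothetical record of job IDs currently active on a node. */
struct active_jobs {
    uint32_t ids[MAX_ACTIVE];
    size_t count;
};

/* Register a job ID; returns false on a duplicate ID or a full table. */
static bool active_job_add(struct active_jobs *a, uint32_t job_id)
{
    for (size_t i = 0; i < a->count; i++) {
        if (a->ids[i] == job_id)
            return false;   /* duplicate job ID */
    }
    if (a->count == MAX_ACTIVE)
        return false;
    a->ids[a->count++] = job_id;
    return true;
}

/* Remove a job ID if present, e.g. when the job is requeued mid-launch. */
static void active_job_remove(struct active_jobs *a, uint32_t job_id)
{
    for (size_t i = 0; i < a->count; i++) {
        if (a->ids[i] == job_id) {
            a->ids[i] = a->ids[a->count - 1];   /* fill the hole */
            a->count--;
            return;
        }
    }
}
```

Without the call to `active_job_remove` on requeue, the second `active_job_add` for the same ID fails, which corresponds to the duplicate job ID error described above.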
    • Improve slurmctld log · 1d773209
      Morris Jette authored
      In the slurmctld log file, log a duplicate job ID found by slurmd. Previously this was
          logged as a prolog/epilog failure.
      bug 2240
    • Start NEWS for v15.08.6 · 78e2a79e
      Morris Jette authored
  3. 10 Dec, 2015 2 commits
  4. 09 Dec, 2015 4 commits
  5. 08 Dec, 2015 2 commits
  6. 07 Dec, 2015 1 commit
  7. 05 Dec, 2015 2 commits
  8. 04 Dec, 2015 2 commits
  9. 03 Dec, 2015 5 commits
  10. 02 Dec, 2015 1 commit
  11. 01 Dec, 2015 4 commits
  12. 30 Nov, 2015 5 commits
  13. 26 Nov, 2015 1 commit
  14. 25 Nov, 2015 1 commit
  15. 23 Nov, 2015 1 commit
  16. 19 Nov, 2015 3 commits