  1. 16 Jan, 2014 2 commits
    • Add version number to node and front-end · 72a5393f
      Morris Jette authored
      Add version number to node and front-end configuration information visible
      using the scontrol tool. Sview and sinfo still need to be changed.
    • Add core spec count to job credential · f05ba788
      Morris Jette authored
      Add specialized core count field to job credential data.
      NOTE: This changes the communications protocol from other pre-releases of
      version 14.03. All programs must be cancelled and daemons upgraded from
      previous pre-releases of version 14.03. Upgrades from version 2.6 or earlier
      can take place without loss of jobs.
  2. 15 Jan, 2014 2 commits
  3. 13 Jan, 2014 2 commits
  4. 11 Jan, 2014 1 commit
  5. 10 Jan, 2014 1 commit
  6. 09 Jan, 2014 2 commits
  7. 08 Jan, 2014 4 commits
  8. 07 Jan, 2014 3 commits
  9. 06 Jan, 2014 2 commits
    • Reset job priority on manual resume · 65d9196c
      Morris Jette authored
      If a job is explicitly suspended, its priority is set to zero.
      This resets the priority when requeued and also documents that
      if the job is requeued (e.g. due to a node failure), then it
      is placed in a held state.
    • Correct job RunTime if requeued from suspend state · bc3d8828
      Morris Jette authored
      Without this patch, the job's RunTime includes the RunTime accumulated
      before its prior suspend (i.e. the job's full RunTime rather than
      just the RunTime of the requeued job).
  10. 27 Dec, 2013 1 commit
    • Fix sched/backfill bug that could starve jobs · 2bae8bd6
      Filip Skalski authored
      Hello,
      
      I think I found another bug in the code (I'm using 2.6.3, but I checked versions 2.6.5 and 14.03 and it's the same there).
      
      In file sched/backfill/backfill.c:
      
      1) _add_reservation function, from line 1172:
      
      if (placed == true) {
              j = node_space[j].next;
              if (j && (end_reserve < node_space[j].end_time)) {
                      /* insert end entry record */
                      i = *node_space_recs;
                      node_space[i].begin_time = end_reserve;
                      node_space[i].end_time = node_space[j].end_time;
                      node_space[j].end_time = end_reserve;
                      node_space[i].avail_bitmap =
                              bit_copy(node_space[j].avail_bitmap);
                      node_space[i].next = node_space[j].next;
                      node_space[j].next = i;
                      (*node_space_recs)++;
              }
              break;
      }
      I drew a picture of the `node_space` state after 2 iterations (see attachment).
      
      In case where the new reservation i...
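The record split quoted in the message above can be tried in isolation with the following minimal sketch. It is not the Slurm code: `space_rec_t`, `split_record` and the `max_recs` bound are hypothetical, a 64-bit mask stands in for the node bitmap that the original copies with `bit_copy()`, and the begin-time guard is an addition of this sketch. Because the message is truncated, the sketch shows only the split step, not the reported bug or its fix.

```c
#include <stdint.h>
#include <time.h>

/* Simplified stand-in for a node_space entry: a time window, the nodes
 * free during it (one bit per node), and the index of the next window.
 * Index 0 ends the list, mirroring the "if (j && ...)" test quoted above. */
typedef struct {
	time_t   begin_time;
	time_t   end_time;
	uint64_t avail_bitmap;
	int      next;
} space_rec_t;

/* Split record j at time 'when' so that a window boundary exists there.
 * Returns the index of the new tail record, or -1 if 'when' does not fall
 * strictly inside record j or the table is already full. */
int split_record(space_rec_t *space, int *recs, int max_recs,
		 int j, time_t when)
{
	int i;

	if (when <= space[j].begin_time || when >= space[j].end_time)
		return -1;	/* nothing to split */
	if (*recs >= max_recs)
		return -1;	/* no free slot (hypothetical bound) */

	i = *recs;				/* append the tail record */
	space[i].begin_time   = when;
	space[i].end_time     = space[j].end_time;
	space[j].end_time     = when;		/* head now ends at the split */
	space[i].avail_bitmap = space[j].avail_bitmap;
	space[i].next         = space[j].next;	/* link tail after head */
	space[j].next         = i;
	(*recs)++;
	return i;
}
```

Records are stored in insertion order and chained through `next`, so walking the list in time order means following the indices rather than scanning the array.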
  11. 23 Dec, 2013 4 commits
  12. 20 Dec, 2013 3 commits
  13. 19 Dec, 2013 1 commit
    • scontrol show job - Correct NumNodes value · b31e2176
      Morris Jette authored
      The value is now calculated more accurately for pending jobs, and
      the actual node count is used for jobs that have been started
      (including suspended, completed, etc.)
      bug 549
  14. 18 Dec, 2013 1 commit
  15. 17 Dec, 2013 2 commits
  16. 16 Dec, 2013 1 commit
  17. 14 Dec, 2013 3 commits
  18. 13 Dec, 2013 2 commits
  19. 12 Dec, 2013 2 commits
    • Add job data pack flag to accounting RPC · c1178d8b
      Morris Jette authored
      Without this flag, if the configuration changes or is inconsistent
      between nodes, then the pack and unpack can get out of sync in terms
      of what data is expected. The flag lets the server tell the client
      what data is packed.
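The idea of a pack flag can be sketched as follows: the sender writes a flags word ahead of the payload, and the receiver decodes only the fields those flags announce instead of consulting its own configuration. This is an illustration under stated assumptions, not Slurm's pack/unpack API. The names `PACK_FLAG_ENERGY`, `PACK_FLAG_DISK`, `job_rec_t`, `pack_job` and `unpack_job` are all hypothetical, and values are copied in host byte order for brevity.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits naming optional fields; not Slurm's definitions. */
#define PACK_FLAG_ENERGY 0x1u
#define PACK_FLAG_DISK   0x2u

typedef struct {
	uint32_t job_id;
	uint32_t energy;	/* meaningful only if PACK_FLAG_ENERGY is set */
	uint32_t disk;		/* meaningful only if PACK_FLAG_DISK is set */
} job_rec_t;

/* Sender: write the flags first, then only the fields they announce.
 * Returns the number of bytes written to buf (at most 16). */
size_t pack_job(const job_rec_t *job, uint32_t flags, unsigned char *buf)
{
	size_t off = 0;

	memcpy(buf + off, &flags, sizeof(flags));
	off += sizeof(flags);
	memcpy(buf + off, &job->job_id, sizeof(job->job_id));
	off += sizeof(job->job_id);
	if (flags & PACK_FLAG_ENERGY) {
		memcpy(buf + off, &job->energy, sizeof(job->energy));
		off += sizeof(job->energy);
	}
	if (flags & PACK_FLAG_DISK) {
		memcpy(buf + off, &job->disk, sizeof(job->disk));
		off += sizeof(job->disk);
	}
	return off;
}

/* Receiver: read the flags, then decode exactly what the sender packed,
 * whatever the receiver's own configuration says.
 * Returns the number of bytes consumed from buf. */
size_t unpack_job(job_rec_t *job, uint32_t *flags, const unsigned char *buf)
{
	size_t off = 0;

	memset(job, 0, sizeof(*job));	/* absent fields default to zero */
	memcpy(flags, buf + off, sizeof(*flags));
	off += sizeof(*flags);
	memcpy(&job->job_id, buf + off, sizeof(job->job_id));
	off += sizeof(job->job_id);
	if (*flags & PACK_FLAG_ENERGY) {
		memcpy(&job->energy, buf + off, sizeof(job->energy));
		off += sizeof(job->energy);
	}
	if (*flags & PACK_FLAG_DISK) {
		memcpy(&job->disk, buf + off, sizeof(job->disk));
		off += sizeof(job->disk);
	}
	return off;
}
```

Since the layout travels with the message, a receiver whose configuration differs from the sender's still consumes the same number of bytes that were written.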
    • slurmstepd variable initialization · 06b41cdc
      Morris Jette authored
      Without this patch, free() is called on a random memory location
      (i.e. whatever is on the stack), which can result in slurmstepd
      dying and a completed job not being purged in a timely fashion.
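The class of bug described above can be illustrated with a short C sketch. It is not the slurmstepd code: the function `build_label` and its variables are hypothetical. The point is that a pointer freed on a shared cleanup path must be initialised, because `free(NULL)` is defined to do nothing while `free()` on an indeterminate stack value is undefined behaviour.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from Slurm. Copies 'name' into 'out' via a
 * temporary heap buffer. Returns 0 on success, -1 on error. */
int build_label(const char *name, char *out, size_t out_len)
{
	char *label = NULL;	/* without "= NULL", the early exits below
				 * would pass whatever is on the stack
				 * to free() */
	int rc = -1;

	if (name == NULL || out == NULL || out_len == 0)
		goto cleanup;	/* label was never allocated on this path */

	label = malloc(strlen(name) + 1);
	if (label == NULL)
		goto cleanup;
	strcpy(label, name);

	strncpy(out, label, out_len - 1);	/* truncate if needed */
	out[out_len - 1] = '\0';
	rc = 0;

cleanup:
	free(label);		/* safe on every path: free(NULL) is a no-op */
	return rc;
}
```

Calling `build_label(NULL, buf, sizeof(buf))` takes the early exit and returns -1 without touching the heap, which is the path that would have freed an uninitialised pointer.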
  20. 11 Dec, 2013 1 commit