1. 14 Oct, 2014 15 commits
  2. 13 Oct, 2014 4 commits
  3. 11 Oct, 2014 5 commits
  4. 10 Oct, 2014 16 commits
    • Switch order of if in sacctmgr. · 9a49f7f5
      David Bigagli authored
      9a49f7f5
    • Danny Auble authored
      6bf40ed9
    • Major update to license test · 1e9761f0
      Morris Jette authored
      Major update to license management tests.
      "Manager" was changed to "ServerType", several tests were made more
      complete, and reporting of failures was improved.
      1e9761f0
    • Change "Manager" to "ServerType" in sacctmgr · 93a6d77f
      Morris Jette authored
      For the sacctmgr command, the keyword "Manager" was changed to
      "ServerType" in some, but not all places. This changes the
      previously unchanged places.
      93a6d77f
    • Disable sgather test with POE · 98bb8dcd
      Morris Jette authored
      The test keeps failing due to a POE bug.
      98bb8dcd
    • Advanced reservation test fix · e1b451e2
      Morris Jette authored
      This fixes the advanced reservation test with a configuration that
      sets a node's CPU count to be equal to the core count rather than
      its thread count.
      e1b451e2
    • Fix test for message format change · 001be7f9
      Morris Jette authored
      001be7f9
    • Modify license web page formatting · 14b3502b
      Morris Jette authored
      The original formatting had a bunch of lists rather than paragraphs,
      the numbers did not add up in the use case, and some wording was
      changed for clarity.
      14b3502b
    • Brian Christiansen authored
      5d6a2dc2
    • Job step memory allocation logic fix · f288e4eb
      Dorian Krause authored
      This commit fixes a bug we observed when combining select/linear with
      gres. If an allocation was requested with a --gres argument, an srun
      execution within that allocation would stall indefinitely:
      
      -bash-4.1$ salloc -N 1 --gres=gpfs:100
      salloc: Granted job allocation 384049
      bash-4.1$ srun -w j3c017 -n 1 hostname
      srun: Job step creation temporarily disabled, retrying
      
      The slurmctld log showed:
      
      debug3: StepDesc: user_id=10034 job_id=384049 node_count=1-1 cpu_count=1
      debug3:    cpu_freq=4294967294 num_tasks=1 relative=65534 task_dist=1 node_list=j3c017
      debug3:    host=j3l02 port=33608 name=hostname network=(null) exclusive=0
      debug3:    checkpoint-dir=/home/user checkpoint_int=0
      debug3:    mem_per_node=62720 resv_port_cnt=65534 immediate=0 no_kill=0
      debug3:    overcommit=0 time_limit=0 gres=(null) constraints=(null)
      debug:  Configuration for job 384049 complete
      _pick_step_nodes: some requested nodes j3c017 still have memory used by other steps
      _slurm_rpc_job_step_create for job 384049: Requested nodes are busy
      
      If srun --exclusive had been used instead, everything would have worked.
      The reason is that in exclusive mode the code properly checks whether memory
      is a reserved resource in the _pick_step_nodes() function.
      This commit modifies the alternate (non-exclusive) code path to do the same.
      f288e4eb
    • Fix typos · 40bec9cd
      Danny Auble authored
      40bec9cd
    • Job step memory allocation logic fix · 0dd12469
      Dorian Krause authored
      This commit fixes a bug we observed when combining select/linear with
      gres. If an allocation was requested with a --gres argument, an srun
      execution within that allocation would stall indefinitely:
      
      -bash-4.1$ salloc -N 1 --gres=gpfs:100
      salloc: Granted job allocation 384049
      bash-4.1$ srun -w j3c017 -n 1 hostname
      srun: Job step creation temporarily disabled, retrying
      
      The slurmctld log showed:
      
      debug3: StepDesc: user_id=10034 job_id=384049 node_count=1-1 cpu_count=1
      debug3:    cpu_freq=4294967294 num_tasks=1 relative=65534 task_dist=1 node_list=j3c017
      debug3:    host=j3l02 port=33608 name=hostname network=(null) exclusive=0
      debug3:    checkpoint-dir=/home/user checkpoint_int=0
      debug3:    mem_per_node=62720 resv_port_cnt=65534 immediate=0 no_kill=0
      debug3:    overcommit=0 time_limit=0 gres=(null) constraints=(null)
      debug:  Configuration for job 384049 complete
      _pick_step_nodes: some requested nodes j3c017 still have memory used by other steps
      _slurm_rpc_job_step_create for job 384049: Requested nodes are busy
      
      If srun --exclusive had been used instead, everything would have worked.
      The reason is that in exclusive mode the code properly checks whether memory
      is a reserved resource in the _pick_step_nodes() function.
      This commit modifies the alternate (non-exclusive) code path to do the same.
      0dd12469
    • Merge branch 'slurm-14.03' · b7cf0a28
      Morris Jette authored
      b7cf0a28
    • Add SUG14 paper link · 70a4091f
      Morris Jette authored
      70a4091f
    • Brian Christiansen authored
      2327e268
    • SLURMDBD - Only set the archive flag if purging the object · 686cd117
      Danny Auble authored
      (i.e. ArchiveJobs PurgeJobs). This is only a cosmetic change.
      686cd117
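
The "Job step memory allocation logic fix" commits (f288e4eb and 0dd12469) describe a check that should only block step creation when memory is a reserved resource. The following is a minimal sketch of that idea, not the actual slurmctld code: the `Node` class, `node_has_memory()`, `pick_step_nodes()`, the `mem_is_reserved` flag, and the node's total memory are all hypothetical names and values chosen for illustration. Only the node name and the `mem_per_node` value are taken from the log above.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    total_mem_mb: int
    used_mem_mb: int  # memory currently charged to other job steps


def node_has_memory(node, step_mem_mb, mem_is_reserved):
    """Return True if the node may host a step asking for step_mem_mb."""
    if not mem_is_reserved:
        # Assumption: when memory is not tracked as a reserved resource,
        # memory charged to other steps must not block step creation.
        return True
    return node.total_mem_mb - node.used_mem_mb >= step_mem_mb


def pick_step_nodes(nodes, step_mem_mb, mem_is_reserved):
    """Return the names of the nodes that pass the memory check."""
    return [n.name for n in nodes
            if node_has_memory(n, step_mem_mb, mem_is_reserved)]


# Hypothetical node whose accounting shows all memory already in use.
nodes = [Node("j3c017", total_mem_mb=62720, used_mem_mb=62720)]
usable_unreserved = pick_step_nodes(nodes, 62720, mem_is_reserved=False)
usable_reserved = pick_step_nodes(nodes, 62720, mem_is_reserved=True)
```

In this sketch the node is rejected only when memory is treated as reserved. Applying the memory check unconditionally would reject it in both cases, which is one way the "Requested nodes are busy" stall described in the commit message could arise.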