- 12 Dec, 2014 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
- 11 Dec, 2014 8 commits
-
-
Danny Auble authored
If a QOS was added for the job and then removed, and it happened to have the largest QOS id, then restarting slurmctld before the job was flushed out could mess things up.
-
David Bigagli authored
-
Morris Jette authored
Log how many nodes are removed from consideration for jobs due to advanced reservation. Change the user error message to indicate that required nodes might be down, drained or (added this bit) reserved.
-
Morris Jette authored
In proctrack/linuxproc and proctrack/pgid, check the result of strtol() for error condition rather than errno, which might have a vestigial error code.
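The check described above can be sketched as follows. This is a minimal illustration, not the code from proctrack/linuxproc or proctrack/pgid; the helper name parse_long_checked is hypothetical. It judges the conversion only by what strtol() returns (the value and the end pointer), so a stale errno left behind by an earlier call cannot be misread as a failure.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not a Slurm function): parse a decimal integer
 * and report failure based on what strtol() returns, never on errno.
 * Returns 0 on success and stores the value in *out, -1 on failure.
 */
static int parse_long_checked(const char *str, long *out)
{
    char *end = NULL;
    long val;

    assert(str != NULL && out != NULL);

    val = strtol(str, &end, 10);

    /* No digits were consumed at all. */
    if (end == str)
        return -1;

    /* strtol() returns LONG_MIN or LONG_MAX on overflow; treat both as
     * errors. This also rejects the two extreme values themselves,
     * which is acceptable for small identifiers such as a pid. */
    if (val == LONG_MIN || val == LONG_MAX)
        return -1;

    *out = val;
    return 0;
}
```

A caller would test the return code, for example `if (parse_long_checked(buf, &pid) < 0)`, and skip the entry on failure.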
-
Danny Auble authored
correctly.
-
Danny Auble authored
accounting_storage/filetxt.
-
Morris Jette authored
The task_dist_states variable has been split into "flags" and "base" components. Added SLURM_DIST_PACK_NODES and SLURM_DIST_NO_PACK_NODES values to give the user greater control over task distribution. The srun --dist option has been modified to accept "Pack" and "NoPack" options. These options can be used to override the CR_PACK_NODE configuration option.
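One way to picture the "flags" and "base" split is a single integer whose low bits hold the distribution method and whose high bits hold modifier flags. The sketch below is illustrative only: every EX_ name, the mask layout, and the numeric values are assumptions and are not copied from slurm.h. It also assumes that NoPack wins if both flags are set.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative layout only: names and values are assumptions. */
#define EX_DIST_BASE_MASK      0x0000FFFFu  /* distribution method bits */
#define EX_DIST_FLAGS_MASK     0x00FF0000u  /* modifier flag bits */
#define EX_DIST_CYCLIC         0x00000001u  /* example base value */
#define EX_DIST_BLOCK          0x00000002u  /* example base value */
#define EX_DIST_PACK_NODES     0x00800000u  /* stands in for SLURM_DIST_PACK_NODES */
#define EX_DIST_NO_PACK_NODES  0x00400000u  /* stands in for SLURM_DIST_NO_PACK_NODES */

/* Extract the distribution method, ignoring any flags. */
static uint32_t ex_dist_base(uint32_t dist)
{
    return dist & EX_DIST_BASE_MASK;
}

/* Extract the modifier flags, ignoring the method. */
static uint32_t ex_dist_flags(uint32_t dist)
{
    return dist & EX_DIST_FLAGS_MASK;
}

/*
 * Decide whether to pack nodes. A per-job flag overrides the
 * cluster-wide default (standing in for CR_PACK_NODE). Returns 1 to
 * pack, 0 otherwise.
 */
static int ex_pack_nodes(uint32_t dist, int cluster_pack_default)
{
    uint32_t flags = ex_dist_flags(dist);

    if (flags & EX_DIST_NO_PACK_NODES)
        return 0;
    if (flags & EX_DIST_PACK_NODES)
        return 1;
    return cluster_pack_default ? 1 : 0;
}
```

Keeping the flags in separate bits means existing comparisons against the method keep working once they go through the base mask.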
-
Danny Auble authored
to have all the limits a QOS has. If a limit is set in both QOSs, the partition QOS will override the job's QOS unless the job's QOS has the 'PartitionQOS' flag set.
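The precedence rule in this message can be sketched as a small function. This is an assumption-laden illustration, not Slurm's implementation: the struct, the max_jobs field, the EX_NO_LIMIT sentinel, and the flag constant (standing in for the 'PartitionQOS' flag named above) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_NO_LIMIT               UINT32_MAX  /* hypothetical "not set" marker */
#define EX_QOS_FLAG_OVERRIDE_PART 0x1u        /* stands in for the flag in the commit */

/* Hypothetical, heavily reduced QOS record with a single limit. */
struct ex_qos {
    uint32_t flags;
    uint32_t max_jobs;
};

/*
 * Pick the limit that applies to a job. If only one QOS sets the
 * limit, that value is used. If both set it, the partition QOS wins
 * unless the job's QOS carries the override flag.
 */
static uint32_t ex_effective_limit(const struct ex_qos *part_qos,
                                   const struct ex_qos *job_qos)
{
    uint32_t part = part_qos ? part_qos->max_jobs : EX_NO_LIMIT;
    uint32_t job = job_qos ? job_qos->max_jobs : EX_NO_LIMIT;

    if (part == EX_NO_LIMIT)
        return job;
    if (job == EX_NO_LIMIT)
        return part;

    /* Both limits are set; job_qos is non-NULL on this path. */
    if (job_qos->flags & EX_QOS_FLAG_OVERRIDE_PART)
        return job;
    return part;
}
```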
-
- 09 Dec, 2014 2 commits
-
-
Morris Jette authored
-
Danny Auble authored
when running from cache.
-
- 08 Dec, 2014 6 commits
-
-
David Bigagli authored
-
Brian Christiansen authored
Bug 1305
-
Morris Jette authored
Fix bug with GRES having multiple types that can cause slurmctld abort. This can be reproduced with select/cons_res and one Gres like this:
Name=gpu Type=kepler File=/dev/tty0
A bad index was being used that caused an assert.
-
David Bigagli authored
-
Morris Jette authored
-
Artem Polyakov authored
Logic introduced in version 14.03.10 to support requeueing of jobs with GRES allocated to currently running steps broke select/linear due to differences in the plugin logic. The commit with the bad logic is 1209a664.
-
- 05 Dec, 2014 5 commits
-
-
Brian Christiansen authored
Bug 1298
-
Brian Christiansen authored
-
Brian Christiansen authored
Bug 1301
-
Danny Auble authored
have no weight. This allows for association and QOS decay limits to work.
-
Danny Auble authored
Previously, without this option, accounting would not be correct unless the job was allocating enough resources to fill up the socket. Making this the default means the entire socket is allocated to the job, similar to the way CR_CORE allocates the entire core to the job even if the job does not request the whole core.
-
- 04 Dec, 2014 6 commits
-
-
David Bigagli authored
draining in sinfo output.
-
Brian Christiansen authored
-
Brian Christiansen authored
Prevent jobs in overlapping reservations from starting if they won't finish before a "maint" reservation begins. Bug 1290
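The condition behind this fix can be sketched as a simple time comparison. This is a simplified illustration, not the slurmctld logic: the function name and parameters are hypothetical, and it assumes the job's time limit is the only thing that determines when the job ends.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical check: may a job that would start at `start` and run
 * for at most `time_limit_secs` begin, given that a "maint"
 * reservation begins at `maint_start`? Returns 1 if the job is
 * expected to finish in time, 0 otherwise.
 */
static int ex_job_fits_before_maint(time_t start, long time_limit_secs,
                                    time_t maint_start)
{
    /* The maintenance window has already begun (or begins now). */
    if (maint_start <= start)
        return 0;

    /* Compare the gap against the limit instead of adding the limit
     * to `start`, which keeps the arithmetic within range. */
    return difftime(maint_start, start) >= (double)time_limit_secs;
}
```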
-
Morris Jette authored
Avoid huge malloc if GRES configured with "Type" and huge "Count".
-
Danny Auble authored
when the DBD is down.
-
Danny Auble authored
-
- 03 Dec, 2014 3 commits
-
-
Morris Jette authored
Log Cray MPI job calling exit() without mpi_fini(), but do not treat it as a fatal error. This partially reverts logic added in version 14.03.9. Bug 1171
-
Brian Christiansen authored
Bug 1289
-
Danny Auble authored
could result in seg fault.
-
- 02 Dec, 2014 5 commits
-
-
Morris Jette authored
This only enables this configuration without providing the necessary infrastructure to support it. Add preempt_by_qos flag to select/cons_res. In select/cons_res, add an extra row to the scheduling array if preempt/qos is used. This is used only when a job is started by preempting another job in the same partition but with a lower-priority QOS. Update preemption documentation.
-
Danny Auble authored
better.
-
David Bigagli authored
-
Danny Auble authored
in BASIL was changed.
-
Brian Christiansen authored
-
- 28 Nov, 2014 1 commit
-
-
Dominik Bartkiewicz authored
In srun, the default timeout in the _wait_nodes_ready function is 3 sec. This patch makes the timeout adaptive; for the default max_delay=400 sec it makes fewer RPC requests and gives a better user experience.
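The message does not spell out how the delay adapts. As one possible shape, the sketch below doubles the wait between polls up to a cap and never sleeps past max_delay. The schedule, the constants, and the function names are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>

#define EX_INITIAL_DELAY 1   /* seconds; assumed starting interval */
#define EX_DELAY_CAP     32  /* seconds; assumed upper bound per poll */

/*
 * Compute the next sleep interval. `cur_delay` is the previous
 * interval (0 before the first poll), `elapsed` is the total time
 * already waited. Returns 0 once max_delay has been used up.
 */
static int ex_next_delay(int cur_delay, int elapsed, int max_delay)
{
    int next = (cur_delay <= 0) ? EX_INITIAL_DELAY : cur_delay * 2;
    int remaining = max_delay - elapsed;

    if (next > EX_DELAY_CAP)
        next = EX_DELAY_CAP;
    if (next > remaining)
        next = remaining;
    return (next > 0) ? next : 0;
}

/* Count how many polls this schedule issues within max_delay. */
static int ex_count_polls(int max_delay)
{
    int elapsed = 0, delay = 0, polls = 0;

    while ((delay = ex_next_delay(delay, elapsed, max_delay)) > 0) {
        elapsed += delay;
        polls++;
    }
    return polls;
}
```

With these assumed constants, a 400 sec window needs 17 polls, whereas polling every 3 sec would need about 134, while nodes that come up quickly are still noticed within the first few seconds.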
-
- 27 Nov, 2014 1 commit
-
-
Brian Christiansen authored
Bug #1276
-
- 26 Nov, 2014 1 commit
-
-
David Bigagli authored
in commit 1953fdecbb.
-