- 12 Aug, 2014 2 commits
-
-
Morris Jette authored
Previously job would only run in first listed partition.
-
Morris Jette authored
Fix gang scheduling for jobs submitted to multiple partitions. Previous logic assumed the job's "partition" field contained a single partition name, that in which the job is running. That was recently changed in order to support jobs being requeued, which we want to be runnable in all of their valid partitions.
-
- 08 Aug, 2014 4 commits
-
-
Thomas Cadeaux authored
-
Morris Jette authored
-
Danny Auble authored
signal 1.
-
Danny Auble authored
done for normal steps.
-
- 07 Aug, 2014 2 commits
-
-
Danny Auble authored
of acting like it is a signal and exitcode.
-
Danny Auble authored
previously it was always -2.
-
- 06 Aug, 2014 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
Apply BatchStartTimeout configuration to task launch and avoid aborting srun commands due to long-running Prolog scripts. bug 978
-
Morris Jette authored
When nodes are scheduled for reboot, set state to DOWN rather than FUTURE so they are still visible to sinfo. State is set to IDLE after reboot completes. bug 1007
-
- 05 Aug, 2014 1 commit
-
-
David Bigagli authored
-
- 01 Aug, 2014 3 commits
-
-
David Bigagli authored
"job_comp/mysql" setting an incorrect default database.
-
David Bigagli authored
-
David Bigagli authored
database index for the array elements avoiding duplicate database values.
-
- 31 Jul, 2014 1 commit
-
-
Franco Broi authored
-
- 30 Jul, 2014 1 commit
-
-
David Bigagli authored
job elapsed time.
-
- 29 Jul, 2014 1 commit
-
-
David Bigagli authored
the i/o thread.
-
- 28 Jul, 2014 2 commits
-
-
David Bigagli authored
-
David Bigagli authored
exit code.
-
- 24 Jul, 2014 1 commit
-
-
Danny Auble authored
information wasn't stored in accounting.
-
- 23 Jul, 2014 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
job/step completion.
-
Danny Auble authored
bit_unfmt. Signed-off-by: Danny Auble <da@schedmd.com>
-
- 22 Jul, 2014 2 commits
-
-
David Bigagli authored
HealthCheckProgram for nodes in any state other than IDLE. #978
-
Morris Jette authored
Unload job tables rather than windows at job end. The table unload also unloads job tables.
-
- 19 Jul, 2014 1 commit
-
-
Morris Jette authored
-
- 18 Jul, 2014 6 commits
-
-
David Bigagli authored
lost should the slurmctld restart.
-
David Bigagli authored
-
Morris Jette authored
Correct NumCPUs count for jobs with --exclusive option. bug 909
-
Morris Jette authored
This probably only happens on native Cray systems due to the deallocation delays related to node health check. In any case, the symptom is an error message of this sort: "job # dealloc of node ... bad node_offset 0 count is 0". It then fails to release the node's GRES back for use by other jobs. bug 973
-
Danny Auble authored
slurm_conf_reinit.
-
Danny Auble authored
counting as multiple nodes.
-
- 17 Jul, 2014 3 commits
-
-
Gennaro Oliva authored
-
Morris Jette authored
-
David Bigagli authored
slurmstepd attempts to create it, for example left over from a previous requeue or crash, delete it and recreate it. #961.
-
- 16 Jul, 2014 2 commits
-
-
David Bigagli authored
-
Morris Jette authored
switch/nrt - Unload job tables (in addition to windows) in user space mode to avoid leaking NRT. bug 964
-
- 15 Jul, 2014 1 commit
-
-
Morris Jette authored
Fix race condition which could result in requeue if batch job exit and node registration occur at the same time.
-