- 09 Oct, 2015 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
If a job allocation returns some invalid contents, the pointer to the job structure may be NULL. This change preserves the error message and avoids a segv.
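The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: `struct job_record` is reduced to one field, and `format_alloc_error` is a hypothetical helper name.

```c
#include <stdio.h>

/* Hypothetical, stripped-down stand-in for the real job structure. */
struct job_record {
	unsigned int job_id;
};

/*
 * Format an allocation error without dereferencing a NULL job pointer.
 * The error text is kept in both cases; only the job id is omitted
 * when no job structure is available.
 */
int format_alloc_error(const struct job_record *job_ptr, const char *err_msg,
		       char *buf, size_t len)
{
	if (job_ptr)
		return snprintf(buf, len, "job %u: %s", job_ptr->job_id, err_msg);
	return snprintf(buf, len, "job (unknown): %s", err_msg);
}
```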
-
- 08 Oct, 2015 2 commits
-
-
Brian Christiansen authored
Fix case where the backup slurmdbd would be killed if it had existing connections when it gave up control. When giving up control with existing connections, the backup tried to signal the connection threads by using pthread_kill to send SIGKILL to them. The problem is that SIGKILL is not delivered to a single thread but acts on the whole process, so the backup dbd itself was killed.
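SIGKILL cannot be caught, and its action always applies to the whole process, even when it is sent with pthread_kill. One common alternative, sketched below, is to send a catchable signal with an empty handler so that only the target thread's blocking call is interrupted. This is an illustration under assumptions, not necessarily what the commit does: the choice of SIGUSR1 and the names `connection_thread` and `stop_connection_thread` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static atomic_int worker_done;

/* Empty handler: delivery alone makes a blocking read() fail with EINTR. */
static void wakeup_handler(int signo)
{
	(void) signo;
}

/* Hypothetical stand-in for a connection thread blocked on I/O. */
static void *connection_thread(void *arg)
{
	int fd = *(int *) arg;
	char byte;
	ssize_t n = read(fd, &byte, 1);
	intptr_t interrupted = (n < 0 && errno == EINTR);

	atomic_store(&worker_done, 1);
	return (void *) interrupted;
}

/* Returns 1 if the worker was interrupted and the process survived. */
int stop_connection_thread(void)
{
	struct sigaction sa;
	struct timespec delay = { 0, 1000000 };	/* 1 ms */
	pthread_t tid;
	void *result = NULL;
	int fds[2];

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = wakeup_handler;	/* no SA_RESTART, so read() returns */
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL) || pipe(fds))
		return -1;
	atomic_store(&worker_done, 0);
	if (pthread_create(&tid, NULL, connection_thread, &fds[0]))
		return -1;

	/* Resend until the worker reports back, in case a signal
	 * arrives before the worker has entered read(). */
	while (!atomic_load(&worker_done)) {
		pthread_kill(tid, SIGUSR1);
		nanosleep(&delay, NULL);
	}
	pthread_join(tid, &result);
	close(fds[0]);
	close(fds[1]);
	return (int) (intptr_t) result;
}
```

The signal is directed at one thread, the handler runs in that thread only, and the caller is still alive afterwards to join it.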
-
Danny Auble authored
when a cold-start (-c) happens to the slurmctld.
-
- 07 Oct, 2015 14 commits
-
-
Danny Auble authored
Conflicts: src/sacct/options.c
-
Danny Auble authored
-
Danny Auble authored
from a user. This would cause the slurmctld to cache the old default which wasn't valid and cause the user to have to request the association always.
-
Danny Auble authored
Conflicts: NEWS src/plugins/accounting_storage/mysql/as_mysql_job.c
-
Morris Jette authored
bug 2009
-
Morris Jette authored
A node could have fewer tasks allocated than the plane size, which broke the test. The plane size needs to be treated as a maximum consecutive rank value.
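Treating the plane size as a maximum rather than an exact block length could be checked along these lines. This is a sketch only: the test suite is not written in C, and `plane_layout_ok` and the `node_of_rank` array (node index per task rank) are hypothetical names used for illustration.

```c
#include <stddef.h>

/*
 * Return 1 if no node holds more than plane_size consecutive task
 * ranks, 0 otherwise. Runs shorter than plane_size are accepted,
 * since a node may be allocated fewer tasks than the plane size.
 */
int plane_layout_ok(const int *node_of_rank, size_t ntasks, int plane_size)
{
	size_t run = 0;

	if (plane_size <= 0)
		return 0;
	for (size_t i = 0; i < ntasks; i++) {
		if (i > 0 && node_of_rank[i] == node_of_rank[i - 1])
			run++;		/* same node as the previous rank */
		else
			run = 1;	/* a new run starts on this node */
		if (run > (size_t) plane_size)
			return 0;
	}
	return 1;
}
```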
-
Thomas Cadeau authored
-
Morris Jette authored
-
Morris Jette authored
bug 2013
-
David Bigagli authored
-
Hongjia Cao authored
-
David Bigagli authored
-
Hongjia Cao authored
-
Danny Auble authored
database but the start record hadn't made it yet.
-
- 06 Oct, 2015 13 commits
-
-
Morris Jette authored
Create a "task" cgroup at job allocation time via the prolog container. A dummy "sleep" process will occupy the cgroup so long as the job exists. bug 1994
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
requirements.
-
Axel Auweter authored
Add acct_gather_energy/ibmaem plugin for systems with IBM Systems Director Active Energy Manager.
-
Morris Jette authored
Conflicts: src/slurmctld/job_mgr.c
-
Thomas Cadeau authored
bug 2011
-
jette authored
It would not cause any problem other than excess memory being allocated, but was found by CLANG.
-
Danny Auble authored
','.
-
Morris Jette authored
Conflicts: src/common/proc_args.c
-
Morris Jette authored
bug 1999
-
- 05 Oct, 2015 4 commits
-
-
Morris Jette authored
A configuration of "DefMemPerNode=UNLIMITED" prevented more than one job from running at a time on a given node, which broke some tests. These changes prevent the tests from breaking.
-
david authored
-
jette authored
-
jette authored
-
- 03 Oct, 2015 2 commits
-
-
Morris Jette authored
Conflicts: NEWS
-
Morris Jette authored
Don't requeue RPC going out from slurmctld to DOWN nodes (can generate repeating communication errors). bug 2002
-
- 02 Oct, 2015 3 commits
-
-
Morris Jette authored
-
Morris Jette authored
This will only happen if a PING RPC for the node is already queued when the decision is made to power it down, and the ping then fails to get a response (since the node is already down). bug 1995
-
Morris Jette authored
If a job's CPUs/task ratio is increased due to configured MaxMemPerCPU, then increase its allocated CPU count in order to enforce CPU limits. Previous logic would increase/set the cpus_per_task as needed if a job's --mem-per-cpu was above the configured MaxMemPerCPU, but NOT increase the min_cpus or max_cpus variables. This resulted in allocating the wrong CPU count.
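The adjustment described above might look roughly like this. It is a simplified sketch, not the slurmctld code: `struct job_req`, `apply_max_mem_per_cpu`, and the exact rounding are assumptions, and only the field names cpus_per_task, min_cpus, and max_cpus come from the commit message.

```c
#include <stdint.h>

/* Hypothetical, reduced view of a job request. */
struct job_req {
	uint32_t cpus_per_task;
	uint32_t min_cpus;
	uint32_t max_cpus;	/* 0 means no upper limit in this sketch */
	uint64_t mem_per_cpu;	/* MB */
};

/*
 * If the requested memory per CPU exceeds the configured maximum,
 * scale cpus_per_task up so that each CPU stays within the limit,
 * and scale min_cpus/max_cpus by the same factor so the allocated
 * CPU count matches the new ratio. Overflow is not handled here.
 */
void apply_max_mem_per_cpu(struct job_req *job, uint64_t max_mem_per_cpu)
{
	uint64_t factor;

	if (!max_mem_per_cpu || job->mem_per_cpu <= max_mem_per_cpu)
		return;

	/* Round up: number of CPUs needed to cover one CPU's request. */
	factor = (job->mem_per_cpu + max_mem_per_cpu - 1) / max_mem_per_cpu;

	job->cpus_per_task = (uint32_t) (job->cpus_per_task * factor);
	job->min_cpus = (uint32_t) (job->min_cpus * factor);
	if (job->max_cpus)
		job->max_cpus = (uint32_t) (job->max_cpus * factor);

	/* Spread the original request over the extra CPUs, rounding up. */
	job->mem_per_cpu = (job->mem_per_cpu + factor - 1) / factor;
}
```

For example, a request of 4 CPUs at 3000 MB per CPU under a 1000 MB limit becomes 3 CPUs per task and 12 CPUs minimum in this sketch, instead of 3 CPUs per task with the CPU count left at 4.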
-