- 07 Dec, 2016 2 commits
-
-
Morris Jette authored
-
Danny Auble authored
-
- 06 Dec, 2016 22 commits
-
-
Morris Jette authored
-
Morris Jette authored
The test is still failing for me (along with a bunch of others), but this change at least gets the test started.
-
Morris Jette authored
-
Danny Auble authored
a slurmctld restart or reconfig, as they aren't really error messages. Bug 3258
-
Danny Auble authored
Bug 3258
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
Done just to run "git push" again after internal github error on previous push: remote: Resolving deltas: 100% (4/4), completed with 4 local objects. remote: Unexpected system error after push was received. remote: These changes may not be reflected on github.com! remote: Your unique error code: bdecb7b0f321368fe1f037a81a6e9c2c
-
Morris Jette authored
This restores the socket count check at node registration for non-KNL systems (or at least for systems whose NodeFeaturesPlugins type does not include "knl"). This is a refinement of commit 1ce9a7c4.
-
Tim Wickberg authored
-
Tim Wickberg authored
Note that this does not protect against all possible problems here. On Linux at least, setgroups() is willing to set any gid_t value except -1, so calls will not always fail on corrupted group lists. Bug 3320.
-
Tim Wickberg authored
-
Tim Wickberg authored
Remove uncached _get_grouplist() call which was only used here. Bug 3315.
-
Morris Jette authored
-
Morris Jette authored
test12.2 was consistently failing on smd# cluster with tiny differences in the amounts of disk data read and written. This change permits those tiny discrepancies to exist without failing the test. Here are the numbers, which are consistent:
sacct --noheader -p --job=763.0 --format MaxDiskWrite,AveDiskWrite,MaxDiskRead,AveDiskRead
10.00M|10.00M|10.03M|10.03M|
(i.e. 0.3% discrepancy, up to 0.5% allowed with current code)
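The tolerance described in this commit message can be sketched as a relative-error check. This is only an illustration in Python, not the actual test12.2 code (the Slurm regression tests are Expect scripts). The function names `parse_sacct_sizes` and `within_tolerance` are hypothetical, the 10.00M expected value is taken from the numbers quoted above, and the parser assumes every field carries an "M" suffix as in that output.

```python
from typing import List


def parse_sacct_sizes(line: str) -> List[float]:
    """Parse pipe-delimited sacct output such as
    '10.00M|10.00M|10.03M|10.03M|' into megabyte values.

    Assumption: every field ends in 'M', as in the quoted output.
    """
    return [float(field.rstrip("M")) for field in line.strip().split("|") if field]


def within_tolerance(expected_mb: float, reported_mb: float,
                     tolerance: float = 0.005) -> bool:
    """Return True if reported_mb differs from expected_mb by no more
    than the given fraction (0.005 corresponds to 0.5%)."""
    return abs(reported_mb - expected_mb) <= tolerance * expected_mb


if __name__ == "__main__":
    # Values quoted in the commit message above.
    sizes = parse_sacct_sizes("10.00M|10.00M|10.03M|10.03M|")
    for value in sizes:
        print(value, within_tolerance(10.00, value))
```

With the quoted numbers, the 10.03M reads differ from 10.00M by 0.3%, so they fall inside the 0.5% allowance, while a hypothetical 10.06M reading would not.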
-
Morris Jette authored
Test was failing due to hitting the memory limit (at least on smd1).
-
Morris Jette authored
There were already several configurations that could cause this test to fail. I just added another.
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
Fix parsing in regression test1.92 for some prompts. bug 2792
-
Morris Jette authored
Recognize a KNL's proper NUMA count (rather than setting it to the value in slurm.conf) when using FastSchedule=0. Previous logic would change the NUMA count on the node to match what was in slurm.conf, which would mess up task layout with respect to the sockets. bug 3306
-
- 05 Dec, 2016 10 commits
-
-
Morris Jette authored
HWLOC is required to properly determine topology
-
Tim Wickberg authored
-
Brian Christiansen authored
Continuation of 1f607747
-
Brian Christiansen authored
This allows a job_requeue to respond on a persistent connection if needed.
-
Alejandro Sanchez authored
node state.
-
Danny Auble authored
# Conflicts: # src/slurmctld/node_mgr.c
-
Danny Auble authored
from the slurm.conf when using FastSchedule=0.
-
Morris Jette authored
-
Morris Jette authored
cray/burst_buffer - If slurmctld daemon restarts with pending job and burst buffer having unknown file stage-in status, teardown the buffer, defer the job, and start stage-in over again. bug 3295
-
Morris Jette authored
Add more detail to the log message and change it from error to debug2, with an explanation of how this happens.
-
- 02 Dec, 2016 6 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
bug 3314
-
Danny Auble authored
-
Danny Auble authored
-