- 06 Dec, 2016 18 commits
-
-
Morris Jette authored
The test is still failing for me (along with a bunch of others), but this change at least gets the test started.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Done just to run "git push" again after an internal GitHub error on the previous push: remote: Resolving deltas: 100% (4/4), completed with 4 local objects. remote: Unexpected system error after push was received. remote: These changes may not be reflected on github.com! remote: Your unique error code: bdecb7b0f321368fe1f037a81a6e9c2c
-
Morris Jette authored
This restores the socket count check at node registration for non-KNL systems (or at least for systems without a NodeFeaturesPlugins type that includes "knl"). This is a refinement of commit 1ce9a7c4.
-
Tim Wickberg authored
-
Tim Wickberg authored
Note that this does not protect against all possible problems here. On Linux at least, the setgroups() call is willing to set any gid_t value except -1 as a group, so calls will not always fail on corrupted group lists. Bug 3320.
-
Tim Wickberg authored
-
Tim Wickberg authored
Remove uncached _get_grouplist() call which was only used here. Bug 3315.
-
Morris Jette authored
-
Morris Jette authored
test12.2 was consistently failing on smd# cluster with tiny differences between the disk read and disk write values. This change permits those tiny discrepancies to exist without failing the test. Here are the numbers, which are consistent:
sacct --noheader -p --job=763.0 --format MaxDiskWrite,AveDiskWrite,MaxDiskRead,AveDiskRead
10.00M|10.00M|10.03M|10.03M|
(i.e. 0.3% discrepancy, up to 0.5% allowed with current code)
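The tolerance described above can be illustrated with a minimal Python sketch. This is not the regression test's own code; the helper names `parse_size` and `within_tolerance` are hypothetical, and the 1024-based reading of the `M` suffix is an assumption (the ratio does not depend on it).

```python
def parse_size(text):
    """Convert a size such as '10.03M' to bytes (1024-based suffixes assumed)."""
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}
    suffix = text[-1].upper()
    if suffix in units:
        return float(text[:-1]) * units[suffix]
    return float(text)


def within_tolerance(written, read, tolerance=0.005):
    """Return True if read differs from written by at most the given fraction."""
    if written == 0:
        return read == 0
    return abs(read - written) / written <= tolerance


# Output line quoted in the commit message above (pipe-separated, trailing pipe).
line = "10.00M|10.00M|10.03M|10.03M|"
max_write, ave_write, max_read, ave_read = [
    parse_size(field) for field in line.split("|") if field
]

print(within_tolerance(max_write, max_read))
```

With the quoted numbers the relative difference is about 0.3%, which falls inside the 0.5% allowance.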
-
Morris Jette authored
Test was failing due to hitting the memory limit (at least on smd1).
-
Morris Jette authored
There were already several configurations that could cause this test to fail. I just added another.
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
Fix parsing in regression test1.92 for some prompts. bug 2792
-
Morris Jette authored
Recognize a KNL's proper NUMA count (rather than setting it to the value in slurm.conf) when using FastSchedule=0. Previous logic would change the NUMA count on the node to match what was in slurm.conf, which would mess up task layout with respect to the sockets. bug 3306
-
- 05 Dec, 2016 10 commits
-
-
Morris Jette authored
HWLOC is required to properly determine topology
-
Tim Wickberg authored
-
Brian Christiansen authored
Continuation of 1f607747
-
Brian Christiansen authored
This allows a job_requeue to respond on a persistent connection if needed.
-
Alejandro Sanchez authored
node state.
-
Danny Auble authored
# Conflicts: # src/slurmctld/node_mgr.c
-
Danny Auble authored
from the slurm.conf when using FastSchedule=0.
-
Morris Jette authored
-
Morris Jette authored
cray/burst_buffer - If the slurmctld daemon restarts with a pending job and a burst buffer having unknown file stage-in status, tear down the buffer, defer the job, and start stage-in over again. bug 3295
-
Morris Jette authored
Add more detail to log message and change from error to debug2 with an explanation of how this happens
-
- 02 Dec, 2016 8 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
bug 3314
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
No change in logic, just line up columns in an array structure
-
Morris Jette authored
Add support for SALLOC_CONSTRAINT, SBATCH_CONSTRAINT and SLURM_CONSTRAINT environment variables to set default constraints for salloc, sbatch and srun commands respectively. Bug 3317
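How such per-command defaults might be resolved can be sketched in Python as follows. This is only an illustration, not Slurm's implementation: the function `effective_constraint` is hypothetical, and the rule that an explicit command-line constraint takes precedence over the environment variable is an assumption here.

```python
import os

# Environment variable named in the commit message for each command.
ENV_VAR_BY_COMMAND = {
    "salloc": "SALLOC_CONSTRAINT",
    "sbatch": "SBATCH_CONSTRAINT",
    "srun": "SLURM_CONSTRAINT",
}


def effective_constraint(command, cli_value=None, environ=None):
    """Pick the constraint for a command (hypothetical helper).

    Assumption: a constraint given on the command line wins; otherwise
    the command's environment variable supplies the default, if set.
    """
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    return environ.get(ENV_VAR_BY_COMMAND[command])


# Example: sbatch picks up its default from SBATCH_CONSTRAINT.
print(effective_constraint("sbatch", None, {"SBATCH_CONSTRAINT": "knl"}))
```

Passing the environment as a plain dictionary keeps the sketch easy to exercise without modifying the real process environment.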
-
- 01 Dec, 2016 4 commits
-
-
Dominik Bartkiewicz authored
-
Dominik Bartkiewicz authored
limits after the node selection to make sure it doesn't violate those limits and, if it does, change the reason for waiting so we don't reserve resources for jobs violating accounting limits. Bug 3029
-
Morris Jette authored
-
Morris Jette authored
-