- 10 Dec, 2018 1 commit
-
-
Morris Jette authored
The CPU frequency set by the user is not exact with current kernels. There seems to be a fair amount of variation depending upon timing and other events. As a result, test1.76 fails sporadically. This changes the logic to retry if the frequency differs by more than 10 percent rather than failing immediately.
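The retry logic described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code of test1.76 itself; the function names, the retry count, and the delay are assumptions, and only the 10 percent threshold comes from the commit message.

```python
import time


def within_tolerance(requested_khz, observed_khz, tolerance=0.10):
    """Return True if the observed frequency is within `tolerance`
    (a fraction of the requested value) of the requested frequency."""
    return abs(observed_khz - requested_khz) <= tolerance * requested_khz


def check_frequency(read_freq, requested_khz, retries=3, delay_s=0.0):
    """Poll `read_freq()` until it is within 10 percent of the request.

    `read_freq` is a hypothetical callable returning the current
    frequency in kHz. `retries` and `delay_s` are assumed values.
    """
    for attempt in range(retries + 1):
        if within_tolerance(requested_khz, read_freq()):
            return True
        if attempt < retries:
            time.sleep(delay_s)  # give the kernel time to settle, then retry
    return False
```

A reading that is off on the first sample but settles on a later one passes, while a reading that stays out of range fails only after all retries are used.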
-
- 08 Dec, 2018 1 commit
-
-
Marshall Garey authored
Bug 6029
-
- 07 Dec, 2018 2 commits
-
-
Nate Rini authored
Only print a warning for 18.08. If a user has SLURM_MEM_PER_CPU or SLURM_MEM_PER_NODE environment variables set for some reason this situation could be happening by accident, and we don't want to prevent the srun command from launching steps at this point. Bug 6058.
-
Broderick Gardner authored
Bug 5648.
-
- 06 Dec, 2018 8 commits
-
-
Janne Blomqvist authored
The Linux kernel default hard limit of 4096 for the number of file descriptors is quite small. Debian/Ubuntu have for a long time overridden this, increasing it to 1M. Recently systemd also bumped the default to 512k.
https://github.com/systemd/systemd/blob/master/NEWS
https://github.com/systemd/systemd/pull/10244
https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/ZN5TK3D6L7SE46KGXICUKLKPX2LQISVX/
https://github.com/systemd/systemd/commit/09dad04c49cae3ad2b319c9b4e7773fedd34309a
Here the limits are increased as follows:
- slurmd: 128k; some workloads like Hadoop/Spark need a lot of fd's, and recommend that the limit is increased to at least 64k.
- slurmctld: 64k; per the Slurm high throughput and big system guides which recommend a file-max of at least 32k.
- slurmdbd: 64k, matching slurmctld; though slurmdbd shouldn't need that many fd's, bumping the limit shouldn't hurt either.
Bug 6171
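To see what limits a process actually runs with, the soft and hard file descriptor limits can be read from inside the process. The sketch below uses Python's standard `resource` module (Unix only) and is not how the commit applies its limits; the helper name `raise_soft_nofile` is hypothetical, and 64k is taken here to mean 65536.

```python
import resource


def raise_soft_nofile(target):
    """Raise the soft RLIMIT_NOFILE toward `target`, capped by the hard limit.

    Returns the (soft, hard) pair in effect afterwards. An unprivileged
    process may raise its soft limit up to the hard limit, but no further.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # cannot exceed the hard limit
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft, hard  # already high enough, leave it alone
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    # 64 * 1024 mirrors the slurmctld value named in the commit message.
    print(raise_soft_nofile(64 * 1024))
```

The helper never lowers a limit, so it only shows how much headroom the hard limit leaves.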
-
Tim Wickberg authored
Bug 5248
-
Mike Nolta authored
Bug 6055
-
Mike Nolta authored
Add the following slurmctld return codes to the Lua plugin: ESLURM_ACCESS_DENIED, ESLURM_ACCOUNTING_POLICY, ESLURM_INVALID_NODE_COUNT, ESLURM_JOB_MISSING_SIZE_SPECIFICATION, ESLURM_MISSING_TIME_LIMIT. Bug 6055
-
Marshall Garey authored
Bug 5836.
-
Marshall Garey authored
-
Nate Rini authored
-
Nate Rini authored
Bug 6162
-
- 05 Dec, 2018 16 commits
-
-
Albert Gil authored
Bug 6163
-
Felip Moll authored
Backups already run it when dropping to backup. Bug 6098.
-
Marshall Garey authored
Remove the README and point to the web page. Add details on the disable_x11 option. Bug 5936.
-
Marshall Garey authored
Also throw an error message within stepd_available() if the nodename is not set or cannot be inferred correctly. Bug 5399.
-
Nate Rini authored
-
Brian Christiansen authored
-
Brian Christiansen authored
Spelling, suggestions, trailing whitespace.
-
Nate Rini authored
Bug 6044
-
Trey Dockendorf authored
Bug 6120
-
Albert Gil authored
-
Tim Wickberg authored
Bug 6155
-
Tim Wickberg authored
Bug 6155
-
Felip Moll authored
When bf_continue is set, and locks are released during a backfill cycle, other operations can make new resources available while part way through the queue. When backfill continues the cycle and evaluates new jobs, it may allocate some of these newly available resources to lower priority jobs, rather than to higher priority jobs that were already considered in this backfill cycle. This patch introduces bf_ignore_newly_avail_nodes to SchedulerParameters to solve this issue. When the backfill cycle resumes, this option ignores nodes that became available while the backfill scheduler had yielded its locks. Bug 5279.
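The effect of the option can be pictured with a small set-based model. This Python sketch is an illustration only and does not reflect the actual backfill implementation; the function name and the representation of nodes as sets of names are assumptions.

```python
def candidate_nodes(idle_now, idle_at_cycle_start, ignore_newly_avail):
    """Return the nodes a resumed backfill cycle may consider.

    idle_now:            nodes idle after the scheduler reacquires its locks
    idle_at_cycle_start: nodes that were idle when the cycle began
    ignore_newly_avail:  models bf_ignore_newly_avail_nodes being set
    """
    if ignore_newly_avail:
        # Drop nodes that only became idle while the locks were released,
        # so lower priority jobs later in the queue cannot take them.
        return set(idle_now) & set(idle_at_cycle_start)
    return set(idle_now)


if __name__ == "__main__":
    start = {"n1", "n2"}
    now = {"n2", "n3"}  # n1 was allocated, n3 was freed during the yield
    print(candidate_nodes(now, start, ignore_newly_avail=True))
    print(candidate_nodes(now, start, ignore_newly_avail=False))
```

In this model the freed node `n3` stays out of the current cycle when the option is set, so it is first offered in the next cycle, which starts again from the highest priority jobs.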
-
Danny Auble authored
-
Danny Auble authored
slurmd yet delivering its TRES list. Bug 6122 Co-authored-by: Marshall Garey <marshall@schedmd.com>
-
Morris Jette authored
-
- 04 Dec, 2018 9 commits
-
-
Nate Rini authored
Bug 6008
-
Morris Jette authored
then an error is generated if more than one of those specifications contains KNL NUMA or MCDRAM modes. Bug 5846
-
Morris Jette authored
Bug 5846
-
Morris Jette authored
are down nodes. Bug 5846
-
Morris Jette authored
NODE_SET_REBOOT to continue. Bug 5846
-
Morris Jette authored
node change when possible. Bug 5846
-
Tim Wickberg authored
Break out a list of Linux distributions as well.
-
Marshall Garey authored
Plugins reading in their own config files rely on the SLURM_CONF environment variable pointing to the appropriate directory; otherwise they will fall back to the built-in sysconfdir path. Set the environment variable early enough so that the -f flag operates correctly, but not before conf->conffile has definitely been set. Remove the setenv call that happens before the first slurmstepd is fork()'d as it is now redundant. Bug 4774.
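The lookup order described above can be sketched as follows. This is a Python illustration rather than Slurm's C code; the helper name is hypothetical, `/etc/slurm` stands in for whatever sysconfdir was chosen at build time, and taking the directory part of SLURM_CONF is an assumption about how the variable is interpreted.

```python
import os
import posixpath

# Assumed stand-in for the sysconfdir compiled into the binaries.
DEFAULT_SYSCONFDIR = "/etc/slurm"


def plugin_config_path(filename, environ=os.environ):
    """Locate a plugin's config file next to the file SLURM_CONF names.

    If SLURM_CONF is unset or empty, fall back to the built-in
    sysconfdir, which is the failure mode the commit message describes.
    """
    slurm_conf = environ.get("SLURM_CONF")
    if slurm_conf:
        base = posixpath.dirname(slurm_conf)
    else:
        base = DEFAULT_SYSCONFDIR
    return posixpath.join(base, filename)


if __name__ == "__main__":
    env = {"SLURM_CONF": "/opt/test/slurm.conf"}
    print(plugin_config_path("cgroup.conf", env))
    print(plugin_config_path("cgroup.conf", {}))
```

If the variable is not exported before plugins load, the second case applies even when the daemon was started with -f pointing elsewhere.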
-
Alejandro Sanchez authored
sbatch sets these, but salloc did not. This should make srun behavior between the two consistent. Bug 3861.
-
- 03 Dec, 2018 2 commits
-
-
Marshall Garey authored
time that wasn't existent instead of just updating lines that have time with a lesser time.
-
Dominik Bartkiewicz authored
Slurm is going to replace internally. Bug 5800
-
- 29 Nov, 2018 1 commit
-
-
Dominik Bartkiewicz authored
Bug 6121
-