- 10 Dec, 2018 2 commits
-
-
Morris Jette authored
-
Michael Hinton authored
Add step_unconfigure_hardware() to GRES plugin API. Update test39.18 regarding links. Update GRES docs. Update docs related to links. Document GPU frequency resetting behavior. Specify what the default is for GpuFreqDef. Move NVML init and shutdown to configure() and unconfigure(). Get rid of superfluous `!= 0`-style statements. Print note when GPU index != minor number. Clean up various formatting and other errors. Bug 5520.
-
- 09 Dec, 2018 8 commits
-
-
Tim Wickberg authored
No functional change.
-
Tim Wickberg authored
-
Tim Wickberg authored
Due to upcoming changes in the X11 forwarding subsystem, support for older-style X11 tunnels will be removed. Older client commands cannot support the newer style. Rather than have the tunnel fail, reject the job allocation request up front. Bug 3647.
-
Tim Wickberg authored
Also tweak the one info() message here to match these others.
-
Tim Wickberg authored
New X11 forwarding code will only support forwarding back to salloc or an allocating srun command. Using this option within sbatch was always hit-or-miss. If the submitting user was disconnected from the alloc host for any reason, their xauth credentials would likely fail even if they managed to get assigned the same local TCP port for forwarding. Bug 3647.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
- 08 Dec, 2018 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
- 07 Dec, 2018 17 commits
-
-
Danny Auble authored
< PAM_MAX_MSG_SIZE (which as of this date is 512)
-
Danny Auble authored
-
Matthias Gerstner authored
In some systems there can be multiple user accounts for uid 0, therefore the check for literal user name "root" might be insufficient. Bug 6184
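The idea of checking the uid instead of the literal name can be sketched as follows. This is a minimal illustration, not the code from the commit: `user_is_superuser()` is a hypothetical helper name, and the fixed 4096-byte lookup buffer is an assumption that may be too small on some systems (in which case the lookup is reported as a failure).

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Return 1 if the named account maps to uid 0, 0 if it maps to another
 * uid, and -1 if the name cannot be resolved.
 * user_is_superuser() is a hypothetical helper, not a Slurm function.
 */
int user_is_superuser(const char *user_name)
{
	struct passwd pwd, *result = NULL;
	char buf[4096];	/* assumed large enough for one passwd entry */

	if (user_name == NULL)
		return -1;
	if (getpwnam_r(user_name, &pwd, buf, sizeof(buf), &result) != 0 ||
	    result == NULL)
		return -1;
	/* Compare the numeric uid, so any alias of uid 0 is caught. */
	return (result->pw_uid == 0) ? 1 : 0;
}
```

With this approach a second account that shares uid 0 with "root" is treated the same as "root" itself, which a plain string comparison on the name would miss.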
-
Matthias Gerstner authored
Using memcpy, some amount of undefined data from the stack will be copied into the target buffer. While pam_conv probably doesn't evaluate the extra data, it is still unclean to do that. It could lead to an information leak at some point.
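One way to avoid copying bytes beyond the end of the source string is sketched below. It is an illustration under stated assumptions rather than the change made in the commit: `copy_message()` is a hypothetical helper, and zero-filling the destination first is a design choice of this sketch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy a NUL-terminated string into a fixed-size buffer without reading
 * past the end of the source. The destination is zeroed first, so bytes
 * after the terminator never carry leftover stack contents.
 * Returns the number of characters copied (excluding the NUL).
 * copy_message() is a hypothetical helper, not code from the commit.
 */
size_t copy_message(char *dst, size_t dst_size, const char *src)
{
	size_t len;

	if (dst == NULL || dst_size == 0)
		return 0;
	memset(dst, 0, dst_size);
	if (src == NULL)
		return 0;
	len = strlen(src);
	if (len >= dst_size)
		len = dst_size - 1;	/* truncate, keep room for the NUL */
	memcpy(dst, src, len);		/* copies only bytes owned by src */
	return len;
}
```

The key difference from `memcpy(dst, src, sizeof(dst))` is that the copy length is derived from the source string, not from the size of the destination.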
-
Matthias Gerstner authored
This pam module is tailored towards running in the context of remote ssh logins. When running in a different context, like a local sudo call, the module could be influenced by e.g. passing environment variables like SLURM_CONF. By limiting the module to only perform its actions when running in the sshd context by default, this situation can be avoided. An additional pam module argument service=<service> allows an administrator to control this behavior, if different behavior is explicitly desired. Bug 6184
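The service restriction described above can be sketched as a small decision function. This is a hedged illustration: `service_allowed()` and `DEFAULT_SERVICE` are hypothetical names, and the sketch leaves out the PAM plumbing. In a real module the calling service name would typically come from `pam_get_item()` with the `PAM_SERVICE` item type.

```c
#include <stddef.h>
#include <string.h>

/* The commit text says the module acts only in the sshd context by default. */
#define DEFAULT_SERVICE "sshd"

/*
 * Decide whether the module should act for the calling PAM service.
 * "configured" models the value of a service=<service> module argument
 * and is NULL when the argument is absent.
 * service_allowed() is a hypothetical helper, not code from the commit.
 */
int service_allowed(const char *calling, const char *configured)
{
	const char *expected = configured ? configured : DEFAULT_SERVICE;

	if (calling == NULL)
		return 0;	/* unknown context: do nothing */
	return strcmp(calling, expected) == 0;
}
```

For example, `service_allowed("sudo", NULL)` returns 0, so the module would skip its actions for a local sudo call unless an administrator passes `service=sudo`.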
-
Morris Jette authored
-
Morris Jette authored
Modify the functions in gres/mps to match changes implemented in commit 5d40d5dc863446ff6
-
Michael Hinton authored
Add step_configure_hardware() to GRES plugin API. Call step_configure_hardware() from stepd and make sure it's root. Sort frequencies returned by NVML. Update GRES-related docs. Document step_configure_hardware() and get_devices(). Test the bubble sort algorithm. Update test 39.9 to work with new code. Remove vestigial references to volts in the docs.
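For readers unfamiliar with the sorting step mentioned above, a bubble sort over a list of frequencies might look like the sketch below. The descending order and the name `sort_freqs_descending()` are assumptions of this example; the commit message only says that the frequencies returned by NVML are sorted.

```c
#include <stddef.h>

/*
 * Bubble sort an array of frequencies in place, highest first.
 * The sort direction is an assumption of this sketch.
 * sort_freqs_descending() is a hypothetical helper name.
 */
void sort_freqs_descending(unsigned int *freqs, size_t count)
{
	size_t i, j;

	if (freqs == NULL)
		return;
	for (i = 0; i + 1 < count; i++) {
		int swapped = 0;

		/* After pass i, the last i elements are in final position. */
		for (j = 0; j + 1 < count - i; j++) {
			if (freqs[j] < freqs[j + 1]) {
				unsigned int tmp = freqs[j];

				freqs[j] = freqs[j + 1];
				freqs[j + 1] = tmp;
				swapped = 1;
			}
		}
		if (!swapped)
			break;	/* no swaps in this pass: already sorted */
	}
}
```

Bubble sort is quadratic in the worst case, which is usually acceptable for the short frequency lists a single device reports.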
-
Tim Wickberg authored
-
Nate Rini authored
Only print a warning for 18.08. If a user has SLURM_MEM_PER_CPU or SLURM_MEM_PER_NODE environment variables set for some reason this situation could be happening by accident, and we don't want to prevent the srun command from launching steps at this point. Bug 6058.
-
Broderick Gardner authored
Bug 5648.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
Remove restriction on port count previously added in 620f92f7.
-
Morris Jette authored
gres_plugin_job_state_validate() was modified in commit a44297e5 to add the tres_freq field. This changes the test to match.
-
Morris Jette authored
-
Morris Jette authored
Disable request for both gres/mps and gres/gpu in a single request. Also disable request for gres/mps and --gpu-freq, disable mps-per-job with more than 1 node, disable mps-per-socket with more than 1 socket, and disable mps-per-task with more than 1 node. Note these mps-per-* options are not currently available, but data structures do exist to implement them.
-
- 06 Dec, 2018 11 commits
-
-
Janne Blomqvist authored
The Linux kernel default hard limit of 4096 for the number of file descriptors is quite small. Debian/Ubuntu have for a long time overridden this, increasing it to 1M. Recently systemd also bumped the default to 512k.
  https://github.com/systemd/systemd/blob/master/NEWS
  https://github.com/systemd/systemd/pull/10244
  https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/ZN5TK3D6L7SE46KGXICUKLKPX2LQISVX/
  https://github.com/systemd/systemd/commit/09dad04c49cae3ad2b319c9b4e7773fedd34309a
Here the limits are increased as follows:
  - slurmd: 128k; some workloads like Hadoop/Spark need a lot of fd's, and recommend that the limit is increased to at least 64k.
  - slurmctld: 64k; per the Slurm high throughput and big system guides, which recommend a file-max of at least 32k.
  - slurmdbd: 64k, matching slurmctld; though slurmdbd shouldn't need that many fd's, bumping the limit shouldn't hurt either.
Bug 6171
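The commit message does not say where these limits are applied. As a general illustration of the underlying mechanism, a process can inspect and raise its own soft file descriptor limit with `getrlimit()` and `setrlimit()`. The helper name `raise_nofile_limit()` is hypothetical and the sketch is not taken from the Slurm source.

```c
#include <sys/resource.h>

/*
 * Raise the soft RLIMIT_NOFILE toward "wanted", capped at the hard limit.
 * Returns the resulting soft limit, or 0 on error.
 * raise_nofile_limit() is a hypothetical helper for illustration only.
 */
rlim_t raise_nofile_limit(rlim_t wanted)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return 0;
	/* An unprivileged process cannot exceed the hard limit. */
	if (rl.rlim_max != RLIM_INFINITY && wanted > rl.rlim_max)
		wanted = rl.rlim_max;
	if (wanted > rl.rlim_cur) {
		rl.rlim_cur = wanted;
		if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
			return 0;
	}
	return rl.rlim_cur;
}
```

Calling `raise_nofile_limit(131072)` asks for the 128k figure quoted for slurmd; if the hard limit is lower, the soft limit is raised only as far as the hard limit allows, which is why the hard limit itself has to be increased by whatever starts the daemon.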
-
Chris Rorvick authored
This makes it easier to pass string literals to slurm_perror(3). For example, a recent GCC emits the following when a string literal is passed in C++: "ISO C++ forbids converting a string constant to 'char*'". Bug 6082
-
Tim Wickberg authored
Bug 5248
-
Mike Nolta authored
Bug 6055
-
Mike Nolta authored
Add the following slurmctld return codes to the lua plugin: ESLURM_ACCESS_DENIED, ESLURM_ACCOUNTING_POLICY, ESLURM_INVALID_NODE_COUNT, ESLURM_JOB_MISSING_SIZE_SPECIFICATION, ESLURM_MISSING_TIME_LIMIT. Bug 6055
-
Tim Wickberg authored
Rework one timer error message while here. Bug 5861.
-
Danny Auble authored
Bug 6013
-
Morris Jette authored
The GRES node information in slurmctld is now as we want it to be. No job/step scheduling logic implemented yet.
-
Morris Jette authored
Previous logic was printing a uint64_t as %lld and the test was looking for a negative number. This corrects the test.
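The format mismatch described above can be illustrated with the standard `PRIu64` macro from `<inttypes.h>`. This is a minimal sketch, not the code from the commit, and `format_u64()` is a hypothetical helper name.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format an unsigned 64-bit value with the matching PRIu64 conversion.
 * With %lld the same bits are typically interpreted as signed, so values
 * at or above 2^63 show up as negative numbers.
 * format_u64() is a hypothetical helper used only for illustration.
 */
int format_u64(char *buf, size_t buf_size, uint64_t value)
{
	return snprintf(buf, buf_size, "%" PRIu64, value);
}
```

Any test that parses such output then has to expect a large positive number instead of a negative one.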
-
Morris Jette authored
Test was sometimes failing due to a delay in getting job accounting data to the database.
-
Tim Wickberg authored
-