- 20 Feb, 2019 5 commits
-
-
Morris Jette authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
If DISPLAY is a local UNIX socket, return 0 for the port number, and an xmalloc()'d string as the target option.
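The rule described above can be sketched as follows. This is a minimal illustration, not Slurm's C implementation: the helper name `parse_display` is hypothetical, and the sketch assumes the usual X11 conventions that display N listens on TCP port 6000 + N and that local sockets live under `/tmp/.X11-unix`. It handles only the two common DISPLAY forms (`:N` and `host:N`).

```python
import re

X11_TCP_PORT_OFFSET = 6000          # X11 convention: display N uses TCP port 6000 + N
X11_SOCKET_DIR = "/tmp/.X11-unix"   # conventional directory for local X11 sockets


def parse_display(display):
    """Return (port, target) for a DISPLAY value.

    Hypothetical helper for illustration only. A port of 0 means the
    target is a UNIX socket path; otherwise the target is a hostname.
    """
    match = re.fullmatch(r"(?P<host>[^:]*):(?P<num>\d+)(?:\.\d+)?", display)
    if match is None:
        raise ValueError(f"unparseable DISPLAY: {display!r}")
    host, num = match.group("host"), int(match.group("num"))
    if host in ("", "unix"):
        # Local display: no TCP port, the target is the socket path.
        return 0, f"{X11_SOCKET_DIR}/X{num}"
    # Remote display: the target is the host, the port follows from the number.
    return X11_TCP_PORT_OFFSET + num, host


print(parse_display(":0"))
print(parse_display("localhost:10.0"))
```

Returning 0 as the port keeps the return type uniform, so the caller can distinguish the two cases with a single integer comparison.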
-
Tim Wickberg authored
Rename x11_target_host to x11_alloc_host to better indicate what the value represents. Going forward, x11_alloc_host and x11_alloc_port are the hostname and TCP port number to connect to in order to establish the tunnel. The x11_target and x11_target_port fields indicate which X11 display to connect to. If x11_target_port is zero, x11_target is a UNIX socket on x11_alloc_host. Otherwise, x11_target is the hostname associated with the TCP port in x11_target_port for the DISPLAY. Make careful changes to older protocol blocks to ensure that 17.11/18.08 slurmd processes can receive sufficient details from a 19.05 slurmctld to set up SSH-based forwarding.
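As a rough illustration of how the four fields fit together, the sketch below models them in a small record. The field names come from the commit message; the `X11Forward` class, the `describe_target` helper, and the sample values are assumptions made for the example and do not reflect Slurm's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class X11Forward:
    # Field names follow the commit message; the class itself is illustrative.
    x11_alloc_host: str    # host to connect to for the tunnel
    x11_alloc_port: int    # TCP port on x11_alloc_host for the tunnel
    x11_target: str        # UNIX socket path, or hostname of the display
    x11_target_port: int   # 0 means x11_target is a UNIX socket


def describe_target(fwd):
    """Say which X11 display the far end of the tunnel should reach."""
    if fwd.x11_target_port == 0:
        # Zero port: the target is a socket path on the allocating host.
        return f"unix:{fwd.x11_target} on {fwd.x11_alloc_host}"
    # Non-zero port: the target is a hostname paired with that TCP port.
    return f"tcp:{fwd.x11_target}:{fwd.x11_target_port}"


print(describe_target(X11Forward("login1", 6020, "/tmp/.X11-unix/X0", 0)))
print(describe_target(X11Forward("login1", 6020, "localhost", 6010)))
```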
-
- 19 Feb, 2019 3 commits
-
-
Tim Wickberg authored
-
Morris Jette authored
These tests previously assumed that one task could be launched per CPU, which is not necessarily the case.
-
Morris Jette authored
If CR_ONE_TASK_PER_CORE is configured, then the core count rather than the CPU count of a node is used to determine if a node can be used by a job. This can result in the rejection of a job that should be able to run. Sample configuration and job below:

    SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK,CR_ONE_TASK_PER_CORE
    NodeName=psg-dgx2-01 NodeAddr=jette NodeHostName=jette RealMemory=1536000 Gres=gpu:16 Sockets=2 CoresPerSocket=24 ThreadsPerCore=2 State=UNKNOWN

    $ srun --gpus-per-task=1 -n1 --cpus-per-gpu=64 -J test39.7 -t1 ./test39.7.input
    srun: error: CPU count per node can not be satisfied
    srun: error: Unable to allocate resources: Requested node configuration is not available

bug 6517
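The arithmetic behind this rejection can be worked through in a few lines. This is only a sketch of the comparison as the message describes it, not the scheduler's code: the function name is hypothetical, and the requested CPU count assumes that one task with one GPU and --cpus-per-gpu=64 needs 64 CPUs.

```python
def node_counts(sockets, cores_per_socket, threads_per_core):
    """Return (cores, cpus) for a node, counting each hardware thread as a CPU."""
    cores = sockets * cores_per_socket
    cpus = cores * threads_per_core
    return cores, cpus


# Values from the sample node definition: Sockets=2 CoresPerSocket=24 ThreadsPerCore=2
cores, cpus = node_counts(2, 24, 2)

# Assumed request size: 1 task * 1 GPU per task * 64 CPUs per GPU
requested_cpus = 1 * 1 * 64

print(cores, cpus)               # 48 96
print(requested_cpus <= cores)   # False: checking against cores rejects the job
print(requested_cpus <= cpus)    # True: checking against CPUs accepts it
```

The node has 96 CPUs but only 48 cores, so a 64-CPU request fits the node yet fails a check made against the core count.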
-
- 16 Feb, 2019 1 commit
-
-
Morris Jette authored
This should never happen, but might be recoverable. Log the event, pack zeros, and continue. bug 6534
-
- 15 Feb, 2019 3 commits
-
-
Morris Jette authored
Need to modify other uses of adjust_cpus_nppcu()
-
Tim Wickberg authored
-
Tim Wickberg authored
-
- 14 Feb, 2019 17 commits
-
-
Morris Jette authored
-
Morris Jette authored
Recognize when a job either did or did not explicitly specify the --cpus-per-task option so that we can allocate more CPUs as appropriate to satisfy --cpus-per-gpu and related options.
-
Morris Jette authored
If a job submission does NOT include the --cpus-per-task option, then report the value as "N/A" rather than always mapping the value to 1.
-
Morris Jette authored
No changes to data structure yet, just adding different un/pack test for v19.05.
-
Alejandro Sanchez authored
No functional change. Bug 6210 and 6262.
-
Alejandro Sanchez authored
No functional change. Bug 6210 and 6262.
-
Alejandro Sanchez authored
Bug 6210 and 6262.
-
Danny Auble authored
# Conflicts: # doc/man/man1/salloc.1 # src/salloc/opt.c
-
Alejandro Sanchez authored
Previously some samples/sizes were reported as total accumulated values instead of deltas or were reported after an underflow occurred. Bug 6210 and 6262.
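The difference between accumulated totals and deltas can be sketched as below. The commit message does not say how the underflow case is handled, so clamping to zero here is an assumption, and both function names are hypothetical.

```python
def sample_delta(previous, current):
    """Return the increase since the previous sample, never negative."""
    if current < previous:
        # The counter went backwards (for example, it was reset). With
        # unsigned arithmetic, current - previous would wrap around to a
        # huge value, so this sketch reports zero instead (an assumption).
        return 0
    return current - previous


def deltas(samples):
    """Convert a series of cumulative readings into per-interval deltas."""
    return [sample_delta(a, b) for a, b in zip(samples, samples[1:])]


print(deltas([100, 150, 150, 40, 90]))  # [50, 0, 0, 50]
```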
-
Nate Rini authored
Bug 6278
-
Danny Auble authored
Bug 6278
-
Danny Auble authored
# Conflicts: # doc/man/man5/slurm.conf.5
-
Danny Auble authored
-
Danny Auble authored
This is helpful when running multiple versions against the same build, e.g. globals.snowflake, globals.knc, ...
-
Nathan Rini authored
Not all Linux systems have a time binary. Use bash instead, since it has time built in and is already required for test units. Bug 6503.
-
Morris Jette authored
Improve description of the Core/CPU index values.
-
Morris Jette authored
If CR_ONE_TASK_PER_CORE is configured, then the core count rather than the CPU count of a node is used to determine if a node can be used by a job. This can result in the rejection of a job that should be able to run. Sample configuration and job below:

    SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK,CR_ONE_TASK_PER_CORE
    NodeName=psg-dgx2-01 NodeAddr=jette NodeHostName=jette RealMemory=1536000 Gres=gpu:16 Sockets=2 CoresPerSocket=24 ThreadsPerCore=2 State=UNKNOWN

    $ srun --gpus-per-task=1 -n1 --cpus-per-gpu=64 -J test39.7 -t1 ./test39.7.input
    srun: error: CPU count per node can not be satisfied
    srun: error: Unable to allocate resources: Requested node configuration is not available
-
- 13 Feb, 2019 10 commits
-
-
Morris Jette authored
Without this patch, test39.7 would cause _gen_combs() in src/plugins/select/cons_tres/dist_tasks.c to abort due to a NULL board_combs argument, which was due to ncomb_brd being zero. This problem was due to some other issue in cons_tres currently under investigation, but this at least prevents the abort.

Relevant configuration information from slurm.conf:

    SelectType=select/cons_tres
    SelectTypeParameters=CR_Core_Memory,CR_CORE_DEFAULT_DIST_BLOCK,CR_ONE_TASK_PER_CORE
    GresTypes=gpu
    NodeName=psg-dgx2-01 NodeAddr=jette NodeHostName=jette RealMemory=1536000 Gres=gpu:16 Sockets=2 CoresPerSocket=24 ThreadsPerCore=2 State=UNKNOWN
    PartitionName=debug Nodes=ALL Default=YES MaxTime=INFINITE State=UP

gres.conf (CPUs parameters are recognized as bad here):

    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty0 CPUs=0-23,48-71
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty1 CPUs=0-23,48-71
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty2 CPUs=0-23,48-71
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty3 CPUs=0-23,48-71
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty4 CPUs=0-23,48-71
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty5 CPUs=0-23,48-71
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty6 CPUs=0-23,48-71
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty7 CPUs=0-23,48-71
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty8 CPUs=24-47,72-95
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty9 CPUs=24-47,72-95
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty10 CPUs=24-47,72-95
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty11 CPUs=24-47,72-95
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty12 CPUs=24-47,72-95
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty13 CPUs=24-47,72-95
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty14 CPUs=24-47,72-95
    NodeName=psg-dgx2-01 Name=gpu File=/dev/tty15 CPUs=24-47,72-95
-
Morris Jette authored
Correct format of some comments. Combine text of log message onto one line so it can be searched for.
-
Jason Booth authored
Continuation of 37951110 Bug 6496
-
Nathan Rini authored
Bug 6488.
-
Michael Hinton authored
Bug 6479
-
Michael Hinton authored
Bug 6479
-
Ben Roberts authored
Updated accounting.shtml, sched_config.shtml and topology.shtml, fixing typos found in those files. Bug 6482
-
Alejandro Sanchez authored
Bug 6485.
-
Felip Moll authored
-
Morris Jette authored
Previous logic would sort by name using xstrcmp(). The new logic extracts the numeric suffix and sorts based upon that number. The difference is that the old algorithm would put "/dev/nvidia10" before "/dev/nvidia2". The new logic would put "/dev/nvidia10" after "/dev/nvidia2" and "/dev/nvidia9".
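The two orderings can be compared with a short sketch. It is written in Python rather than the C of the actual change, and `numeric_suffix` is a hypothetical helper; plain string sorting stands in for the xstrcmp()-based ordering.

```python
import re


def numeric_suffix(path):
    """Return the trailing number of a device path, or -1 if there is none."""
    match = re.search(r"(\d+)$", path)
    return int(match.group(1)) if match else -1


devices = ["/dev/nvidia10", "/dev/nvidia2", "/dev/nvidia9"]

# Old behavior: character-by-character comparison puts "10" before "2".
print(sorted(devices))
# ['/dev/nvidia10', '/dev/nvidia2', '/dev/nvidia9']

# New behavior: compare the numeric suffix as an integer.
print(sorted(devices, key=numeric_suffix))
# ['/dev/nvidia2', '/dev/nvidia9', '/dev/nvidia10']
```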
-
- 12 Feb, 2019 1 commit
-
-
Tim Wickberg authored
-