- 04 Feb, 2012 2 commits
-
-
Morris Jette authored
Fix for srun running within an existing allocation with the --exclude option and a --nnodes count small enough to remove more nodes.
> salloc -N 8
salloc: Granted job allocation 1000008
> srun -N 2 -n 2 --exclude=tux3 hostname
srun: error: Unable to create job step: Requested node configuration is not available
Patch from Phil Eckert, LLNL.
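The behavior this fix is meant to restore can be illustrated with a minimal Python sketch. It is not the SLURM implementation (which is in C). The function name pick_step_nodes and the host names tux0 through tux7 are assumptions made for illustration, since the log does not list the allocated nodes. The idea is only that excluded nodes are dropped first, and the step request succeeds as long as enough nodes remain.

```python
def pick_step_nodes(allocated, excluded, node_count):
    """Choose node_count nodes from an allocation, skipping excluded ones.

    Hypothetical helper: it drops the excluded nodes and takes the first
    node_count of what is left, failing only if too few nodes remain.
    """
    excluded_set = set(excluded)
    usable = [node for node in allocated if node not in excluded_set]
    if node_count > len(usable):
        raise ValueError("Requested node configuration is not available")
    return usable[:node_count]


# Assumed 8-node allocation; the real host names are not given in the log.
allocation = [f"tux{i}" for i in range(8)]
step_nodes = pick_step_nodes(allocation, ["tux3"], 2)
print(step_nodes)
```

With 8 allocated nodes and one excluded, 7 remain, so a 2-node step request should be satisfiable rather than rejected.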
-
Morris Jette authored
Add call to mpi_hook_slurmstepd_prefork() from slurmstepd immediately prior to fork/exec of user tasks. Patch from Hongjia Cao, NUDT.
-
- 03 Feb, 2012 10 commits
-
-
Morris Jette authored
Add a generic data forwarding protocol to slurmd which uses its existing hierarchical communications protocol. Patch from Hongjia Cao, NUDT.
-
Morris Jette authored
Patch from Hongjia Cao, NUDT.
-
Morris Jette authored
Patch from Hongjia Cao, NUDT.
-
Morris Jette authored
Make step launch API more robust if there are NULL pointers. Patch from Hongjia Cao, NUDT.
-
Morris Jette authored
Avoid failure when common I/O function is passed a NULL pointer. Patch from Hongjia Cao, NUDT.
-
Morris Jette authored
Modify some I/O handler functions to return when passed a NULL I/O pointer rather than aborting. Patch from Hongjia Cao, NUDT.
-
Morris Jette authored
In srun --multi-prog mode, treat file name that begins with "." as an absolute pathname with no need to use PATH. Patch by Hongjia Cao, NUDT.
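A rough Python sketch of the lookup rule described in this commit, under stated assumptions: the function name resolve_program and its parameters are hypothetical, and the real logic lives in srun's C code. A name starting with "/" or "." is used as a path directly (relative to the working directory when it starts with "."), and only other names are searched for in the PATH directories.

```python
import os


def resolve_program(name, path_dirs, cwd):
    """Return the file to run for a multi-prog entry, or None if not found.

    Hypothetical helper: names beginning with "/" or "." bypass the PATH
    search; every other name is looked up in path_dirs in order.
    """
    if name.startswith("/") or name.startswith("."):
        candidate = os.path.normpath(os.path.join(cwd, name))
        return candidate if os.path.isfile(candidate) else None
    for directory in path_dirs:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

For example, "./my_task" would resolve against the working directory even when PATH is empty, while "my_task" would be found only if one of the PATH directories contains it.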
-
Morris Jette authored
-
Morris Jette authored
Add simplified version of configurator too. Work by Nathan Yee, SchedMD
-
Morris Jette authored
-
- 02 Feb, 2012 9 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
Fix bug in step task distribution when nodes are not configured in numeric order. Patch from Hongjia Cao, NUDT.
-
Morris Jette authored
-
Danny Auble authored
-
Morris Jette authored
Add logic to cache GPU file information (bitmap index mapping to device file number) in the slurmd daemon and transfer that information to the slurmstepd whenever a job step is initiated. This is needed to set the appropriate CUDA_VISIBLE_DEVICES environment variable value when the devices are not in strict numeric order (e.g. some GPUs are skipped). Based upon work by Nicolas Bigaouette.
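The mapping described above can be sketched in Python as follows. This is an illustration only, not the slurmd/slurmstepd code: the function names, the /dev/nvidia device file names, and the assumption that the device number is the trailing digits of the file name are all hypothetical choices for the example.

```python
import re


def build_index_map(device_files):
    """Map bitmap index -> device file number, in configured order.

    Assumes (for illustration) that each device file name ends in its
    device number, e.g. /dev/nvidia2 -> 2.
    """
    index_to_device = []
    for path in device_files:
        match = re.search(r"(\d+)$", path)
        if match is None:
            raise ValueError(f"no trailing device number in {path!r}")
        index_to_device.append(int(match.group(1)))
    return index_to_device


def cuda_visible_devices(index_to_device, allocated_indexes):
    """Build the CUDA_VISIBLE_DEVICES value from allocated bitmap indexes."""
    return ",".join(str(index_to_device[i]) for i in allocated_indexes)


# Hypothetical node on which /dev/nvidia1 is skipped.
gpu_map = build_index_map(["/dev/nvidia0", "/dev/nvidia2", "/dev/nvidia3"])
visible = cuda_visible_devices(gpu_map, [1, 2])
print(visible)  # prints 2,3
```

Without the cached mapping, bitmap indexes 1 and 2 would be reported as devices 1 and 2, which is wrong on a node where a device number is skipped.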
-
Morris Jette authored
-
Morris Jette authored
Based upon work by Nicolas Bigaouette
-
Morris Jette authored
Based upon work by Nicolas Bigaouette
-
- 01 Feb, 2012 10 commits
-
-
Morris Jette authored
-
Morris Jette authored
Conflicts: src/slurmctld/node_scheduler.c
-
Morris Jette authored
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
Fix bug when a requeued batch job is scheduled to run on a different node zero, but attempts job launch on the old node zero, causing fatal error "Invalid host_index -1 for job #"
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Avoid slurmctld abort due to bad pointer when setting an advanced reservation MAINT flag if it contains no nodes (only licenses).
-
- 31 Jan, 2012 9 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
blocks are in an error state.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-