Merge second set of srun patches from Mark Grondona, plus fixes for several bugs
found while testing the new code:
- Avoid draining a node if the stdout/stderr file can't be created.
- Correct the logic that sets a job's SLURM_CPUS_ON_NODE env var when there is more than one thread per core.
- Fix uneven task distribution on heterogeneous clusters. For example, with 4 CPUs on the first node, 2 CPUs on the second, and task_count=2, cpus_per_task works out to 3; both tasks were previously allocated to the first node, and now one task is allocated per node.
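
The heterogeneous-cluster example above can be illustrated with a small sketch. This is not the code from the patch: the function names (cpus_per_task, block_fill, cyclic_fill) are hypothetical, block_fill is only one plausible way the old result could arise (rounding each node's capacity up), and cyclic_fill is a simple round-robin stand-in for the corrected behavior.

```python
def cpus_per_task(cpus_per_node, task_count):
    """CPUs available to each task: total CPUs divided by task count (at least 1)."""
    return max(1, sum(cpus_per_node) // task_count)


def block_fill(cpus_per_node, task_count):
    """Hypothetical naive scheme: fill nodes in order, rounding capacity up.

    Assumption for illustration only; tasks that do not fit are left unplaced.
    """
    per_task = cpus_per_task(cpus_per_node, task_count)
    tasks = [0] * len(cpus_per_node)
    remaining = task_count
    for i, cpus in enumerate(cpus_per_node):
        capacity = -(-cpus // per_task)  # ceiling division
        tasks[i] = min(capacity, remaining)
        remaining -= tasks[i]
    return tasks


def cyclic_fill(cpus_per_node, task_count):
    """Hand out tasks one at a time, cycling over nodes that still have a free CPU."""
    if not cpus_per_node:
        raise ValueError("at least one node is required")
    tasks = [0] * len(cpus_per_node)
    remaining = task_count
    while remaining > 0:
        placed = False
        for i, cpus in enumerate(cpus_per_node):
            if remaining == 0:
                break
            if tasks[i] < cpus:
                tasks[i] += 1
                remaining -= 1
                placed = True
        if not placed:
            # Every CPU already has a task: oversubscribe round-robin.
            for i in range(len(tasks)):
                if remaining == 0:
                    break
                tasks[i] += 1
                remaining -= 1
    return tasks


nodes = [4, 2]  # 4 CPUs on the first node, 2 CPUs on the second
print(cpus_per_task(nodes, 2))  # 3
print(block_fill(nodes, 2))     # [2, 0]  both tasks land on the first node
print(cyclic_fill(nodes, 2))    # [1, 1]  one task per node
```

With rounding up, the first node appears to have room for two 3-CPU tasks even though it has only 4 CPUs, which reproduces the lopsided placement described in the message; handing tasks out one node at a time avoids it.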