1. 16 May, 2016 3 commits
  2. 13 May, 2016 3 commits
    • Update NEWS for start of v16.05.0rc3 · df97e108
      Morris Jette authored
      df97e108
    • Fix race condition with respect to cleaning up the profiling threads · b1fbeb85
      Danny Auble authored
      when in use.
      
      The problem here is that the polling threads in the various acct_gather
      plugins were detached and could still be polling after the plugin had
      been unloaded, causing a segfault with a backtrace like this:
      
      #0  0x00007fe7af008c00 in ?? ()
      #1  0x00007fe7b1138479 in __nptl_deallocate_tsd () at pthread_create.c:175
      #2  0x00007fe7b11398b0 in __nptl_deallocate_tsd () at pthread_create.c:326
      #3  start_thread (arg=0x7fe7b1f12700) at pthread_create.c:346
      #4  0x00007fe7b0e6fb5d in clone ()
          at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      
      The fix was to make the threads non-detached and join them before
      calling dlclose().
      b1fbeb85
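The join-before-unload pattern described in this commit can be sketched as follows. This is a minimal illustration, not the actual Slurm patch: `poll_loop`, `poller_start`, `poller_stop` and the `poll_*` variables are hypothetical names, and the 10 ms interval is an arbitrary assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical poller state; these names are illustrative, not Slurm's. */
static pthread_mutex_t poll_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  poll_cond = PTHREAD_COND_INITIALIZER;
static pthread_t       poll_tid;
static bool            poll_shutdown = false;
static bool            poll_running  = false;

static void *poll_loop(void *arg)
{
	(void) arg;
	pthread_mutex_lock(&poll_lock);
	while (!poll_shutdown) {
		struct timespec deadline;

		/* A real plugin would take one profiling sample here. */

		/* Sleep up to 10 ms, waking early if shutdown is signalled. */
		clock_gettime(CLOCK_REALTIME, &deadline);
		deadline.tv_nsec += 10L * 1000L * 1000L;
		if (deadline.tv_nsec >= 1000000000L) {
			deadline.tv_sec++;
			deadline.tv_nsec -= 1000000000L;
		}
		pthread_cond_timedwait(&poll_cond, &poll_lock, &deadline);
	}
	pthread_mutex_unlock(&poll_lock);
	return NULL;
}

/* Start the poller with default attributes, i.e. joinable, not detached. */
int poller_start(void)
{
	int rc;

	assert(!poll_running);	/* starting twice would leak a thread */
	pthread_mutex_lock(&poll_lock);
	poll_shutdown = false;
	pthread_mutex_unlock(&poll_lock);
	rc = pthread_create(&poll_tid, NULL, poll_loop, NULL);
	if (rc == 0)
		poll_running = true;
	return rc;
}

/* Signal shutdown and wait for the thread to exit. Only after this
 * returns is it safe for the caller to dlclose() the plugin's code. */
int poller_stop(void)
{
	int rc;

	if (!poll_running)
		return 0;
	pthread_mutex_lock(&poll_lock);
	poll_shutdown = true;
	pthread_cond_signal(&poll_cond);
	pthread_mutex_unlock(&poll_lock);
	rc = pthread_join(poll_tid, NULL);
	poll_running = false;
	return rc;
}
```

A plugin's cleanup routine would call `poller_stop()` before the loader calls `dlclose()`, so no thread can still be executing code from the unloaded library.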
    • Avoid nodes requiring reboot · 139102f0
      Morris Jette authored
      Whenever possible, avoid allocating nodes that require a reboot.
      Previous logic failed to re-sort the job set table based upon
      the need for rebooting to achieve the desired features (e.g. KNL
      MCDRAM or CACHE mode).
      bug 2726
      139102f0
  3. 12 May, 2016 3 commits
  4. 11 May, 2016 4 commits
  5. 10 May, 2016 5 commits
  6. 09 May, 2016 2 commits
  7. 06 May, 2016 3 commits
    • Automatically add "hbm" GRES for KNL · 97831467
      Morris Jette authored
      If the node_feature/knl_cray plugin is configured and a GresType of "hbm"
      is not defined, then add it to the GRES tables. Without this, references
      to a GRES of "hbm" (either by a user or by Slurm's internal logic) will
      generate error messages.
      bug 2708
      97831467
    • Fix for slurmstepd segfault · db0fe22e
      John Thiltges authored
      With slurm-15.08.10, we're seeing occasional segfaults in slurmstepd. The logs point to the following line: slurm-15.08.10/src/slurmd/slurmstepd/mgr.c:2612
      
      On that line, _get_primary_group() is accessing the results of getpwnam_r():
          *gid = pwd0->pw_gid;
      
      If getpwnam_r() cannot find a matching password record, it will set the result (pwd0) to NULL, but still return 0. When the pointer is accessed, it will cause a segfault.
      
      Checking the result variable (pwd0) to determine success should fix the issue.
      db0fe22e
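The check described in this commit can be sketched as follows. This is a minimal illustration, not the actual patch: `lookup_primary_gid` is a hypothetical helper and the 16 KiB buffer size is an assumption.

```c
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper, not the Slurm function itself.
 * Returns 0 and stores the user's primary group in *gid on success,
 * or -1 if the lookup failed or no matching record exists. */
int lookup_primary_gid(const char *user, gid_t *gid)
{
	struct passwd pwd;
	struct passwd *result = NULL;
	char buf[16384];	/* assumed large enough for one record */
	int rc;

	assert(user && gid);
	rc = getpwnam_r(user, &pwd, buf, sizeof(buf), &result);

	/* A return value of 0 with result == NULL means "no such user",
	 * so the result pointer must be checked as well as rc. */
	if (rc != 0 || result == NULL)
		return -1;

	*gid = result->pw_gid;
	return 0;
}
```

Testing only the return code would let the "user not found" case through and dereference a NULL pointer, which matches the segfault reported above.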
    • Correct partition MaxCPUsPerNode enforcement · 70aafa68
      Marco Ehlert authored
      I would like to report what appears to be a bug in the calculation of
      used_cores in Slurm version 15.08.7.
      
      If a node is divided into two partitions using MaxCPUsPerNode, as in this
      slurm.conf configuration:
      
          NodeName=n1 CPUs=20
          PartitionName=cpu Nodes=n1    MaxCPUsPerNode=16
          PartitionName=gpu Nodes=n1    MaxCPUsPerNode=4
      
      I run into a strange scheduling situation, which occurs after a fresh
      restart of the slurmctld daemon.
      
      I start jobs one by one:
      
      case 1
          systemctl restart slurmctld.service
          sbatch -n 16 -p cpu cpu.sh
          sbatch -n 1  -p gpu gpu.sh
          sbatch -n 1  -p gpu gpu.sh
          sbatch -n 1  -p gpu gpu.sh
          sbatch -n 1  -p gpu gpu.sh
      
          => Problem: the gpu jobs remain in PENDING state.
      
      The picture changes if I start the jobs this way:
      
      case 2
          systemctl restart slurmctld.service
          sbatch -n 1  -p gpu gpu.sh
          scancel <gpu job_id>
          sbatch -n 16 -p cpu cpu.sh
          sbatch -n 1  -p gpu gpu.sh
          sbatch -n 1  -p gpu gpu.sh
          sbatch -n 1  -p gpu gpu.sh
          sbatch -n 1  -p gpu gpu.sh
      
      and all jobs are running fine.
      
      Looking into the code, I found an incorrect calculation of 'used_cores'
      in the function _allocate_sc():
      
      plugins/select/cons_res/job_test.c
      
      _allocate_sc(...)
      ...
               for (c = core_begin; c < core_end; c++) {
                       i = (uint16_t) (c - core_begin) / cores_per_socket;
      
                       if (bit_test(core_map, c)) {
                               free_cores[i]++;
                               free_core_count++;
                       } else {
                               used_cores[i]++;
                       }
                       if (part_core_map && bit_test(part_core_map, c))
                               used_cpu_array[i]++;
      
      This part of the code seems to work only if a part_core_map exists for
      the partition or the node is completely free. But in case 1, no
      part_core_map has been created for gpu yet. When a gpu job starts, the
      core_map contains the 4 cores left over from the cpu job. All non-free
      cores of the cpu partition are then counted as used cores in the gpu
      partition, so the following condition matches later in the code:
      
          free_cpu_count + used_cpu_count >  job_ptr->part_ptr->max_cpus_per_node
      
      which is definitely wrong.
      
      As soon as a part_core_map exists, meaning a gpu job was started on a
      free node (case 2), there is no problem at all.
      
      To get case 1 to work, I changed the above code to the following, and
      everything works fine:
      
               for (c = core_begin; c < core_end; c++) {
                       i = (uint16_t) (c - core_begin) / cores_per_socket;
      
                       if (bit_test(core_map, c)) {
                               free_cores[i]++;
                               free_core_count++;
                       } else {
                               if (part_core_map && bit_test(part_core_map, c)) {
                                       used_cpu_array[i]++;
                                       used_cores[i]++;
                               }
                       }
      
      I am not sure this code change is really good, but it fixes my problem.
      70aafa68
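The effect of the two counting loops on case 1 can be reproduced with a small standalone model. This is a sketch under stated assumptions, not Slurm code: plain `bool` arrays stand in for Slurm's bitmaps, the node is treated as a single socket with one CPU per core, and the limit check is assumed to compare the free and used core tallies directly against `MaxCPUsPerNode`. The names `core_tally`, `tally_original`, `tally_patched`, `exceeds_limit` and `make_case1` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_CORES 20	/* matches NodeName=n1 CPUs=20 in the report */

struct core_tally {
	int free_cores;	/* cores still available on the node */
	int used_cores;	/* cores charged against this partition */
	int used_cpus;	/* cores set in the partition's own map */
};

/* Counting as in the original loop: every non-free core is charged. */
struct core_tally tally_original(const bool *core_map,
				 const bool *part_core_map, size_t n)
{
	struct core_tally t = { 0, 0, 0 };

	assert(core_map);
	for (size_t c = 0; c < n; c++) {
		if (core_map[c])
			t.free_cores++;
		else
			t.used_cores++;
		if (part_core_map && part_core_map[c])
			t.used_cpus++;
	}
	return t;
}

/* Counting as in the proposed change: a non-free core is charged only
 * if it is set in this partition's own core map. */
struct core_tally tally_patched(const bool *core_map,
				const bool *part_core_map, size_t n)
{
	struct core_tally t = { 0, 0, 0 };

	assert(core_map);
	for (size_t c = 0; c < n; c++) {
		if (core_map[c]) {
			t.free_cores++;
		} else if (part_core_map && part_core_map[c]) {
			t.used_cpus++;
			t.used_cores++;
		}
	}
	return t;
}

/* Assumed form of the limit check, with one CPU per core. */
bool exceeds_limit(struct core_tally t, int max_cpus_per_node)
{
	return t.free_cores + t.used_cores > max_cpus_per_node;
}

/* Case 1: a 16-core job from the cpu partition is running, so cores
 * 0..15 are busy and 16..19 are free; gpu has no part_core_map yet. */
void make_case1(bool core_map[N_CORES])
{
	for (size_t c = 0; c < N_CORES; c++)
		core_map[c] = (c >= 16);
}
```

In this model, calling `tally_original` on the case 1 map with a NULL partition map yields 4 free and 16 used cores, so the check against a limit of 4 trips; `tally_patched` yields 4 free and 0 used cores, so it does not.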
  8. 05 May, 2016 3 commits
  9. 04 May, 2016 3 commits
  10. 03 May, 2016 5 commits
  11. 29 Apr, 2016 6 commits