1. 24 Mar, 2017 1 commit
  2. 22 Mar, 2017 1 commit
  3. 21 Mar, 2017 8 commits
  4. 17 Mar, 2017 3 commits
  5. 16 Mar, 2017 5 commits
  6. 15 Mar, 2017 1 commit
  7. 14 Mar, 2017 1 commit
    • Fix slurmdbd_defs.c to not have half symbols go to libslurm.so and the other half go to libslurmdb.so. · 7eef69e3
      Danny Auble authored
      
      Turns out in 17.11, if you are using --with-shared-libslurm, sacct will link to
      both libslurmfull.so and libslurmdb.so.  When linking to the
      accounting_storage/slurmdbd plugin, it will call back to slurmdbd_defs_init,
      which would be in libslurmdb.so, but the call to acct_storage_p_get_connection
      would call slurm_open_slurmdbd_conn, which is in libslurmfull.so, and it would
      xassert on slurmdbd_defs_inited, which was set in libslurmdb.so but not in
      libslurmfull.so.
      
      So the moral of the story is don't export half a file in one lib and the other
      half in a different lib.
      
      Perhaps we should only have 1 lib altogether, but that isn't the way it is
      done today.  This fixes the issue by exporting the entire file to
      libslurmfull.so, so we should be good.  But just a note for the future.
      
      This was an unexpected regression caused by commit 5a5347c7.
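The failure mode described in this commit can be imitated in a single file. This is only a sketch, not Slurm's code: the `db_`, `full_` and `one_` names are hypothetical stand-ins for the separate copies of `slurmdbd_defs_inited` that ended up in each library, and plain `assert` stands in for `xassert`.

```c
#include <assert.h>
#include <stdbool.h>

/* Before the fix (simulated): each library carries its own copy of the
 * file-scope flag. All names here are hypothetical stand-ins. */
bool db_inited = false;   /* copy living in "libslurmdb.so" */
bool full_inited = false; /* copy living in "libslurmfull.so" */

/* Init symbol resolved from the first library: sets only that copy. */
void db_defs_init(void)
{
	db_inited = true;
}

/* Open symbol resolved from the second library: consults the other copy.
 * Returns whether the real code's assertion would have held. */
bool full_open_conn_check(void)
{
	return full_inited;
}

/* After the fix (simulated): the whole file is exported from one library,
 * so init and open share a single flag. */
bool one_inited = false;

void one_defs_init(void)
{
	one_inited = true;
}

void one_open_conn(void)
{
	assert(one_inited); /* holds once one_defs_init() has run */
}
```

Calling `db_defs_init()` and then `full_open_conn_check()` reports that the check would fail, because the two functions look at different copies of the flag. The `one_` pair shares a single flag, which is what exporting the entire file from one library achieves.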
  8. 13 Mar, 2017 2 commits
    • Consider QOS flags Partition[Min|Max]Nodes when doing backfill. · bec4c516
      Alejandro Sanchez authored
      Bug 3530
      
      NOTE: there do appear to be other issues here, but we didn't feel comfortable
      changing this many things in 17.02 for fear of breaking something.
      
      This only fixes a bit of the issue as it appears node_scheduler.c has a fuller
      test for these.  In 17.11 I plan to make this a function that will fill in
      the min, req, max nodes and use it in both places in the code to prevent this
      from happening again.
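The commit mentions a planned helper that fills in the min, req and max node counts in one place. The following is a guess at what such a helper could look like, not the code that was eventually written. The flag bit values, the struct and the function name are all hypothetical. The override semantics (a set flag lets the job's own request bypass the partition limit), the treatment of a zero `job_max` as "no maximum requested", and the rule for `req_nodes` are assumptions rather than facts taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values; the real QOS flag constants are not shown here. */
#define SKETCH_QOS_PART_MIN_NODES 0x1u
#define SKETCH_QOS_PART_MAX_NODES 0x2u

typedef struct {
	uint32_t min_nodes;
	uint32_t req_nodes;
	uint32_t max_nodes;
} node_limits_t;

/* Compute the node counts once so every caller applies the same rules. */
node_limits_t sketch_node_limits(uint32_t job_min, uint32_t job_max,
				 uint32_t part_min, uint32_t part_max,
				 uint32_t qos_flags)
{
	node_limits_t out;

	assert(part_min <= part_max); /* sanity check on the partition */

	/* Minimum: the partition floor applies unless the QOS overrides it. */
	if (qos_flags & SKETCH_QOS_PART_MIN_NODES)
		out.min_nodes = job_min;
	else
		out.min_nodes = (job_min > part_min) ? job_min : part_min;

	/* Maximum: assumed that 0 means the job requested no maximum. */
	if (job_max == 0)
		out.max_nodes = part_max;
	else if (qos_flags & SKETCH_QOS_PART_MAX_NODES)
		out.max_nodes = job_max;
	else
		out.max_nodes = (job_max < part_max) ? job_max : part_max;

	/* Requested: assumed to follow the max when one was given. */
	out.req_nodes = job_max ? out.max_nodes : out.min_nodes;
	return out;
}
```

Keeping the flag checks in one function means the backfill scheduler and node_scheduler.c could not drift apart in how they apply the limits, which is the concern the commit message raises.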
    • Fix regression in b894280a. · 07ce0773
      Alejandro Sanchez authored
      The code calls list_find_first to search resv_list for the name requested
      for the new reservation. If the name already exists, resv_ptr is
      set to point to the existing reservation. The code then jumps to the
      bad_parse label and xfrees that resv_ptr, thus corrupting the list data
      by freeing the existing reservation. This is fixed by only freeing memory on the
      new local resv_ptr instead of always freeing memory.  xfree is also not
      sufficient for freeing the memory; we needed to call _del_resv_rec() or we would
      leak the memory we had transferred from the resv_desc_ptr.  This also involved
      NULLing out the other variables freed after bad_parse, or you would get
      double frees.
      
      Bug 3558.
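The ownership rule behind this fix can be sketched in isolation. This is a minimal illustration, not the Slurm code: `resv_t`, `del_resv()` and `create_resv()` are hypothetical stand-ins (with `del_resv()` playing the role of `_del_resv_rec()`), and a plain array replaces `resv_list`. The point is that the error path releases only the record this function allocated, never a pointer borrowed from the existing list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct resv {
	char *name;
} resv_t;

/* Hypothetical stand-in for _del_resv_rec(): frees the record and
 * everything it owns. Safe to call with NULL. */
void del_resv(resv_t *r)
{
	if (!r)
		return;
	free(r->name);
	free(r);
}

/* Returns a new record, or NULL if the name is taken or allocation fails.
 * Records in existing[] are only borrowed and are never freed here. */
resv_t *create_resv(resv_t **existing, size_t n, const char *name)
{
	resv_t *found = NULL;    /* borrowed: belongs to the list */
	resv_t *new_resv = NULL; /* owned: ours to free on error */

	assert(name != NULL);

	for (size_t i = 0; i < n; i++) {
		if (strcmp(existing[i]->name, name) == 0) {
			found = existing[i];
			break;
		}
	}
	if (found)
		goto bad_parse; /* duplicate name: nothing of ours to free yet */

	new_resv = calloc(1, sizeof(*new_resv));
	if (!new_resv)
		goto bad_parse;
	new_resv->name = malloc(strlen(name) + 1);
	if (!new_resv->name)
		goto bad_parse;
	strcpy(new_resv->name, name);
	return new_resv;

bad_parse:
	del_resv(new_resv); /* frees only the local record, if any */
	return NULL;
}
```

Using separate variables for the borrowed lookup result and the owned allocation makes the cleanup at `bad_parse` unconditional and safe, since the owned pointer is NULL whenever nothing has been allocated.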
  9. 11 Mar, 2017 1 commit
  10. 10 Mar, 2017 5 commits
  11. 08 Mar, 2017 10 commits
  12. 07 Mar, 2017 2 commits
    • capmc_resume not changing node state · cecaf222
      Morris Jette authored
      capmc_resume (Cray resume node script) - Do not disable changing a node's
          active features if SyscfgPath is configured in the knl.conf file.
      bug 3533
    • Fix for job cancel when node reconfig fails · 2e21d9f7
      Morris Jette authored
      If a job is cancelled by the user while its allocated nodes are being
          reconfigured (i.e. the capmc_resume program is rebooting nodes for the job)
          and the node reconfiguration fails (i.e. the reboot fails), then don't
          requeue the job but leave it in a cancelled state. Note the JOB_RECONFIG_FAIL
          state flag is currently only used by capmc_resume, but could be used for
          other programs responsible for node reboots.
      bug 3392