1. 18 Feb, 2016 1 commit
    • When determining if a job can fit into a TRES time limit after resources · 244b34e0
      Tim Wickberg authored
      have been selected, set the time limit appropriately if the job didn't
      request one.
      
      If the partition has no DefaultTime setting, and no time_limit was given
      for the job, job_ptr->time_limit == NO_VAL.
      
      With AccountingStorageEnforce=safe this will prevent jobs from ever
      starting if the association has any limit set for CPUMins.
      (NO_VAL * cpus is a very large number, but if no time_limit is given
      anywhere that is what they get :))
      
      Bug 2388.
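
The arithmetic behind the stuck jobs can be sketched as follows. This is a minimal illustration, not the scheduler's code: `NO_VAL` uses the value defined in slurm.h (0xfffffffe), while `cpu_mins_needed()`, `fits_cpu_mins_limit()`, and the `fallback_min` parameter are hypothetical names chosen for this example, and the choice of fallback is an assumption about what "appropriately" could mean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_VAL 0xfffffffeU	/* same value as NO_VAL in slurm.h */

/* Hypothetical helper: CPU-minutes a job would consume, widened to 64 bits. */
static uint64_t cpu_mins_needed(uint32_t time_limit_min, uint32_t cpus)
{
	return (uint64_t)time_limit_min * cpus;
}

/*
 * Hypothetical check: does the job fit under an association CPUMins limit?
 * If the job carries no time limit (NO_VAL), substitute fallback_min first.
 * The fallback is an assumption made for this sketch.
 */
static bool fits_cpu_mins_limit(uint32_t time_limit_min, uint32_t cpus,
				uint32_t fallback_min, uint64_t limit)
{
	assert(fallback_min != NO_VAL);
	if (time_limit_min == NO_VAL)
		time_limit_min = fallback_min;
	return cpu_mins_needed(time_limit_min, cpus) <= limit;
}
```

Without the substitution, a 4-CPU job with `time_limit == NO_VAL` would need roughly 1.7e10 CPU-minutes, which exceeds any realistic CPUMins limit, so the job never becomes eligible to start.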
  2. 17 Feb, 2016 2 commits
  3. 16 Feb, 2016 2 commits
  4. 12 Feb, 2016 1 commit
  5. 10 Feb, 2016 3 commits
  6. 09 Feb, 2016 2 commits
  7. 08 Feb, 2016 1 commit
  8. 04 Feb, 2016 1 commit
  9. 03 Feb, 2016 2 commits
  10. 02 Feb, 2016 3 commits
    • Fix build for sh5util on ppc64 by replacing printf formatters · b717bf5c
      Tim Wickberg authored
      Use PRIu64 instead of %ld for uint64_t types (libsh5util_old),
      %zu instead of PRIu64 for size_t.
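
The two conversions named above can be sketched like this. The helper names `format_u64()` and `format_size()` are hypothetical and do not come from sh5util; only `PRIu64` (from `<inttypes.h>`) and `%zu` are standard C99.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a uint64_t portably. PRIu64 expands to the correct conversion
 * for the platform, whereas %ld assumes that long matches uint64_t.
 */
static int format_u64(char *buf, size_t len, uint64_t value)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu64, value);
}

/* Format a size_t. %zu is the C99 conversion for size_t. */
static int format_size(char *buf, size_t len, size_t value)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%zu", value);
}
```

Matching each conversion to the argument's actual type avoids the format warnings that break builds on platforms where these types are not the same as `long`.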
    • Update NEWS to mention FreeBSD fixes · e9982fa4
      Tim Wickberg authored
    • Fix support for AuthInfo in slurmdbd.conf · fa4222ec
      Didier GAZEN authored
      Support AuthInfo in slurmdbd.conf that is different from the value in
      slurm.conf.
      There is a possible bug in the slurm_get_auth_info function (src/common/slurm_protocol_api.c) that can cause the slurmdbd daemon to look for the AuthInfo parameter in slurm.conf instead of slurmdbd.conf when the auth/munge authentication method is used (AuthType=auth/munge).
      
      Here is the slurmdbd log revealing the problem (debug5() calls were added to the sources):
      
      slurmdbd: slurmdbd version 15.08.7 started
      slurmdbd: debug2: running rollup at Tue Feb 02 14:20:14 2016
      slurmdbd: debug5: in ../../../src/slurmdbd/slurmdbd.c, _send_slurmctld_register_req (line 690)
      slurmdbd: debug5: in ../../../src/common/slurm_protocol_api.c, slurm_send_node_msg (line 3601)
      slurmdbd: debug5: in ../../../../../src/plugins/auth/munge/auth_munge.c, slurm_auth_create (line 217)
      slurmdbd: debug5: in ../../../src/common/slurm_protocol_api.c, slurm_get_auth_ttl (line 1732)
      slurmdbd: debug5: Entering ../../../src/common/slurm_protocol_api.c, slurm_get_auth_info
      slurmdbd: debug:  Reading slurm.conf file: /usr/local/slurm-15-08-7-1/etc/slurm.conf
      slurmdbd: error: s_p_parse_file: unable to status file /usr/local/slurm-15-08-7-1/etc/slurm.conf: No such file or directory, retrying in 1sec up to 60sec
      ...
      
      Then 60 seconds later, the auth_info value returned by slurm_get_auth_info is NULL:
      
      slurmdbd: debug5: Leaving ../../../src/common/slurm_protocol_api.c, slurm_get_auth_info, auth_info=(null)
      
      and slurmdbd continues without crashing, but I am not sure it is in a safe state.
      
      After applying this patch:
      
      diff --git a/src/common/slurm_protocol_api.c b/src/common/slurm_protocol_api.c
      index c5db879..be1dab6 100644
      --- a/src/common/slurm_protocol_api.c
      +++ b/src/common/slurm_protocol_api.c
      @@ -1703,9 +1703,13 @@ extern char *slurm_get_auth_info(void)
              char *auth_info;
              slurm_ctl_conf_t *conf;
      
      -       conf = slurm_conf_lock();
      -       auth_info = xstrdup(conf->authinfo);
      -       slurm_conf_unlock();
      +       if (slurmdbd_conf) {
      +               auth_info = xstrdup(slurmdbd_conf->auth_info);
      +       } else {
      +               conf = slurm_conf_lock();
      +               auth_info = xstrdup(conf->authinfo);
      +               slurm_conf_unlock();
      +       }
      
              return auth_info;
       }
      
      the auth_info value is now valid and consistent with the slurmdbd.conf setting:
      
      slurmdbd: slurmdbd version 15.08.7 started
      slurmdbd: debug2: running rollup at Tue Feb 02 14:47:37 2016
      slurmdbd: debug5: in ../../../src/slurmdbd/slurmdbd.c, _send_slurmctld_register_req (line 690)
      slurmdbd: debug5: in ../../../src/common/slurm_protocol_api.c, slurm_send_node_msg (line 3600)
      slurmdbd: debug5: in ../../../../../src/plugins/auth/munge/auth_munge.c, slurm_auth_create (line 217)
      slurmdbd: debug5: in ../../../src/common/slurm_protocol_api.c, slurm_get_auth_ttl (line 1731)
      slurmdbd: debug5: Entering ../../../src/common/slurm_protocol_api.c, slurm_get_auth_info
      slurmdbd: debug5: Leaving ../../../src/common/slurm_protocol_api.c, slurm_get_auth_info, auth_info=socket=/var/run/munge/munge_dbd.socket.2
  11. 01 Feb, 2016 1 commit
  12. 29 Jan, 2016 1 commit
  13. 28 Jan, 2016 5 commits
    • Don't relocate multi-node core reservations · a801d264
      Morris Jette authored
      Do not automatically relocate an advanced reservation for individual cores
          that spans multiple nodes when nodes in that reservation go down (e.g.
          a 1 core reservation on node "tux1" will be moved if node "tux1" goes
          down, but a reservation containing 2 cores on node "tux1" and 3 cores on
          "tux2" will not be moved if node "tux1" goes down). Advanced reservations
          for whole nodes will be moved by default for down nodes.
      bug 2326
    • srun - check that found file is not a directory · 15c4bcf1
      Tim Wickberg authored
      Avoid attempting to execve() a directory with a name that
      happens to match that of the desired command. Bug 2392.
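
A check of this kind might look like the sketch below. The helper `is_executable_file()` is a hypothetical name, not the function srun uses; the POSIX calls `stat()`, `S_ISDIR()`, and `access()` are real. The directory test matters because directories normally carry the execute (search) bit, so `access(path, X_OK)` alone succeeds on them.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: true if path exists, is not a directory, and the
 * caller has execute permission on it.
 */
static bool is_executable_file(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (stat(path, &st) != 0)
		return false;	/* missing or inaccessible */
	if (S_ISDIR(st.st_mode))
		return false;	/* a directory that shares the command's name */
	return access(path, X_OK) == 0;
}
```

A search over PATH entries would skip any candidate for which this returns false and continue with the next directory in the list.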
    • Ignore a reservation's jobs when changing · b77666b5
      Morris Jette authored
      Allow an existing reservation with running jobs to be modified without
          Flags=IGNORE_JOBS.
      bug 2389
    • burst_buffer/cray - avoid overflow · 214b3abe
      Morris Jette authored
      burst_buffer/cray - Increase size of intermediate variable used to store
          buffer byte size read from DW instance from 32 to 64 bits to avoid
          overflow and reporting invalid buffer sizes.
      bug 2378
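
The effect of a 32-bit intermediate on a byte count can be illustrated in isolation. This is a generic sketch of the truncation, not the plugin's parsing code; `parse_bytes_64()` and `parse_bytes_32()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal byte count into a 64-bit value (the widened form). */
static uint64_t parse_bytes_64(const char *text)
{
	assert(text != NULL);
	return (uint64_t)strtoull(text, NULL, 10);
}

/*
 * Same parse, but stored in a 32-bit intermediate: any value of 4 GiB
 * or more is reduced modulo 2^32 and no longer reflects the real size.
 */
static uint32_t parse_bytes_32(const char *text)
{
	return (uint32_t)parse_bytes_64(text);
}
```

For an 8 GiB buffer ("8589934592"), the 64-bit version keeps the full value while the 32-bit version yields 0, which is the kind of invalid size the commit describes.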
    • GRES - Fix minor typecast issues. · 6f94bb7f
      Danny Auble authored
  14. 27 Jan, 2016 5 commits
  15. 25 Jan, 2016 2 commits
  16. 22 Jan, 2016 1 commit
  17. 21 Jan, 2016 7 commits