1. 13 Apr, 2015 1 commit
    • Eliminate "Node ping apparently hung" errors · 101570a9
      Morris Jette authored
      The error was triggered when the collection of accounting
      information had started but not yet completed at the time the
      test for starting a node ping RPC ran. In other words, the logic
      wasn't testing the completion of one RPC against the start of
      that same RPC, but against the start of a different RPC. This
      change moves all of the timeout logic into the ping_nodes.c
      module, where we can make sure that the timing of different RPCs
      does not get confused.
      bug 1190
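
The idea in this commit message, timing each RPC only against its own start, can be sketched as follows. This is a minimal illustration and not the code in ping_nodes.c; the `rpc_kind` enum and the `rpc_begin`, `rpc_end` and `rpc_is_hung` helpers are hypothetical names introduced here.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical RPC kinds for illustration; not Slurm's identifiers. */
enum rpc_kind { RPC_PING, RPC_ACCT_GATHER, RPC_KIND_CNT };

/* One start timestamp per RPC kind; 0 means "not currently running". */
static time_t rpc_start[RPC_KIND_CNT];

/* Record that an RPC of the given kind started at time `now`. */
void rpc_begin(enum rpc_kind kind, time_t now)
{
	rpc_start[kind] = now;
}

/* Record that the RPC of the given kind completed. */
void rpc_end(enum rpc_kind kind)
{
	rpc_start[kind] = 0;
}

/*
 * An RPC counts as hung only if that same kind of RPC was started and
 * has been outstanding for longer than `timeout` seconds. The start
 * time of any other RPC kind is never consulted.
 */
bool rpc_is_hung(enum rpc_kind kind, time_t now, time_t timeout)
{
	return rpc_start[kind] != 0 && (now - rpc_start[kind]) > timeout;
}
```

With separate timestamps, a long-running accounting RPC cannot make a ping look hung, because the ping check reads only the ping's own start time.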
  2. 11 Apr, 2015 3 commits
  3. 09 Apr, 2015 2 commits
    • CRAY - Throttle the post NHC operations so as not to hog the job write lock · c5801a12
      Danny Auble authored
      if many steps/jobs finish at once.
    • Add support for specialized threads · 709f6504
      Morris Jette authored
      * Add "--thread-spec" option to salloc, sbatch and srun commands. This is
        the count of threads reserved for system use per node.
      * Add ability for scontrol to get/set job ThreadSpec
      * sview: Add job ThreadSpec field to get/set
      * Modify select/cons_res to manage job allocations while leaving specialized
        threads for system use.
      * core_spec plugins: Fix system task binding logic and logging
      * Modify squeue output for thread_spec values
      * task/affinity and cgroup: Enhanced task binding logic
  4. 08 Apr, 2015 1 commit
  5. 07 Apr, 2015 8 commits
  6. 06 Apr, 2015 3 commits
  7. 03 Apr, 2015 1 commit
  8. 02 Apr, 2015 3 commits
  9. 01 Apr, 2015 6 commits
  10. 31 Mar, 2015 4 commits
    • SPANK env var name change · 224cd329
      Morris Jette authored
      SPANK naming changes: For environment variables set using the
          spank_job_control_setenv() function, the values were available in the
          slurm_spank_job_prolog() and slurm_spank_job_epilog() functions using
          getenv where the name was given a prefix of "SPANK_". That prefix has
          been removed for consistency with the environment variables available in
          the Prolog and Epilog scripts.
      bug 1570
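
Code that reads these variables with getenv and has to work both before and after this change could check the unprefixed name first and fall back to the old "SPANK_"-prefixed name. The sketch below assumes that approach; `spank_env_compat` is a hypothetical helper, not part of the SPANK API, and it uses only the standard getenv call.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical compatibility helper (not a Slurm function).
 * Looks up NAME first, which is how the variable appears after this
 * change, then falls back to SPANK_NAME, the older naming.
 * Returns NULL if neither variable is set or the name is too long.
 */
const char *spank_env_compat(const char *name)
{
	char prefixed[256];
	const char *val = getenv(name);
	int n;

	if (val)
		return val;

	n = snprintf(prefixed, sizeof(prefixed), "SPANK_%s", name);
	if (n < 0 || (size_t) n >= sizeof(prefixed))
		return NULL;	/* name does not fit in the buffer */

	return getenv(prefixed);
}
```

Inside a job prolog or epilog callback, a call such as `spank_env_compat("MY_VAR")` (a made-up variable name) would then find the value under either naming scheme.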
    • Error on user changing job priority · 76706b51
      Morris Jette authored
      Requests by a normal user to reset a job's priority (even to lower it) will
      result in an error saying to change the job's nice value instead.
    • Limit user change of job priority · 4454316e
      Morris Jette authored
      A non-administrator change to job priority will not be persistent except
      for holding the job. Users wanting to change a job's priority on a persistent
      basis should reset its "nice" value.
      Without this change a user might lower a job's priority, but Slurm would
      not lower it further based upon fair-share changes.
    • Increase MAX_PACK_MEM_LEN to 1GB · 064343c2
      Morris Jette authored
      Increase the MAX_PACK_MEM_LEN define to 1GB to avoid PMI2 failure when
      fencing with a large number of ranks.
  11. 30 Mar, 2015 1 commit
  12. 27 Mar, 2015 3 commits
    • Verify plugin versions at load time · 45f671a8
      Morris Jette authored
      Verify that every plugin's version number is identical to that of the
      component attempting to load it. Without this verification, the plugin can
      reference Slurm functions in the caller which differ (e.g. the underlying
      function's arguments could have changed between Slurm versions).
      NOTE: All plugins (except SPANK) must be built against the identical
      version of Slurm in order to be used by any Slurm command or daemon. This
      should eliminate some very difficult to diagnose problems due to use of old
      plugins.
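
A load-time check of this kind can be sketched as below. This is an illustration under stated assumptions and not Slurm's loader: the `plugin_info` struct, its fields and `plugin_version_ok` are hypothetical, and the version is treated as an opaque integer compared for exact equality.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor for a plugin about to be loaded. */
struct plugin_info {
	const char *name;	/* plugin name, used only for reporting */
	uint32_t version;	/* version the plugin was built against */
	bool is_spank;		/* SPANK plugins are exempt per the note above */
};

/*
 * Return true if the plugin may be loaded by a component built as
 * `host_version`. Any mismatch is rejected, since a function signature
 * could have changed between the two versions.
 */
bool plugin_version_ok(const struct plugin_info *p, uint32_t host_version)
{
	if (p->is_spank)
		return true;
	return p->version == host_version;
}
```

Rejecting the plugin before any of its functions are called turns a potential crash or silent misbehavior into an immediate, explicit load failure.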
    • Change list_for_each priority functions to always return SUCCESS. · b2be6159
      Brian Christiansen authored
      Bug 1469
      Return values from void functions are unknown and were causing
      list_for_each to short-circuit processing of the job list.
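
The failure mode can be illustrated with a minimal stand-in iterator that stops as soon as a callback returns a negative value. The `for_each` function and `bump_priority` callback below are hypothetical and written for this example only; they are not Slurm's list API. The point is that the callback has an int return type and returns a defined value on every path.

```c
#include <stddef.h>

/* Callback type: returns 0 to continue, a negative value to stop. */
typedef int (*for_each_fn)(void *item, void *arg);

/*
 * Minimal stand-in for a list iterator (assumption: iteration stops
 * early when the callback returns < 0). Returns the number of items
 * on which the callback was invoked.
 */
int for_each(void **items, size_t n, for_each_fn f, void *arg)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (f(items[i], arg) < 0)
			return (int) (i + 1);	/* stopped early */
	}
	return (int) n;
}

/*
 * Example callback: add a delta to an integer priority. It always
 * returns 0, so the iterator never sees an indeterminate value and
 * never stops before the end of the list.
 */
int bump_priority(void *item, void *arg)
{
	int *prio = item;

	*prio += *(int *) arg;
	return 0;
}
```

Had `bump_priority` been declared void and passed through a cast, the value the iterator reads as its result would be whatever happened to be in the return register, which is how processing could stop partway through the list.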
    • Increase the size of MinJobAge. · 711d815e
      David Bigagli authored
  13. 26 Mar, 2015 2 commits
  14. 25 Mar, 2015 1 commit
    • burst_buffer/cray - Updated APIs · 57a3aefb
      Morris Jette authored
      Modify the command names per latest Cray documentation.
      Add Cray API calls, which will replace the Cray command line
        interfaces when available. More work will be required for
        this conversion to the APIs.
      Update documentation.
  15. 23 Mar, 2015 1 commit