1. 11 Oct, 2011 2 commits
    • slurmstepd: Add abstraction for fork-and-wait · e124e872
      Mark A. Grondona authored
      Abstract the code in slurmstepd fork_all_tasks that allows the parent
      to signal children before they call exec into an "exec_wait_info"
      interface. This will allow the code to be easily reused in other
      parts of slurmstepd (e.g. task epilog) without cut-and-paste of code.
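The fork-and-wait idea in the commit above can be sketched as follows. This is a minimal illustration, not the code from slurmstepd: the `exec_wait` struct, the function names, and the use of a pipe as the signalling channel are all assumptions made for the example. The child blocks on a read until the parent writes one byte, which gives the parent a window to do setup before the child calls exec.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for an "exec_wait_info"-style record. */
struct exec_wait {
    pid_t pid;   /* child pid as seen by the parent (0 in the child) */
    int   fd[2]; /* pipe: child reads fd[0], parent writes fd[1] */
};

/*
 * Fork a child that blocks until the parent releases it.
 * Returns the child pid in the parent, 0 in the child (after release),
 * or -1 on error.
 */
pid_t exec_wait_fork(struct exec_wait *w)
{
    assert(w != NULL);
    if (pipe(w->fd) < 0)
        return -1;
    w->pid = fork();
    if (w->pid < 0) {
        close(w->fd[0]);
        close(w->fd[1]);
        return -1;
    }
    if (w->pid == 0) {
        char c;
        ssize_t n;

        close(w->fd[1]);
        /* Block until the parent writes one byte, retrying on EINTR. */
        do {
            n = read(w->fd[0], &c, 1);
        } while (n < 0 && errno == EINTR);
        close(w->fd[0]);
        if (n != 1)
            _exit(127); /* parent went away without releasing us */
        return 0;
    }
    close(w->fd[0]);
    return w->pid;
}

/* Parent side: release the waiting child. Returns 0 on success. */
int exec_wait_signal(struct exec_wait *w)
{
    char c = 0;
    int rc = (write(w->fd[1], &c, 1) == 1) ? 0 : -1;

    close(w->fd[1]);
    return rc;
}

/* Demo: fork, release, and collect the child's exit code. */
int exec_wait_demo(void)
{
    struct exec_wait w;
    int status;
    pid_t pid = exec_wait_fork(&w);

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(42); /* a real caller would call exec*() here */
    /* Parent-side setup that must precede exec would go here. */
    if (exec_wait_signal(&w) < 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Keeping the pipe and pid together in one record is what would let other callers (the commit mentions the task epilog) reuse the same wait-then-exec sequence without copying it.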
    • Fix job hold type problem · 272e3390
      jette authored
      When an operator or account coordinator holds his own job, make the hold
      a User Hold by default rather than an Administrator Hold.
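The default described in the job hold commit above might be modelled like this. It is only a sketch under stated assumptions: the enum, the function name, and the `privileged` flag (standing in for "operator or account coordinator") are hypothetical, and the handling of an unprivileged user holding someone else's job is an assumption the commit message does not cover.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical hold outcomes; not Slurm's own constants. */
enum hold_type { HOLD_DENIED, HOLD_USER, HOLD_ADMIN };

/*
 * Pick the default hold type for a hold request.
 * privileged: requester is an operator or account coordinator (assumed flag).
 */
enum hold_type default_hold_type(uid_t requester, uid_t job_owner,
                                 bool privileged)
{
    /* Holding your own job defaults to a User Hold, even if privileged. */
    if (requester == job_owner)
        return HOLD_USER;
    /* Assumption: only privileged users may hold other users' jobs. */
    return privileged ? HOLD_ADMIN : HOLD_DENIED;
}
```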
  2. 07 Oct, 2011 1 commit
  3. 05 Oct, 2011 2 commits
  4. 04 Oct, 2011 3 commits
  5. 03 Oct, 2011 1 commit
  6. 30 Sep, 2011 4 commits
  7. 29 Sep, 2011 6 commits
  8. 28 Sep, 2011 4 commits
  9. 27 Sep, 2011 1 commit
    • Allow job owner to use scontrol notify · 141d87a4
      Mark A. Grondona authored
      The slurmctld code that processes job notify messages unnecessarily
      restricts these messages to be from the slurm user or root. This
      patch allows users to send notifications to their own jobs.
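The relaxed permission rule from the notify commit above can be expressed as a single predicate. This is a sketch, not the slurmctld code: the function name and its parameters are hypothetical, and the caller is assumed to supply the uid of the configured slurm user.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/*
 * Hypothetical check: may `sender` send a notify message to a job?
 * Before the patch only root and the slurm user qualified; the patch
 * additionally admits the job's owner.
 */
bool may_notify_job(uid_t sender, uid_t job_owner, uid_t slurm_user)
{
    return sender == 0 || sender == slurm_user || sender == job_owner;
}
```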
  10. 26 Sep, 2011 4 commits
  11. 19 Sep, 2011 1 commit
  12. 17 Sep, 2011 1 commit
  13. 16 Sep, 2011 2 commits
    • Problem using salloc/mpirun with task affinity socket binding · 98b203d4
      Morris Jette authored
      salloc/mpirun does not work well with task affinity socket binding. The following example illustrates the problem.
      
      [sulu] (slurm) mnp> salloc -p bones-only -N1-1 -n3 --cpu_bind=socket mpirun cat /proc/self/status | grep Cpus_allowed_list
      salloc: Granted job allocation 387
      --------------------------------------------------------------------------
      An invalid physical processor id was returned ...
      
      The problem is that with mpirun jobs, Slurm launches only a single task regardless of the value of -n. This confuses the socket binding logic in task affinity, which binds the task to a single CPU instead of all the allocated CPUs on the socket. When MPI then attempts to bind to any of the other allocated CPUs on the socket, it gets the "invalid physical processor id" error.
      
      Note that the problem may occur even if the user does not explicitly request socket binding. If task/affinity is configured and the allocated CPUs are a whole number of sockets, Slurm will use "implicit auto binding" to sockets, triggering the problem.
      Patch from Martin Perry (Bull).
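The difference between the intended binding and the problem behaviour in the commit above can be illustrated with CPU bitmasks. This is a toy model, not the task/affinity plugin: it assumes at most 64 CPUs, contiguous CPU numbering within each socket, and that the faulty single-CPU binding picks the lowest allocated CPU. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Mask of every CPU on one socket, assuming contiguous numbering. */
uint64_t socket_cpus(unsigned socket, unsigned cpus_per_socket)
{
    assert(cpus_per_socket > 0 && cpus_per_socket < 64);
    assert((socket + 1) * cpus_per_socket <= 64);
    uint64_t one_socket = (UINT64_C(1) << cpus_per_socket) - 1;
    return one_socket << (socket * cpus_per_socket);
}

/* Intended result: every allocated CPU on the socket. */
uint64_t bind_to_socket(uint64_t allocated, unsigned socket,
                        unsigned cpus_per_socket)
{
    return allocated & socket_cpus(socket, cpus_per_socket);
}

/* Problem behaviour: the lone task ends up on a single CPU. */
uint64_t bind_to_single_cpu(uint64_t allocated, unsigned socket,
                            unsigned cpus_per_socket)
{
    uint64_t m = bind_to_socket(allocated, socket, cpus_per_socket);
    return m & (~m + 1); /* isolate the lowest set bit */
}
```

With two sockets of four CPUs and CPUs 4 to 7 allocated (mask 0xF0), socket binding yields 0xF0 while the single-CPU binding yields 0x10. A later attempt to bind to CPU 5, 6, or 7 then falls outside the task's mask, which corresponds to the "invalid physical processor id" failure described above.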
    • Describe mechanism to reserve CPUs rather than whole nodes · 7e181113
      Morris Jette authored
      Update the reservation web page to describe the mechanism for reserving CPUs rather than whole nodes, and provide an example.
  14. 15 Sep, 2011 3 commits
  15. 14 Sep, 2011 3 commits
  16. 13 Sep, 2011 1 commit
  17. 12 Sep, 2011 1 commit