  1. 26 Sep, 2011 2 commits
  2. 19 Sep, 2011 1 commit
  3. 17 Sep, 2011 1 commit
  4. 16 Sep, 2011 2 commits
    • Problem using salloc/mpirun with task affinity socket binding · 98b203d4
      Morris Jette authored
      salloc/mpirun does not work well with task affinity socket binding. The following example illustrates the problem.
      
      [sulu] (slurm) mnp> salloc -p bones-only -N1-1 -n3 --cpu_bind=socket mpirun cat /proc/self/status | grep Cpus_allowed_list
      salloc: Granted job allocation 387
      --------------------------------------------------------------------------
      An invalid physical processor id was returned ...
      
      The problem is that with mpirun jobs Slurm launches only a single task, regardless of the value of -n. This confuses the socket binding logic in task affinity. The result is that task affinity binds the task to only a single CPU instead of all the allocated CPUs on the socket. When MPI attempts to bind to any of the other allocated CPUs on the socket, it gets the "invalid physical processor id" error.

      Note that the problem may occur even if socket binding is not explicitly requested by the user. If task/affinity is configured and the allocated CPUs are a whole number of sockets, Slurm will use "implicit auto binding" to sockets, triggering the problem.
      Patch from Martin Perry (Bull).
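The binding problem described in this commit can be pictured with a small toy model. This is only a sketch under stated assumptions, not Slurm's task/affinity code: the function names `bind_per_task` and `bind_to_socket` and the CPU ids are hypothetical and chosen for illustration. It contrasts a binding that hands one CPU to the single launched task with a binding that gives that task every allocated CPU on the socket.

```python
# Toy model only: not Slurm's task/affinity implementation.
# All names and CPU ids below are hypothetical.

def cpus_allowed_list(cpus):
    """Format CPU ids as ranges, in the style of Cpus_allowed_list (e.g. '0-2,6')."""
    ranges = []
    start = prev = None
    for c in sorted(set(cpus)):
        if start is None:
            start = prev = c
        elif c == prev + 1:
            prev = c
        else:
            ranges.append((start, prev))
            start = prev = c
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)

def bind_per_task(allocated, launched_tasks):
    """Assumed faulty behaviour: one allocated CPU per launched task."""
    return [[cpu] for cpu in sorted(allocated)[:launched_tasks]]

def bind_to_socket(allocated, launched_tasks):
    """Assumed intended behaviour: each task may use all allocated CPUs on the socket."""
    return [sorted(allocated) for _ in range(launched_tasks)]

allocated = {0, 1, 2}  # -n3 on one socket (assumed CPU ids)
launched = 1           # with salloc/mpirun only a single task is launched

narrow = bind_per_task(allocated, launched)[0]
wide = bind_to_socket(allocated, launched)[0]

print("per-task binding:", cpus_allowed_list(narrow))
print("socket binding:  ", cpus_allowed_list(wide))
```

In this model, MPI ranks that later try to bind to CPUs 1 or 2 would fall outside the narrow mask, which corresponds to the "invalid physical processor id" error reported above.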
    • Describe mechanism to reserve CPUs rather than whole nodes · 7e181113
      Morris Jette authored
      Update reservation web page to describe mechanism to reserve CPUs rather than whole nodes and provide an example.
  5. 15 Sep, 2011 3 commits
  6. 14 Sep, 2011 3 commits
  7. 13 Sep, 2011 1 commit
  8. 12 Sep, 2011 16 commits
  9. 10 Sep, 2011 4 commits
  10. 09 Sep, 2011 3 commits
  11. 08 Sep, 2011 4 commits
    • Prevent resetting schedloglevel if no logfile defined · 61989624
      Morris Jette authored
      If there is no SchedLogfile defined and 'scontrol schedloglevel 1'
      is issued by an administrator, slurmctld will segfault at the
      next "sched: " log message due to a NULL log file pointer. There
      are obviously multiple ways to fix this issue, but in this patch
      the RPC simply returns an "Operation Disabled" error immediately
      if the sched log file is NULL.
      
      Other options include opening a new logfile with a default name,
      sending sched log messages to stderr, or enhancing the scontrol
      interface to allow specifying a logfile name for the schedlog.
      
      There are other cases in the schedlog code that could cause problems
      for the slurmctld, but since the sched log stuff is tied in strangely
      with the rest of the logging code, I didn't want to try modifying
      anything in log.c, for fear of breaking the normal logging functions.
      Patch from Mark Grondona, LLNL.
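The guard described in this commit can be sketched as follows. This is a minimal illustration in Python rather than the actual slurmctld C code; the class `SchedLog`, its methods, and the constants `SUCCESS` and `ERR_DISABLED` are hypothetical stand-ins. The point it shows is that the level change is rejected up front when no log file exists, so later "sched: " messages never reach a missing file.

```python
import io

SUCCESS = 0
ERR_DISABLED = 1  # hypothetical stand-in for the "Operation Disabled" error


class SchedLog:
    """Toy scheduler log; not the slurmctld implementation."""

    def __init__(self, logfile=None):
        self.logfile = logfile  # None models "no SchedLogfile defined"
        self.level = 0

    def set_level(self, level):
        # Refuse immediately if there is nowhere to write, leaving the level unchanged.
        if self.logfile is None:
            return ERR_DISABLED
        self.level = level
        return SUCCESS

    def write(self, msg):
        # Only reached with a non-zero level, which implies a log file exists.
        if self.level > 0:
            self.logfile.write("sched: " + msg + "\n")


no_file = SchedLog()
rc_no_file = no_file.set_level(1)
no_file.write("backfill pass")  # safely does nothing

buf = io.StringIO()
with_file = SchedLog(buf)
rc_with_file = with_file.set_level(1)
with_file.write("backfill pass")
```

The alternatives listed in the commit message (a default log file name, stderr, or a new scontrol option) would replace the early return with a fallback destination instead.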
    • Add "State" field to reservation information · 05b59105
      Morris Jette authored
      Add State=ACTIVE or State=INACTIVE to "scontrol show reservation" output.
      Patch from Phil Eckert, LLNL.
    • Correct formatting problem in faq web page · 728fee4b
      Morris Jette authored
    • Danny Auble's avatar