1. 21 Mar, 2012 17 commits
    • spank: consolidate globals in plugstack.c · 2eb0b999
      Mark A. Grondona authored
      The use of globals in plugstack.c is cumbersome and prevents the
      future expansion of spank plugins, e.g. calling spank plugins from
      multiple contexts within the same process or reinitializing the
      spank plugin state.
      
      This patch consolidates the current globals (spank_stack, spank_ctx,
      spank_optval, and option_cache) into a global "spank stack" structure
      and expands many of the functions internal to plugstack.c to operate
      on a struct spank_stack instead of globally.
    • spank: fix handling of remote spank_init_post_opt · 3a522459
      Mark A. Grondona authored
      There was likely a typo/thinko/patcho in the handling of the
      return code from _do_call_stack(SPANK_INIT_POST_OPT) in _spank_init
      in "remote" context. This error caused spank_init() to always
      succeed, since the less-than-zero test itself always evaluates to 0 or 1.
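The commit message above does not show the offending line, so the following is only a hypothetical C sketch of the class of bug it describes: the value of a `< 0` comparison (always 0 or 1) ends up where the real return code should be. `do_call_stack_stub`, `init_buggy`, `init_fixed` and `demo_return_code` are illustrative names, not Slurm functions.

```c
#include <assert.h>

/* Hypothetical stand-in for a plugin call stack that reports failure as -1. */
int do_call_stack_stub(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Buggy shape: rc receives the result of the comparison (0 or 1),
 * so the later "rc < 0" check can never fire. */
int init_buggy(int should_fail)
{
    int rc = do_call_stack_stub(should_fail) < 0;
    if (rc < 0)
        return -1;
    return 0;
}

/* Fixed shape: keep the real return code, then test it. */
int init_fixed(int should_fail)
{
    int rc = do_call_stack_stub(should_fail);
    if (rc < 0)
        return -1;
    return 0;
}

/* Usage: the buggy variant hides the failure, the fixed one reports it. */
void demo_return_code(void)
{
    assert(init_buggy(1) == 0);
    assert(init_fixed(1) == -1);
}
```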
    • spank: refuse to load the same plugin more than once · 7a60bf95
      Mark A. Grondona authored
      Avoid loading the same plugin more than once in plugstack.c.
      Most likely this will be a configuration error, so we should
      catch it early. If the same .so appears in the plugin stack
      more than once, it is likely to cause very strange errors,
      since dlopen() will only map the library a single time.
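A duplicate check of the kind described above could look roughly like the following C sketch. It assumes plugins are identified by their path string; the `plugin_stack` structure, `stack_contains` and `stack_add` are hypothetical names for illustration and are not taken from plugstack.c.

```c
#include <assert.h>
#include <string.h>

#define MAX_PLUGINS 16

/* Hypothetical, simplified plugin stack: just the paths loaded so far. */
struct plugin_stack {
    const char *paths[MAX_PLUGINS];
    int count;
};

/* Return 1 if path is already present in the stack, else 0. */
int stack_contains(const struct plugin_stack *st, const char *path)
{
    assert(st != NULL && path != NULL);
    for (int i = 0; i < st->count; i++) {
        if (strcmp(st->paths[i], path) == 0)
            return 1;
    }
    return 0;
}

/* Add path to the stack; return -1 for a duplicate or a full stack. */
int stack_add(struct plugin_stack *st, const char *path)
{
    if (stack_contains(st, path) || st->count >= MAX_PLUGINS)
        return -1;
    st->paths[st->count++] = path;
    return 0;
}
```

Comparing raw strings would miss two different paths that resolve to the same .so; a fuller version might compare resolved paths instead.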
    • change owner of slurmctld and slurmdbd log files · 3470c651
      Morris Jette authored
      Change the owner of slurmctld and slurmdbd log files to the appropriate
      user. Without this change the files will be created by and owned by the
      user starting the daemons (likely user root).
    • Merge branch 'slurm-2.3' · e78802d3
      Morris Jette authored
    • CRAY: Fix support for SlurmdTimeout=0 · 4dd9e697
      Morris Jette authored
      CRAY: Fix support for configuration with SlurmdTimeout=0 (never mark
      node that is DOWN in ALPS as DOWN in SLURM).
    • Add delay to test for job info propagation · 7636f0f2
      Morris Jette authored
    • Modify the step completion RPC between slurmd and slurmstepd · ed31e6c7
      Morris Jette authored
      In the tightly coupled functions slurmd:stepd_completion and
      slurmstepd:_handle_completion, a jobacct structure is
      sent from the main daemon to the step daemon to provide
      the statistics of the child slurmstepd processes and do the aggregation.
      
      The structure is sent using
      jobacct_gather_g_{setinfo,getinfo} over a pipe (JOBACCT_DATA_PIPE).
      As {setinfo,getinfo} use a common internal lock and reading
      or writing to a pipe is equivalent to holding a lock, slurmd and
      slurmstepd have to avoid using both setinfo and getinfo over a
      pipe, or deadlocks can occur. For example:
      slurmd(lockforread,write)/slurmstepd(write,lockforread).
      
      This patch removes the call to jobacct_gather_g_setinfo in slurmd
      and the call to jobacct_gather_g_getinfo in slurmstepd, ensuring
      that slurmd only does getinfo operations over a pipe and slurmstepd
      only does setinfo over a pipe. Instead, jobacct_gather_g_{pack,unpack}
      are used to marshal/unmarshal the data for transmission over the
      pipe.
      Patch by Matthieu Hautreux, CEA.
      
      The patch committed here is a variation on the work by Matthieu.
      Specifically, the logic is added to slurmstepd to read a new format
      of RPC including an RPC version number and buffer with the data
      structure. The slurmd however will not send the RPC in the new format
      until SLURM version 2.5.
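The idea of sending a version number followed by a marshalled buffer over a pipe can be sketched in C as below. This is a hand-rolled illustration, not the jobacct_gather_g_{pack,unpack} code: the `acct_stats` structure, its fields, `SKETCH_RPC_VERSION`, and the header layout (version, then payload length) are all assumptions made for the example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

#define SKETCH_RPC_VERSION 1u   /* hypothetical version number */

/* Hypothetical stand-in for the accounting structure. */
struct acct_stats {
    uint32_t max_rss_kb;
    uint32_t cpu_sec;
};

/* Write exactly n bytes; return 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    while (n > 0) {
        ssize_t w = write(fd, p, n);
        if (w <= 0)
            return -1;
        p += w;
        n -= (size_t)w;
    }
    return 0;
}

/* Read exactly n bytes; return 0 on success, -1 on error or early EOF. */
static int read_all(int fd, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = read(fd, p, n);
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Sender: marshal the struct, then send version, length and payload. */
int send_stats(int fd, const struct acct_stats *s)
{
    uint32_t payload[2];
    uint32_t hdr[2];

    assert(s != NULL);
    payload[0] = htonl(s->max_rss_kb);
    payload[1] = htonl(s->cpu_sec);
    hdr[0] = htonl(SKETCH_RPC_VERSION);
    hdr[1] = htonl((uint32_t)sizeof(payload));

    if (write_all(fd, hdr, sizeof(hdr)) < 0)
        return -1;
    return write_all(fd, payload, sizeof(payload));
}

/* Receiver: check version and length before unmarshalling the payload. */
int recv_stats(int fd, struct acct_stats *out)
{
    uint32_t hdr[2];
    uint32_t payload[2];

    assert(out != NULL);
    if (read_all(fd, hdr, sizeof(hdr)) < 0)
        return -1;
    if (ntohl(hdr[0]) != SKETCH_RPC_VERSION)
        return -1;
    if (ntohl(hdr[1]) != sizeof(payload))
        return -1;
    if (read_all(fd, payload, sizeof(payload)) < 0)
        return -1;
    out->max_rss_kb = ntohl(payload[0]);
    out->cpu_sec = ntohl(payload[1]);
    return 0;
}
```

Because each side only writes or only reads a self-describing buffer, neither side needs to hold a shared accounting lock while blocked on the pipe.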
    • Add possible reason for failure to test · 3bdcf40f
      Morris Jette authored
    • Merge branch 'slurm-2.3' · 644fc9a7
      Morris Jette authored
    • Minor test mods for old RedHat distro · 455283c2
      Morris Jette authored
    • Merge branch 'slurm-2.3' · f23f6ccc
      Morris Jette authored
    • make test work better on different systems · 47aebf2c
      Morris Jette authored
    • result of autogen.sh · 304cccb6
      Morris Jette authored
    • Merge branch 'slurm-2.3' · 8c905e93
      Morris Jette authored
    • Modify Makefiles to support Hardening flags · a7e89e72
      Morris Jette authored
    • Cosmetic mods · fb4cabaa
      Morris Jette authored
      Replace some " \t" with just "\t" (that's a tab)
  2. 20 Mar, 2012 7 commits
  3. 19 Mar, 2012 1 commit
  4. 18 Mar, 2012 3 commits
    • task/cgroup: delete job step memcg instead of using force_empty · a93afcd1
      Mark A. Grondona authored
      The current task/cgroup memory code writes to force_empty at job step
      completion and then waits for the release agent to be triggered to
      remove the memcg. However, force_empty only causes clean cache pages
      to be dropped from the memcg and does not actually move charges to
      the parent [1].
      
      This has two unfortunate side-effects. First, pages that can't be
      dropped by force_empty are in-use and could stay that way indefinitely
      (e.g. system library that is in-use until just after force_empty
      completes). Thus, the step memcg never becomes 'empty' and the release
      agent is not activated. Second, cached pages that can be freed are
      likely associated with the job itself, and those files and libraries
      will have to be paged in again for subsequent job steps.
      
      In contrast, calling rmdir(2) on a memcg with no active tasks
      causes *all* current charges to move to parent, which is really what
      we want in this case. This allows cached libraries and binaries to
      stay resident and be associated with the job, and also ensures that
      the step memcg is removed immediately as the job step ends.
      
      Thus, this patch replaces the write to force_empty with a call
      to xcgroup_delete() on the step memcg, which in turn removes
      the memcg with rmdir(2).
      
      The functionality of this patch depends on the previous fix that
      uses xcgroup_move_process() to move slurmstepd to the root memcg.
      Otherwise, there will be leftover slurmstepd threads in the job
      step memcg, and the rmdir will fail with EBUSY.
      
       [1] Sec 4.3: http://www.kernel.org/doc/Documentation/cgroups/memory.txt
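Removing a step cgroup with rmdir(2), as described above, might be sketched in C as follows. `remove_step_cgroup` is a hypothetical helper written for illustration; it is not xcgroup_delete(), and the caller is assumed to pass the path of an already task-free cgroup directory.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Remove a cgroup directory with rmdir(2).
 * Returns 0 on success, or the errno value on failure (for example
 * EBUSY when tasks are still attached to the cgroup).
 */
int remove_step_cgroup(const char *cg_path)
{
    int err;

    assert(cg_path != NULL);
    if (rmdir(cg_path) == 0)
        return 0;

    err = errno;
    fprintf(stderr, "rmdir(%s): %s\n", cg_path, strerror(err));
    return err;
}
```

A caller could treat EBUSY as a sign that some thread of slurmstepd was left behind in the step cgroup, which is the failure mode the commit message warns about.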
    • task/cgroup: use xcgroup_move_process to move slurmstepd to root memcg · 2dd13506
      Mark A. Grondona authored
      In task_cgroup_memory_fini() the implementation attempts to move
      the existing slurmstepd task to the root memory cgroup by writing
      the result of getpid(2) to the root memcg's 'tasks' file. This
      does not work, however, because slurmstepd is multi-threaded and
      thus only the main thread is moved.
      
      This patch replaces the explicit write to 'tasks' with a call to
      the new xcgroup_move_process() function, which handles moving all
      threads in the process.
    • xcgroup: add xcgroup_move_process helper function · aa912e4a
      Mark A. Grondona authored
      This patch adds a helper function to common/xcgroup.c to aid
      in moving processes between cgroups. If the cgroup.procs file
      is writable then writing the PID to that file is used, as this
      method moves all threads in a process atomically.
      
      If cgroup.procs is not writable, then each thread must be moved
      individually by walking the /proc/PID/task/ directory and writing
      each taskid individually to the 'tasks' file in the cgroup. The
      second method is racy if a process is concurrently creating
      threads, but it is better than the current method of just moving
      one of the process's threads.
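The two strategies described above can be sketched in C roughly as follows. This is a simplified illustration under stated assumptions, not the code in common/xcgroup.c: `move_process` and `write_id` are hypothetical names, the cgroup is identified by its directory path, and the thread list is read from /proc on Linux.

```c
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Write one numeric id to the file at path; return 0 on success, -1 on error. */
static int write_id(const char *path, long id)
{
    char buf[32];
    int len = snprintf(buf, sizeof(buf), "%ld", id);
    int fd = open(path, O_WRONLY);
    ssize_t w;

    if (fd < 0)
        return -1;
    w = write(fd, buf, (size_t)len);
    close(fd);
    return (w == (ssize_t)len) ? 0 : -1;
}

/* Move every thread of pid into the cgroup at cg_dir; 0 on success, -1 on error. */
int move_process(const char *cg_dir, pid_t pid)
{
    char path[4096];
    char task_dir[64];
    struct dirent *de;
    DIR *d;
    int rc = 0;

    assert(cg_dir != NULL);

    /* Preferred: one write to cgroup.procs moves all threads together. */
    snprintf(path, sizeof(path), "%s/cgroup.procs", cg_dir);
    if (access(path, W_OK) == 0)
        return write_id(path, (long)pid);

    /* Fallback: write each thread id found in /proc/PID/task to 'tasks'.
     * Threads created while this loop runs can be missed (the race
     * mentioned in the commit message). */
    snprintf(task_dir, sizeof(task_dir), "/proc/%ld/task", (long)pid);
    d = opendir(task_dir);
    if (d == NULL)
        return -1;

    snprintf(path, sizeof(path), "%s/tasks", cg_dir);
    while ((de = readdir(d)) != NULL) {
        long tid;

        if (de->d_name[0] == '.')
            continue;   /* skip "." and ".." */
        if (sscanf(de->d_name, "%ld", &tid) != 1 || write_id(path, tid) < 0)
            rc = -1;
    }
    closedir(d);
    return rc;
}
```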
  5. 16 Mar, 2012 12 commits