1. 21 Mar, 2012 37 commits
    • slurmd: Always set conf->stepd_loc to slurmstepd path · 6722705f
      Mark A. Grondona authored
      Greatly simplify how code obtains the current slurmstepd path
      by setting slurmd's conf->stepd_loc to the default slurmstepd path
      if that path was not overridden on the command line.
      
      This allows slurmd code to directly use conf->stepd_loc, instead of
      requiring the duplicated code that created the default slurmstepd
      path if conf->stepd_loc was not set at each call site.
    • use exponential backoff in waitpid_timeout · 8e201175
      Mark A. Grondona authored
      Make waitpid_timeout() return more quickly when the child exits before
      1s but after the initial call to waitpid(2).
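
      The backoff idea can be sketched as follows. This is a hypothetical
      illustration, not the code from this commit: the function name
      waitpid_backoff(), the 10 ms starting delay, the 1 s cap, and the
      spawn_and_wait() helper are all assumptions. Any action taken on
      timeout (such as killing the child) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical sketch: poll waitpid(2) with WNOHANG and sleep between
 * polls. The sleep starts at 10 ms (assumed) and doubles up to a 1 s cap,
 * so a child that exits early is reaped quickly.
 * Returns pid once the child has been reaped, 0 on timeout, -1 on error.
 */
static pid_t waitpid_backoff(pid_t pid, int *status, int timeout_ms)
{
	int delay_ms = 10;
	int waited_ms = 0;

	assert(timeout_ms >= 0);

	for (;;) {
		pid_t rc = waitpid(pid, status, WNOHANG);
		int nap_ms;
		struct timespec ts;

		if (rc == pid)
			return pid;
		if (rc < 0 && errno != EINTR)
			return -1;
		if (waited_ms >= timeout_ms)
			return 0;

		/* Never sleep past the deadline. */
		nap_ms = delay_ms;
		if (nap_ms > timeout_ms - waited_ms)
			nap_ms = timeout_ms - waited_ms;
		ts.tv_sec = nap_ms / 1000;
		ts.tv_nsec = (nap_ms % 1000) * 1000000L;
		/* An interrupted sleep is counted in full; good enough here. */
		nanosleep(&ts, NULL);
		waited_ms += nap_ms;

		delay_ms *= 2;
		if (delay_ms > 1000)
			delay_ms = 1000;
	}
}

/* Usage: fork a child that exits at once with `code`, then reap it. */
static int spawn_and_wait(int code, int timeout_ms)
{
	int status = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(code);
	if (waitpid_backoff(pid, &status, timeout_ms) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

      With a fixed 1 s sleep, a child that exits just after the first
      poll is noticed about a second late. With the doubling delay it is
      noticed after a few tens of milliseconds.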
    • abstract timed waitpid from run_script to separate function · 08162bb1
      Mark A. Grondona authored
      Abstract the code for a waitpid(2) with timeout into a waitpid_timeout()
      function for future use from other callers. For now, the function goes
      into src/slurmd/common/run_script.c, since that is the original use
      of the functionality.
    • slurmstepd: refactor spank prolog/epilog code · e409986a
      Mark A. Grondona authored
      Add new handle_spank_mode() function in slurmstepd to handle
      when slurmstepd is called with "spank prolog" or "spank epilog".
      In this function, the slurmd_conf_lite is read to handle reinitializing
      the log facility as defined by slurmd config.
    • slurmd/slurmstepd: factor out read/write of slurmd_conf_lite · 00e71ef3
      Mark A. Grondona authored
      Factor out the read and write of the packed slurmd_conf_lite
      data between slurmd and slurmstepd. This simplifies the code
      in which that data is handled, and will allow for other callers
      in the future.
    • slurmstepd: Add new mode to run spank job prolog/epilog · 1e01c729
      Mark A. Grondona authored
      The spank_job_prolog() and spank_job_epilog() spank calls need
      to be run in a different address space from slurmd. This not only allows
      reinitializing the spank plugin stack on each run of the prolog or
      epilog, but also ensures that any static data in plugins does not
      propagate to each invocation of the job prolog and epilog (e.g. global
      variables). Additionally, it is much safer to run these plugins
      in a new process because we may be calling prolog/epilog for multiple
      jobs at the same time.
      
      This patch runs spank_job_prolog() or spank_job_epilog() from slurmstepd
      when slurmstepd is invoked as
      
       slurmstepd spank [prolog|epilog]
      
      The environment variables SLURM_JOBID and SLURM_UID are used to set
      the jobid and uid for the prolog/epilog. Spank plugin options may
      also be passed through the current environment.
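
      A minimal sketch of how a process started in this mode might read
      the two ids from its environment. Only the variable names
      SLURM_JOBID and SLURM_UID come from the message above. The helper
      names and the struct are hypothetical, and the actual slurmstepd
      code may parse and validate these values differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse an unsigned 32-bit decimal value from environment variable `name`.
 * Returns 0 on success, -1 if the variable is unset, empty, malformed,
 * or out of range.
 */
static int env_to_u32(const char *name, uint32_t *out)
{
	const char *val;
	char *end = NULL;
	unsigned long n;

	assert(name != NULL && out != NULL);

	val = getenv(name);
	/* Require a leading digit: strtoul() alone accepts "-1" and " 5". */
	if (val == NULL || val[0] < '0' || val[0] > '9')
		return -1;
	errno = 0;
	n = strtoul(val, &end, 10);
	if (errno != 0 || *end != '\0' || n > UINT32_MAX)
		return -1;
	*out = (uint32_t) n;
	return 0;
}

/* Hypothetical container for the values taken from the environment. */
struct job_script_ids {
	uint32_t jobid;
	uint32_t uid;
};

/* Returns 0 if both SLURM_JOBID and SLURM_UID are present and valid. */
static int read_job_script_ids(struct job_script_ids *ids)
{
	if (env_to_u32("SLURM_JOBID", &ids->jobid) < 0)
		return -1;
	return env_to_u32("SLURM_UID", &ids->uid);
}
```

      A caller would export both variables before running
      "slurmstepd spank prolog" and treat a failed parse as a fatal
      error for that prolog or epilog run.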
    • slurmstepd: Move handling of cmdline to a function · a136a5ab
      Mark A. Grondona authored
      Move special handling of slurmstepd cmdline to a function for
      future expansion.
    • spank: add prolog and epilog callbacks to spank api · d3a6ec23
      Mark A. Grondona authored
      Add slurm_spank_job_prolog and slurm_spank_job_epilog callbacks
      to the spank API, to be called just before the job prolog and epilog
      scripts are executed.
      
      These callbacks are not active until the hooks spank_job_prolog and
      spank_job_epilog are added to slurmd.
    • spank: Add S_TYPE_JOB_SCRIPT context for prolog/epilog · 21773e76
      Mark A. Grondona authored
      Add new spank context "job script" for use during job prolog/epilog.
    • d405c1ed
    • spank: Add spank callbacks for slurmd · 069b164c
      Mark A. Grondona authored
      Add support for slurm_spank_slurmd_init and slurm_spank_slurmd_exit
      symbols in spank plugins, to be called at slurmd startup and shutdown.
      
      These are not functional yet until slurmd calls spank_slurmd_init()
      and spank_slurmd_exit().
    • spank: handle slurmd context in some callbacks · 63765b58
      Mark A. Grondona authored
      Currently spank_get_item and spank_job_control* are not valid in
      slurmd context. Handle this case in relevant functions.
    • spank: Add context type for slurmd · ab388e1e
      Mark A. Grondona authored
      Prepare for spank plugins run in the context of slurmd daemon by
      adding a new S_CTX_SLURMD context type.
    • spank: remove spank_set_remote_options_env · d436efcd
      Mark A. Grondona authored
      The spank_set_remote_options_env() function is not used anywhere except
      internal to plugstack.c, so remove it from plugstack.h. Then redefine
      it to take a spank_stack argument so that it doesn't refer to the
      global_spank_stack. Finally rename to spank_stack_set_remote_options_env()
      to clarify the intent.
    • spank: refactor initialization code · 66cfa45a
      Mark A. Grondona authored
      Refactor the post_opt handling code embedded in _spank_init() into
      a spank_stack_post_opt() function, then call this in remote context
      from a new spank_init_remote() function.
    • spank: handle missing plugstack.conf · 3344092a
      Mark A. Grondona authored
      Instead of trying to handle missing plugstack.conf early in the code,
      just treat missing plugstack.conf the same as an empty config.
    • spank: abstract spank_stack initialization code · 443aee4d
      Mark A. Grondona authored
      Move struct spank_stack initialization code into a spank_stack_init()
      function so that it can be called from multiple call sites.
    • spank: consolidate common code in _do_call_stack · e4e3baab
      Mark A. Grondona authored
      Simplify code in _do_call_stack() by extracting case statement
      to assign current callback symbol to its own function. Since all
      spank functions have the same prototype we can then use the same
      code to call _all_ callbacks, greatly reducing the number of lines
      of code required.
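
      The pattern described above might look roughly like this. All
      names here (callback_f, cb_type, plugin_ops, lookup_callback,
      do_call_stack) are hypothetical, and a single int * argument
      stands in for the real SPANK callback arguments. The point is only
      that one lookup function plus one shared prototype removes the
      per-callback call code.

```c
#include <assert.h>
#include <stddef.h>

/* Every callback shares one prototype, which is what makes this work. */
typedef int (*callback_f)(int *counter);

enum cb_type { CB_INIT, CB_POST_OPT, CB_FINI };

/* Callbacks one plugin provides; NULL means "not defined". */
struct plugin_ops {
	callback_f init;
	callback_f post_opt;
	callback_f fini;
};

/* The case statement lives here once, instead of at every call. */
static callback_f lookup_callback(const struct plugin_ops *ops,
				  enum cb_type type)
{
	switch (type) {
	case CB_INIT:     return ops->init;
	case CB_POST_OPT: return ops->post_opt;
	case CB_FINI:     return ops->fini;
	}
	return NULL;
}

/*
 * Call callback `type` in each plugin in order. Stops at the first
 * negative return code and passes it back; returns 0 otherwise.
 */
static int do_call_stack(const struct plugin_ops *stack, size_t n,
			 enum cb_type type, int *counter)
{
	size_t i;

	assert(stack != NULL || n == 0);

	for (i = 0; i < n; i++) {
		callback_f fn = lookup_callback(&stack[i], type);
		int rc;

		if (fn == NULL)
			continue;	/* plugin lacks this callback */
		rc = fn(counter);
		if (rc < 0)
			return rc;
	}
	return 0;
}

/* Example callbacks used to exercise the dispatcher. */
static int bump(int *counter) { (*counter)++; return 0; }
static int fail(int *counter) { (void) counter; return -1; }
```

      Adding a new callback type then means one new enum value, one new
      struct member, and one new case label, with no new call code.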
    • spank: consolidate checks for spank_get/set/unsetenv calls · 61cd1115
      Mark A. Grondona authored
      Consolidate common code in spank_getenv, spank_setenv, spank_unsetenv
      which checks for validity of the current context, spank handle, etc.
    • spank: consolidate error checks in job control functions · c3227f9a
      Mark A. Grondona authored
      Consolidate checks for correct spank context in spank_job_control*
      functions to avoid code duplication.
    • spank: consolidate globals in plugstack.c · 2eb0b999
      Mark A. Grondona authored
      The use of globals in plugstack.c is cumbersome and prevents the
      future expansion of spank plugins, e.g. calling spank plugins from
      multiple contexts within the same process or reinitializing the
      spank plugin state.
      
      This patch consolidates the current globals (spank_stack, spank_ctx,
      spank_optval, and option_cache) into a global "spank stack" structure
      and expands many of the functions internal to plugstack.c to operate
      on a struct spank_stack instead of globally.
    • spank: fix handling of remote spank_init_post_opt · 3a522459
      Mark A. Grondona authored
      There was likely a typo/thinko/patcho in the handling of the
      return code from _do_call_stack(SPANK_INIT_POST_OPT) in _spank_init
      in "remote" context. This error caused spank_init() to always
      succeed, since the less-than-zero test itself evaluates to 0 or 1,
      never to a negative value.
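
      The commit message does not show the original expression, so the
      following is only one plausible shape of this class of bug, with
      hypothetical function names. It shows why returning the result of
      a "< 0" comparison can never signal failure to a caller that
      checks for a negative value.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a step that fails with a negative return code. */
static int failing_step(void) { return -1; }

/*
 * Buggy shape (assumed): the comparison result is returned, so the
 * caller sees 0 or 1 and never a negative error code.
 */
static int init_buggy(int (*step)(void))
{
	assert(step != NULL);
	return step() < 0;
}

/* Fixed shape: test the return code, then propagate a negative value. */
static int init_fixed(int (*step)(void))
{
	assert(step != NULL);
	if (step() < 0)
		return -1;
	return 0;
}
```

      A caller written as "if (init(...) < 0) fail;" treats the buggy
      variant as success even when the step failed.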
    • spank: refuse to load the same plugin more than once · 7a60bf95
      Mark A. Grondona authored
      Avoid loading the same plugin more than once in plugstack.c.
      Most likely this will be a configuration error, so we should
      catch it early. If the same .so appears in the plugin stack
      more than once, it is likely to cause very strange errors,
      since dlopen() will only map the library a single time.
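
      A duplicate check of this kind can be sketched as below. The
      struct, the fixed-size array, and the function name are
      hypothetical. Comparing path strings is an assumption of this
      sketch; it would not catch the same .so reached through two
      different paths or a symlink.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PLUGINS 16

/* Hypothetical record of plugin paths already loaded into the stack. */
struct plugin_list {
	const char *paths[MAX_PLUGINS];
	size_t count;
};

/*
 * Record `path` as loaded. Returns 0 on success, or -1 if the same path
 * is already present or the list is full. The caller keeps `path` alive.
 */
static int plugin_list_add(struct plugin_list *list, const char *path)
{
	size_t i;

	assert(list != NULL && path != NULL);

	for (i = 0; i < list->count; i++) {
		if (strcmp(list->paths[i], path) == 0)
			return -1;	/* refuse the duplicate */
	}
	if (list->count == MAX_PLUGINS)
		return -1;
	list->paths[list->count++] = path;
	return 0;
}
```

      Running the check while the configuration is parsed reports the
      mistake at load time rather than as odd behavior later.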
    • change owner of slurmctld and slurmdbd log files · 3470c651
      Morris Jette authored
      Change the owner of slurmctld and slurmdbd log files to the appropriate
      user. Without this change the files will be created by and owned by the
      user starting the daemons (likely user root).
    • Merge branch 'slurm-2.3' · e78802d3
      Morris Jette authored
    • CRAY: Fix support for SlurmdTimeout=0 · 4dd9e697
      Morris Jette authored
      CRAY: Fix support for configuration with SlurmdTimeout=0 (never mark
      node that is DOWN in ALPS as DOWN in SLURM).
    • Add delay to test for job info propagation · 7636f0f2
      Morris Jette authored
    • Modify the step completion RPC between slurmd and slurmstepd · ed31e6c7
      Morris Jette authored
      In the tightly coupled functions slurmd:stepd_completion and
      slurmstepd:_handle_completion, a jobacct structure is
      sent from the main daemon to the step daemon to provide
      the statistics of the children slurmstepd and do the aggregation.
      
      The structure is sent using
      jobacct_gather_g_{setinfo,getinfo} over a pipe (JOBACCT_DATA_PIPE).
      As {setinfo,getinfo} use a common internal lock and reading
      or writing to a pipe is equivalent to holding a lock, slurmd and
      slurmstepd have to avoid using both setinfo and getinfo over a
      pipe or deadlock situations can occur. For example:
      slurmd(lockforread,write)/slurmstepd(write,lockforread).
      
      This patch removes the call to jobacct_gather_g_setinfo in slurmd
      and the call to jobacct_gather_g_getinfo in slurmstepd, ensuring
      that slurmd only does getinfo operations over a pipe and slurmstepd
      only does setinfo over a pipe. Instead, jobacct_gather_g_{pack,unpack}
      are used to marshall/unmarshall the data for transmission over the
      pipe.
      Patch by Matthieu Hautreux, CEA.
      
      The patch committed here is a variation on the work by Matthieu.
      Specifically, the logic is added to slurmstepd to read a new format
      of RPC including an RPC version number and buffer with the data
      structure. The slurmd however will not send the RPC in the new format
      until SLURM version 2.5.
    • Add possible reason for failure to test · 3bdcf40f
      Morris Jette authored
    • Merge branch 'slurm-2.3' · 644fc9a7
      Morris Jette authored
    • Minor test mods for old RedHat distro · 455283c2
      Morris Jette authored
    • Merge branch 'slurm-2.3' · f23f6ccc
      Morris Jette authored
    • make test work better on different systems · 47aebf2c
      Morris Jette authored
    • result of autogen.sh · 304cccb6
      Morris Jette authored
    • Merge branch 'slurm-2.3' · 8c905e93
      Morris Jette authored
    • Modify Makefiles to support Hardening flags · a7e89e72
      Morris Jette authored
    • Cosmetic mods · fb4cabaa
      Morris Jette authored
      Replace some " \t" with just "\t" (that's a tab)
  2. 20 Mar, 2012 3 commits