- 21 Mar, 2012 34 commits
-
-
Mark A. Grondona authored
Add new handle_spank_mode() function in slurmstepd to handle the case where slurmstepd is invoked with "spank prolog" or "spank epilog". In this function, slurmd_conf_lite is read so that the log facility can be reinitialized as defined by the slurmd config.
-
Mark A. Grondona authored
Factor out the read and write of the packed slurmd_conf_lite data between slurmd and slurmstepd. This simplifies the code in which that data is handled, and will allow for other callers in the future.
-
Mark A. Grondona authored
The spank_job_prolog() and spank_job_epilog() spank calls need to be run in a different address space from slurmd. This not only allows reinitializing the spank plugin stack on each run of the prolog or epilog, but also ensures that any static data in plugins (e.g. global variables) does not propagate to each invocation of the job prolog and epilog. Additionally, it is much safer to run these plugins in a new process because we may be calling prolog/epilog for multiple jobs at the same time. This patch runs spank_job_prolog() or spank_job_epilog() from slurmstepd when slurmstepd is invoked as: slurmstepd spank [prolog|epilog]. The environment variables SLURM_JOBID and SLURM_UID are used to set the jobid and uid for the prolog/epilog. Spank plugin options may also be passed through the current environment.
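As a rough illustration of how a helper process might pick up the job ID and UID from SLURM_JOBID and SLURM_UID, here is a minimal sketch. The function names parse_u32() and read_job_env() are hypothetical and are not taken from the slurmstepd source; the actual parsing and error handling there may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal string into a uint32_t.
 * Returns 0 on success, -1 if the value is missing or malformed. */
static int parse_u32(const char *val, uint32_t *out)
{
    char *end;
    unsigned long n;

    assert(out != NULL);
    if (val == NULL || *val < '0' || *val > '9')
        return -1;              /* unset, empty, or not starting with a digit */
    errno = 0;
    n = strtoul(val, &end, 10);
    if (errno != 0 || *end != '\0' || n > UINT32_MAX)
        return -1;              /* overflow or trailing garbage */
    *out = (uint32_t)n;
    return 0;
}

/* Hypothetical helper: read the job ID and UID from the environment
 * variables named in the commit message above. */
static int read_job_env(uint32_t *jobid, uint32_t *uid)
{
    if (parse_u32(getenv("SLURM_JOBID"), jobid) < 0)
        return -1;
    return parse_u32(getenv("SLURM_UID"), uid);
}
```

Passing the values through the environment keeps the command line of the helper process fixed, which fits the "slurmstepd spank [prolog|epilog]" invocation described above.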
-
Mark A. Grondona authored
Move special handling of slurmstepd cmdline to a function for future expansion.
-
Mark A. Grondona authored
Add slurm_spank_job_prolog and slurm_spank_job_epilog callbacks to the spank API, to be called just before the job prolog and epilog scripts are executed. These callbacks are not active until the hooks spank_job_prolog and spank_job_epilog are added to slurmd.
-
Mark A. Grondona authored
Add new spank context "job script" for use during job prolog/epilog.
-
Mark A. Grondona authored
-
Mark A. Grondona authored
Add support for slurm_spank_slurmd_init and slurm_spank_slurmd_exit symbols in spank plugins, to be called at slurmd startup and shutdown. These are not functional until slurmd calls spank_slurmd_init() and spank_slurmd_exit().
-
Mark A. Grondona authored
Currently spank_get_item and spank_job_control* are not valid in slurmd context. Handle this case in the relevant functions.
-
Mark A. Grondona authored
Prepare for spank plugins run in the context of slurmd daemon by adding a new S_CTX_SLURMD context type.
-
Mark A. Grondona authored
The spank_set_remote_options_env() function is not used anywhere except internal to plugstack.c, so remove it from plugstack.h. Then redefine it to take a spank_stack argument so that it doesn't refer to the global_spank_stack. Finally rename to spank_stack_set_remote_options_env() to clarify the intent.
-
Mark A. Grondona authored
Refactor the post_opt handling code embedded in _spank_init() into a spank_stack_post_opt() function, then call this in remote context from a new spank_init_remote() function.
-
Mark A. Grondona authored
Instead of trying to handle missing plugstack.conf early in the code, just treat missing plugstack.conf the same as an empty config.
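The idea of treating a missing config file like an empty one can be sketched as follows. This is only an illustration: config_count_entries() is a hypothetical name, and the line counting is a simplification (lines longer than the buffer would be counted more than once), not the parser used in plugstack.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical loader: returns the number of non-blank, non-comment lines.
 * A file that does not exist is reported as 0 entries, exactly like an
 * empty file; any other open failure is reported as -1. */
static int config_count_entries(const char *path)
{
    FILE *fp;
    char line[1024];
    int n = 0;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return (errno == ENOENT) ? 0 : -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        if (line[0] != '\n' && line[0] != '#')
            n++;
    }
    fclose(fp);
    return n;
}
```

With this shape, callers no longer need a separate early check for the missing-file case; they simply see an empty configuration.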
-
Mark A. Grondona authored
Move struct spank_stack initialization code into a spank_stack_init() function so that it can be called from multiple call sites.
-
Mark A. Grondona authored
Simplify code in _do_call_stack() by extracting case statement to assign current callback symbol to its own function. Since all spank functions have the same prototype we can then use the same code to call _all_ callbacks, reducing greatly the number of lines of code required.
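A minimal sketch of the pattern described above: because every callback shares one prototype, a single lookup function can return the right function pointer and one code path can invoke it. All names here (cb_type, plugin_ops, get_callback, call_one, sample_prolog) are hypothetical and do not reflect the actual identifiers in plugstack.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback identifiers. */
enum cb_type { CB_INIT, CB_JOB_PROLOG, CB_JOB_EPILOG, CB_FINI };

/* One prototype shared by every callback. */
typedef int (*cb_fn)(void *handle, int argc, char **argv);

struct plugin_ops {
    cb_fn init;
    cb_fn job_prolog;
    cb_fn job_epilog;
    cb_fn fini;
};

/* The case statement lives here, and only here. */
static cb_fn get_callback(const struct plugin_ops *ops, enum cb_type type)
{
    assert(ops != NULL);
    switch (type) {
    case CB_INIT:       return ops->init;
    case CB_JOB_PROLOG: return ops->job_prolog;
    case CB_JOB_EPILOG: return ops->job_epilog;
    case CB_FINI:       return ops->fini;
    default:            return NULL;
    }
}

/* Single call path for all callback types; a missing callback is a no-op. */
static int call_one(const struct plugin_ops *ops, enum cb_type type,
                    void *handle, int argc, char **argv)
{
    cb_fn fn = get_callback(ops, type);
    return fn ? fn(handle, argc, argv) : 0;
}

/* Example callback used to exercise the dispatch. */
static int sample_prolog(void *handle, int argc, char **argv)
{
    (void)handle; (void)argc; (void)argv;
    return 7;
}
```

Adding a new callback type then means one new enum value, one new struct member, and one new case, rather than another copy of the calling code.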
-
Mark A. Grondona authored
Consolidate common code in spank_getenv, spank_setenv, spank_unsetenv which checks for validity of the current context, spank handle, etc.
-
Mark A. Grondona authored
Consolidate checks for correct spank context in spank_job_control* functions to avoid code duplication.
-
Mark A. Grondona authored
The use of globals in plugstack.c is cumbersome and prevents the future expansion of spank plugins, e.g. calling spank plugins from multiple contexts within the same process or reinitializing the spank plugin state. This patch consolidates the current globals (spank_stack, spank_ctx, spank_optval, and option_cache) into a global "spank stack" structure and expands many of the functions internal to plugstack.c to operate on a struct spank_stack instead of globally.
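The general shape of this kind of refactoring can be sketched as below: state that used to live in file-scope globals moves into a structure, and functions take a pointer to it. The struct layout and names (plugin_stack, plugin_stack_init, plugin_stack_register) are hypothetical and much simpler than the real struct spank_stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context values. */
enum stack_ctx { CTX_LOCAL, CTX_REMOTE, CTX_SLURMD };

/* State that might otherwise be several separate globals. */
struct plugin_stack {
    enum stack_ctx ctx;   /* context this stack was created for */
    size_t n_plugins;     /* number of registered plugins */
};

/* Initialize a stack; callable from multiple call sites. */
static void plugin_stack_init(struct plugin_stack *s, enum stack_ctx ctx)
{
    assert(s != NULL);
    s->ctx = ctx;
    s->n_plugins = 0;
}

/* Operates on the stack passed in rather than on global state. */
static void plugin_stack_register(struct plugin_stack *s)
{
    assert(s != NULL);
    s->n_plugins++;
}
```

Because each function works on the stack it is given, two stacks with different contexts can coexist in one process, and a stack can be reinitialized without touching any other.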
-
Mark A. Grondona authored
There was likely a typo/thinko/patcho in the handling of the return code from _do_call_stack(SPANK_INIT_POST_OPT) in _spank_init in "remote" context. This error caused spank_init() to always succeed, since the less-than-zero test always evaluates to 0 or 1 and the result can therefore never be negative.
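One way this class of bug can arise is shown below. This is an assumed reconstruction for illustration only; the exact expression in the original _spank_init() is not given in the commit message, and failing_call(), init_buggy(), and init_fixed() are hypothetical names.

```c
#include <assert.h>

/* Hypothetical stand-in for a call that reports failure as -1. */
static int failing_call(void)
{
    return -1;
}

/* Buggy: rc stores the result of the comparison (0 or 1), not the
 * return value of the call, so the later check can never fire. */
static int init_buggy(void)
{
    int rc = failing_call() < 0;
    if (rc < 0)
        return -1;
    return 0;   /* always reached, even though the call failed */
}

/* Fixed: store the return value first, then compare it. */
static int init_fixed(void)
{
    int rc = failing_call();
    if (rc < 0)
        return -1;
    return 0;
}
```

Here init_buggy() reports success while init_fixed() propagates the failure.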
-
Mark A. Grondona authored
Avoid loading the same plugin more than once in plugstack.c. Most likely this will be a configuration error, so we should catch it early. If the same .so appears in the plugin stack more than once, it is likely to cause very strange errors, since dlopen() will only map the library a single time.
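A duplicate check of this kind might look like the following sketch. It assumes plugins are identified by their path string; the names (plugin_list, plugin_list_add, MAX_PLUGINS) are hypothetical, and the real code in plugstack.c may identify plugins differently.

```c
#include <assert.h>
#include <string.h>

#define MAX_PLUGINS 16

/* Hypothetical list of already-loaded plugin paths. */
struct plugin_list {
    const char *paths[MAX_PLUGINS];
    size_t count;
};

/* Returns 0 if the path was added, -1 if it is already present
 * (a likely configuration error) or the list is full. */
static int plugin_list_add(struct plugin_list *list, const char *path)
{
    size_t i;

    assert(list != NULL && path != NULL);
    for (i = 0; i < list->count; i++) {
        if (strcmp(list->paths[i], path) == 0)
            return -1;          /* same .so listed twice */
    }
    if (list->count == MAX_PLUGINS)
        return -1;
    list->paths[list->count++] = path;
    return 0;
}
```

Rejecting the duplicate at load time surfaces the configuration mistake immediately, instead of leaving two stack entries sharing one mapped library and its static data.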
-
Morris Jette authored
Change the owner of slurmctld and slurmdbd log files to the appropriate user. Without this change the files will be created by and owned by the user starting the daemons (likely user root).
-
Morris Jette authored
-
Morris Jette authored
CRAY: Fix support for configuration with SlurmdTimeout=0 (never mark node that is DOWN in ALPS as DOWN in SLURM).
-
Morris Jette authored
-
Morris Jette authored
In the tightly coupled functions slurmd:stepd_completion and slurmstepd:_handle_completion, a jobacct structure is sent from the main daemon to the step daemon to provide the statistics of the children slurmstepd and do the aggregation. The structure is sent using jobacct_gather_g_{setinfo,getinfo} over a pipe (JOBACCT_DATA_PIPE). Because {setinfo,getinfo} use a common internal lock, and reading from or writing to a pipe is equivalent to holding a lock, slurmd and slurmstepd must avoid using both setinfo and getinfo over a pipe, or deadlocks can occur. For example: slurmd(lockforread,write)/slurmstepd(write,lockforread). This patch removes the call to jobacct_gather_g_setinfo in slurmd and the call to jobacct_gather_g_getinfo in slurmstepd, ensuring that slurmd only does getinfo operations over a pipe and slurmstepd only does setinfo operations over a pipe. Instead, jobacct_gather_g_{pack,unpack} are used to marshal/unmarshal the data for transmission over the pipe. Patch by Matthieu Hautreux, CEA. The patch committed here is a variation on the work by Matthieu. Specifically, logic is added to slurmstepd to read a new RPC format that includes an RPC version number and a buffer with the data structure. However, slurmd will not send the RPC in the new format until SLURM version 2.5.
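The marshalling approach can be sketched as follows: the data is packed into a plain buffer, together with a version number, before any pipe I/O happens, so no shared lock needs to be held while reading or writing. Everything here is a simplified assumption: struct acct, MSG_VERSION, and the acct_* functions are hypothetical and are not the jobacct_gather_g_{pack,unpack} API. The round trip is done inside one process purely for demonstration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the accounting data. */
struct acct {
    uint32_t max_rss;
    uint32_t cpu_sec;
};

#define MSG_VERSION 1u
#define MSG_SIZE (3 * sizeof(uint32_t))   /* version + two fields */

/* Pack into a flat buffer (host byte order; both ends are on one node). */
static void acct_pack(const struct acct *a, unsigned char *buf)
{
    uint32_t fields[3] = { MSG_VERSION, a->max_rss, a->cpu_sec };
    assert(a != NULL && buf != NULL);
    memcpy(buf, fields, MSG_SIZE);
}

/* Unpack, rejecting messages with an unexpected version number. */
static int acct_unpack(struct acct *a, const unsigned char *buf)
{
    uint32_t fields[3];
    memcpy(fields, buf, MSG_SIZE);
    if (fields[0] != MSG_VERSION)
        return -1;
    a->max_rss = fields[1];
    a->cpu_sec = fields[2];
    return 0;
}

/* Demo: send the packed message through a pipe and read it back. */
static int acct_roundtrip(const struct acct *in, struct acct *out)
{
    int fds[2];
    unsigned char wbuf[MSG_SIZE], rbuf[MSG_SIZE];
    ssize_t nw, nr = -1;

    if (pipe(fds) < 0)
        return -1;
    acct_pack(in, wbuf);                  /* no pipe I/O while packing */
    nw = write(fds[1], wbuf, MSG_SIZE);
    if (nw == (ssize_t)MSG_SIZE)
        nr = read(fds[0], rbuf, MSG_SIZE);
    close(fds[0]);
    close(fds[1]);
    if (nr != (ssize_t)MSG_SIZE)
        return -1;
    return acct_unpack(out, rbuf);
}
```

Leading the message with a version number is what lets the reader accept a new format before the writer starts sending it, in the spirit of the staged rollout described above.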
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Replace some " \t" with just "\t" (that's a tab)
-
- 20 Mar, 2012 6 commits
-
-
Morris Jette authored
Improve support for overlapping advanced reservations. Patch from Bill Brophy, Bull.
-
Morris Jette authored
-
Morris Jette authored
Added PriorityFlags configuration parameter
-
Morris Jette authored
task/cgroup: minor job step memcg fixes
-
Morris Jette authored
Improve task binding logic by making fuller use of HWLOC library, especially with respect to Opteron 6000 series processors. Work contributed by Komoto Masahiro.
-
Carles Fenoy authored
-