- 23 Mar, 2012 5 commits
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Merge branch 'spank-epilog-for-schedmd' of https://github.com/grondo/slurm into grondo-spank-epilog-for-schedmd
Conflicts: NEWS
-
Morris Jette authored
Fix bug in allocating GRES that are associated with specific CPUs. In some cases the code allocated the first available GRES to the job instead of allocating GRES accessible to the specific CPUs allocated to the job.
-
Morris Jette authored
-
- 22 Mar, 2012 11 commits
-
Morris Jette authored
-
Morris Jette authored
Mistakenly changed jobacct_gather_g_getinfo() into jobacct_gather_g_setinfo()
-
Morris Jette authored
This avoids conflicts with the "info" function.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This fixes a race condition in error handling logic added a couple of days ago for slurmd/slurmstepd communications in commit https://github.com/SchedMD/slurm/commit/ed31e6c7fdb5bcc1b0f0a8e3cbf5327604e64887
-
Morris Jette authored
-
Matthieu Hautreux authored
Access to a secured FS often requires a valid token in the user context. With SLURM, this token can be obtained using one of the available pluggable architectures, SPANK or PAM. The IO setup of SLURM may need to access a secured FS (stdout/stderr files). This patch ensures that the pluggable frameworks are activated and called prior to IO setup, and that IO is terminated before the pluggable frameworks' exit calls are made.
-
Matthieu Hautreux authored
Set PR_DUMPABLE as soon as possible, especially before any plugins are loaded. This will allow someone debugging to get a coredump.
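As an illustration of the step described above, the following is a minimal sketch of marking a process dumpable on Linux with prctl(2). The function name enable_core_dumps() is hypothetical and is not taken from the Slurm source. The sketch assumes a Linux system, where the kernel resets the dumpable flag when a process changes its effective user ID, so the flag has to be set again explicitly if core files are wanted.

```c
#include <stdio.h>
#include <sys/prctl.h>

/*
 * Hypothetical helper (not from the Slurm source): mark the calling
 * process as dumpable so the kernel is allowed to write a core file.
 * Linux-specific. Returns 0 on success, -1 on failure.
 */
int enable_core_dumps(void)
{
    if (prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) < 0) {
        perror("prctl(PR_SET_DUMPABLE)");
        return -1;
    }
    return 0;
}
```

Calling such a helper early in startup, before any plugin code runs, means a crash inside a plugin's initialization can still produce a core file, subject to the usual core size limits.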
-
Matthieu Hautreux authored
To prepare io_setup integration in _fork_all_tasks, error handling must be transformed to not always return SLURM_ERROR but be prepared to return SLURM_SUCCESS in case of an io_setup error.
-
- 21 Mar, 2012 24 commits
-
Mark A. Grondona authored
-
Mark A. Grondona authored
Update spank(8) man page with documentation of new functionality, including new callbacks: slurm_spank_slurmd_init, slurm_spank_slurmd_exit, slurm_spank_job_prolog, and slurm_spank_job_epilog, as well as the new spank_option_getopt() call for use in option processing by plugins.
-
Mark A. Grondona authored
Update spank header comments and documentation. Add #defines for new slurmd and job prolog/epilog contexts, so that their existence can be tested at compile time.
-
Mark A. Grondona authored
Add a new call to process spank options from a plugin. The spank_option_getopt() function will search the current spank environment for use of the option passed as an argument. The current option cache, and the local environment are checked for the use of the given spank option. This call is an alternative to use of a global variable in combination with the option callback, and is also needed for processing options in the isolated contexts of slurm_spank_job_prolog() and slurm_spank_job_epilog().
-
Mark A. Grondona authored
Add spank_clear_remote_options_env() to clear any spank options passed through the environment after they are no longer needed. This is done in slurmd after running the spank job prolog or epilog, as well as in the spank_post_opt function, after the env has been searched for spank variables.
-
Mark A. Grondona authored
Always set spank options in the environment and spank job environment to ensure that used options are propagated to the job prolog and epilog. (Previously, spank options were set in the environment only in allocator context.)
-
Mark A. Grondona authored
In slurmd and job prolog/epilog contexts, avoid loading plugins that have no callbacks in the context in which they are loaded. That is, for slurmd, if there are no slurm_spank_slurmd_init or slurm_spank_slurmd_exit callbacks, there is no reason to keep the current plugin loaded.
-
Mark A. Grondona authored
We now want to return an error on failure of either the spank prolog/epilog or the regular prolog/epilog scripts, so add a common function _run_job_script to handle return of a shared error code. For now, we continue to run the normal prolog or epilog even if the spank prolog/epilog fails. In the future, a failure of the spank prolog/epilog may short-circuit the run of the normal scripts.
-
Mark A. Grondona authored
Call spank_job_prolog() and spank_job_epilog() at prolog/epilog time by invoking "slurmstepd spank [prolog|epilog]". The prolog and epilog spank plugin hooks are not called within the virtual address space of slurmd for at least a couple of reasons, including:

 1. Plugins dlopened in the address space of slurmd cannot be dlopened a second time. Therefore, static and global state in the DSO may be "dirty" in that some state may be preserved from the last epilog or prolog call, or even from the slurmd_init callback.

 2. The prolog and epilog need to be guaranteed reentrant. The safest way to guarantee this is to ensure prolog/epilog hooks are called from a separate address space.

 3. To satisfy "principle of least surprise" we want to have new plugins installed run their prolog/epilog hooks on the next job, just as if an update to the prolog/epilog script was made. The only way to guarantee this is to reload the spank plugin stack from plugstack.conf on each run. Because of #1 above, this needs to be done in a separate process.
-
Mark A. Grondona authored
Greatly simplify the ability of code to get at the current slurmstepd path by setting slurmd's conf->stepd_loc to the default slurmstepd path if that path was not overridden on the command line. This allows slurmd code to directly use conf->stepd_loc, instead of requiring the duplicated code that created the default slurmstepd path if conf->stepd_loc was not set at each call site.
-
Mark A. Grondona authored
Make waitpid_timeout() return more quickly when the child exits before 1s but after the initial call to waitpid(2).
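The behaviour described above, returning soon after the child exits instead of sleeping for a full second, can be sketched with a short polling loop around waitpid(2) with WNOHANG. This is only an illustration under stated assumptions, not the code in src/slurmd/common/run_script.c: the name wait_with_timeout(), its signature, its return convention, and the 10 ms polling interval are all hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/*
 * Hypothetical sketch: wait up to timeout_ms milliseconds for child `pid`.
 * Returns pid if the child changed state (its status is stored in *status),
 * 0 if the timeout expired first, and -1 on error.
 * Elapsed time is approximated by counting sleep intervals.
 */
pid_t wait_with_timeout(pid_t pid, int timeout_ms, int *status)
{
    const int step_ms = 10;     /* polling interval, chosen for illustration */
    int waited_ms = 0;

    for (;;) {
        pid_t rc = waitpid(pid, status, WNOHANG);
        if (rc == pid)
            return pid;         /* child has exited or was signaled */
        if (rc < 0 && errno != EINTR)
            return -1;          /* e.g. ECHILD: not our child */
        if (waited_ms >= timeout_ms)
            return 0;           /* timed out, child still running */

        struct timespec ts = { 0, step_ms * 1000000L };
        nanosleep(&ts, NULL);
        waited_ms += step_ms;
    }
}
```

With a short interval the caller notices a child that exits a few milliseconds after the first waitpid() call almost immediately, at the cost of a few extra system calls. A caller that gets 0 back would typically decide whether to kill the child.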
-
Mark A. Grondona authored
Abstract the code for a waitpid(2) with timeout into a waitpid_timeout() function for future use from other callers. For now, the function goes into src/slurmd/common/run_script.c, since that is the original use of the functionality.
-
Mark A. Grondona authored
Add new handle_spank_mode() function in slurmstepd to handle when slurmstepd is called with "spank prolog" or "spank epilog". In this function, the slurmd_conf_lite is read to handle reinitializing the log facility as defined by slurmd config.
-
Mark A. Grondona authored
Factor out the read and write of the packed slurmd_conf_lite data between slurmd and slurmstepd. This simplifies the code in which that data is handled, and will allow for other callers in the future.
-
Mark A. Grondona authored
The spank_job_prolog() and spank_job_epilog() spank calls need to be run in a different address space from slurmd. This not only allows reinitializing the spank plugin stack on each run of the prolog or epilog, but also ensures that any static data in plugins (e.g. global variables) does not propagate to each invocation of the job prolog and epilog. Additionally, it is much safer to run these plugins in a new process because we may be calling prolog/epilog for multiple jobs at the same time. This patch runs spank_job_prolog() or spank_job_epilog() from slurmstepd when slurmstepd is invoked as "slurmstepd spank [prolog|epilog]". The environment variables SLURM_JOBID and SLURM_UID are used to set the jobid and uid for the prolog/epilog. Spank plugin options may also be passed through the current environment.
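The separate-process approach described above can be sketched as a fork/exec wrapper that passes the job ID and UID through the environment. This is an illustrative sketch, not the slurmd implementation: the function run_job_helper() and its signature are hypothetical, and the real code may build its environment and handle errors differently. Only the variable names SLURM_JOBID and SLURM_UID come from the commit message.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical sketch: run argv[0] in a fresh process with SLURM_JOBID
 * and SLURM_UID set in its environment. Returns the child's exit code,
 * 127 if the exec failed, or -1 on fork/wait failure or if the child
 * was terminated by a signal.
 */
int run_job_helper(char *const argv[], unsigned int job_id, unsigned int uid)
{
    char jobid_str[32], uid_str[32];
    snprintf(jobid_str, sizeof(jobid_str), "%u", job_id);
    snprintf(uid_str, sizeof(uid_str), "%u", uid);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: a new address space, so no plugin state is shared
         * with the parent. setenv() here assumes the parent is
         * single-threaded at the time of the fork. */
        setenv("SLURM_JOBID", jobid_str, 1);
        setenv("SLURM_UID", uid_str, 1);
        execv(argv[0], argv);
        _exit(127);             /* exec failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass an argument vector along the lines of { path_to_slurmstepd, "spank", "prolog", NULL }, where the path is whatever the daemon is configured with. Because every invocation starts from a clean process, concurrent prologs and epilogs for different jobs cannot see each other's static data.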
-
Mark A. Grondona authored
Move special handling of slurmstepd cmdline to a function for future expansion.
-
Mark A. Grondona authored
Add slurm_spank_job_prolog and slurm_spank_job_epilog callbacks to the spank API, to be called just before the job prolog and epilog scripts are executed. These callbacks are not active until the hooks spank_job_prolog and spank_job_epilog are added to slurmd.
-
Mark A. Grondona authored
Add new spank context "job script" for use during job prolog/epilog.
-
Mark A. Grondona authored
-
Mark A. Grondona authored
Add support for slurm_spank_slurmd_init and slurm_spank_slurmd_exit symbols in spank plugins, to be called at slurmd startup and shutdown. These are not functional until slurmd calls spank_slurmd_init() and spank_slurmd_exit().
-
Mark A. Grondona authored
Currently spank_get_item and spank_job_control* are not valid in slurmd context. Handle this case in the relevant functions.
-
Mark A. Grondona authored
Prepare for spank plugins run in the context of slurmd daemon by adding a new S_CTX_SLURMD context type.
-
Mark A. Grondona authored
The spank_set_remote_options_env() function is not used anywhere except internal to plugstack.c, so remove it from plugstack.h. Then redefine it to take a spank_stack argument so that it doesn't refer to the global_spank_stack. Finally rename to spank_stack_set_remote_options_env() to clarify the intent.
-
Mark A. Grondona authored
Refactor the post_opt handling code embedded in _spank_init() into a spank_stack_post_opt() function, then call this in remote context from a new spank_init_remote() function.
-