- 25 Aug, 2015 12 commits
-
-
Morris Jette authored
-
Morris Jette authored
Fix shutdown race condition that could cause the plugin to deadlock. Improve validation of burst buffer options. Hold a job whose burst_buffer specification is later discovered to be bad.
-
Morris Jette authored
-
Danny Auble authored
intended for the head of the tree.
-
Morris Jette authored
Handle out-of-band (i.e. outside of Slurm) persistent burst buffer creation/deletion with respect to their reservations, as used for limit enforcement.
-
Douglas Jacobsen authored
-
Nathan Yee authored
-
David Bigagli authored
-
Morris Jette authored
Avoid creating a duplicate burst buffer record due to a race condition between the "create_persistent" thread and the "load_state" thread.
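One common way to close this kind of race is to do the lookup and the insert under a single lock, so that whichever thread arrives second finds the existing record instead of adding its own. The sketch below illustrates that pattern only; the record type, the list, and the names bb_rec_t, bb_find_or_create() and bb_record_count() are hypothetical and are not taken from the Slurm source.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type for illustration; not Slurm's actual structure. */
typedef struct bb_rec {
	char name[64];
	struct bb_rec *next;
} bb_rec_t;

static pthread_mutex_t bb_mutex = PTHREAD_MUTEX_INITIALIZER;
static bb_rec_t *bb_list = NULL;
static int bb_count = 0;

/* Return the record for "name", creating it only if it is absent.
 * The search and the insert share one critical section, so two threads
 * racing on the same name cannot both add a record. */
bb_rec_t *bb_find_or_create(const char *name)
{
	bb_rec_t *rec;

	assert(name != NULL);
	pthread_mutex_lock(&bb_mutex);
	for (rec = bb_list; rec; rec = rec->next) {
		if (strcmp(rec->name, name) == 0)
			break;
	}
	if (!rec) {
		/* calloc zero-fills, so the copied name stays NUL-terminated */
		rec = calloc(1, sizeof(*rec));
		if (rec) {
			strncpy(rec->name, name, sizeof(rec->name) - 1);
			rec->next = bb_list;
			bb_list = rec;
			bb_count++;
		}
	}
	pthread_mutex_unlock(&bb_mutex);
	return rec;
}

/* Number of records currently in the list. */
int bb_record_count(void)
{
	int count;

	pthread_mutex_lock(&bb_mutex);
	count = bb_count;
	pthread_mutex_unlock(&bb_mutex);
	return count;
}
```

In this sketch, both a "create" thread and a "load state" thread would call bb_find_or_create() rather than appending to the list directly.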
-
Morris Jette authored
-
Morris Jette authored
Improve precision of poll timeout logic
-
Morris Jette authored
Previously we reported an underflow of the node's comp_job_cnt. (The message actually reported an underflow of run_job_cnt, but was really checking the value of comp_job_cnt.) In any case, the count is cleared when the node goes down, as we are not counting on getting any responses from the down node.
-
- 24 Aug, 2015 9 commits
-
-
Morris Jette authored
-
Morris Jette authored
Major refactoring of burst buffer option parsing for interactive job support
-
Morris Jette authored
-
Morris Jette authored
The --bbf option reads the job's burst buffer options from a file rather than an inline option
-
Danny Auble authored
meaning the limit is removed.
-
Danny Auble authored
-
Danny Auble authored
wouldn't get the correct protocol version to launch a step.
-
jette authored
Wait for job TRES limit check before starting burst buffer allocation
-
jette authored
Trigger the job scheduler as soon as the last persistent buffer operation for a pending job completes. At that point, the job buffer can be allocated.
-
- 22 Aug, 2015 3 commits
-
-
Danny Auble authored
This reverts commit 7bdf6917. I believe c4545110 should fix this. I have verified this doesn't happen with just this commit.
-
Morris Jette authored
-
Morris Jette authored
Make the "pre_run" logic run asynchronously (in a separate thread) and launch the application as soon as the function completes. The function cannot be started until after the allocation has taken place and we know the nodes to be used.
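The ordering described here (allocation first, then pre_run in the background, then launch) can be sketched with a plain POSIX thread. This is a minimal illustration under stated assumptions, not the plugin's code: job_t, its fields, pre_run_thread() and start_pre_run() are all hypothetical names, and the real pre_run work is reduced to setting a flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical job state for illustration; names are not Slurm's. */
typedef struct {
	int nodes_allocated;	/* non-zero once the node allocation is known */
	int pre_run_done;	/* set when the pre_run step has finished */
	int launched;		/* set when the application has been launched */
} job_t;

/* Thread body: do the pre_run work, then launch immediately afterwards. */
static void *pre_run_thread(void *arg)
{
	job_t *job = arg;

	job->pre_run_done = 1;	/* stand-in for the real pre_run work */
	job->launched = 1;	/* launch as soon as pre_run completes */
	return NULL;
}

/* Start pre_run in the background without blocking the caller.
 * Returns -1 if the job has no allocation yet, 0 on success, or the
 * (positive) pthread_create() error number on failure. */
int start_pre_run(job_t *job, pthread_t *tid)
{
	assert(job != NULL && tid != NULL);
	if (!job->nodes_allocated)
		return -1;
	return pthread_create(tid, NULL, pre_run_thread, job);
}
```

The caller in this sketch keeps the thread ID and joins it later; a long-running daemon might instead detach the thread and have it report completion through a lock-protected state change.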
-
- 21 Aug, 2015 16 commits
-
-
Morris Jette authored
Remove vestigial function bb_limit_test(), made redundant by TRES. Restore logic to set buffer access/type info. Correct read config logic.
-
Morris Jette authored
-
Morris Jette authored
Conflicts:
    src/common/slurm_protocol_api.c
    src/slurmctld/proc_req.c
    src/slurmd/slurmd/req.c
-
Morris Jette authored
-
Daniel Ahlin authored
This change passes the AuthInfo configuration parameter to the g_slurm_auth_get_uid() and g_slurm_auth_get_gid() functions.
src/common/slurm_auth.h shows that the following functions are intended to take the auth_info argument:
    extern void * g_slurm_auth_create( void *hosts, int timeout, char *auth_info );
    extern int g_slurm_auth_verify( void *cred, void *hosts, int timeout, char *auth_info );
    extern uid_t g_slurm_auth_get_uid( void *cred, char *auth_info );
    extern gid_t g_slurm_auth_get_gid( void *cred, char *auth_info );
g_slurm_auth_create and g_slurm_auth_verify seem to be OK now, but g_slurm_auth_get_uid and g_slurm_auth_get_gid are not. Most cases will work anyway, since the munge auth plugin will only use auth_info if the cred has not yet been verified, and in many instances it has. Still, I would assume passing the information to be the safe thing to do.
I've attached two blind patches that just replace null with slurm_get_auth_info() for these two functions.
-
Morris Jette authored
This is a refinement of commit 47f96cae with improved logging and a similar fix to another place with similar logic. bug 1880
-
David Bigagli authored
-
Morris Jette authored
Fix gang scheduling/preemption issue that could cancel a job at startup. I have not been able to reproduce the reported problem, but this should prevent it. bug 1880
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Brian Christiansen authored
-
Danny Auble authored
-
Danny Auble authored
-