- 01 May, 2015 10 commits
-
-
Morris Jette authored
Change the scancel command to always use the job_id string based API. Add retry logic to the job_id string code path. Add more checking for error codes and use appropriate exit codes. Use NO_VAL rather than "-1" for an unset signal value.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Jens Svalgaard Kohrt authored
-
Morris Jette authored
Change the temporary file names used by two tests to include the test ID number, so we can see where they came from if left around.
-
Morris Jette authored
-
Morris Jette authored
The jobcomp/elasticsearch plugin's Makefile.am file was accidentally not added to GIT...
-
Morris Jette authored
In the course of testing some scancel changes, a bunch of tests generated "FAILURE" messages due to job cancellation failures, but the tests reported "SUCCESS" at the end and an exit code of zero. This patch adds a check for the return value of the "cancel_job" procedure.
-
- 30 Apr, 2015 17 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
In the slurmctld communication agent, make the thread timeout the configured value of MessageTimeout (or 30 seconds, whichever is larger) rather than a fixed 30 seconds.
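The timeout rule described in this commit message can be sketched as follows. This is an illustration only, not the slurmctld code: the function name `agent_thread_timeout` and the constant `MIN_AGENT_TIMEOUT` are hypothetical, and only the "larger of MessageTimeout and 30 seconds" rule comes from the message itself.

```python
# Hypothetical constant: the 30-second floor mentioned in the commit message.
MIN_AGENT_TIMEOUT = 30  # seconds


def agent_thread_timeout(message_timeout: int) -> int:
    """Return the larger of the configured MessageTimeout and the 30-second floor."""
    return max(message_timeout, MIN_AGENT_TIMEOUT)
```

Under this rule a site with MessageTimeout set to 10 would keep the 30-second thread timeout, while a site with MessageTimeout set to 60 would get 60 seconds.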
-
Morris Jette authored
-
Morris Jette authored
Conflicts: src/scancel/scancel.c src/scancel/scancel.h
-
Morris Jette authored
Fix scancel bug which could return an error on an attempt to signal a job step. A simple "scancel 12.3" to signal a specific job step would fail. Adding another option (say "-i", "--partition=", etc.) would fix this.
-
Morris Jette authored
Check for more error conditions and avoid a memory leak if an error occurs.
-
Morris Jette authored
Log when job state information is not being archived, but is being cached instead. Print an error after every 100 jobs.
-
Morris Jette authored
The plugin version number contents have changed in Slurm v15.08.
-
Morris Jette authored
Recent scancel mods resulted in a duplicate error when trying to signal a non-existent job. This removes the second error.
-
David Bigagli authored
-
David Bigagli authored
-
Morris Jette authored
No change in logic
-
David Bigagli authored
-
Morris Jette authored
-
David Bigagli authored
-
David Bigagli authored
-
- 29 Apr, 2015 13 commits
-
-
Morris Jette authored
Previous logic would not recognize a job ID specification with a job array task ID of "*" (e.g. "123_*") to indicate all job array tasks. Previous logic would also stop parsing after the closing bracket of a job array specification (e.g. "123_[4-6] 234" would not see the "234"). Improve logging of job ID specifications (i.e. use the job array specification).
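The two parsing behaviours described in this commit message (accepting "*" as a task ID, and continuing past a closing bracket) can be sketched as follows. This is a minimal illustration, not the scancel implementation: the function names, the regular expression, and the returned tuple format are all assumptions made for the example, and only the input forms "123_*" and "123_[4-6] 234" come from the message.

```python
import re

# Hypothetical grammar: a job ID, optionally followed by "_" and either "*",
# a single task ID, or a bracketed list of task IDs and ranges such as "[4-6]".
_SPEC = re.compile(r"(\d+)(?:_(\*|\d+|\[\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*\]))?")


def _expand(expr):
    """Expand a bracket body such as "4-6,9" into [4, 5, 6, 9]."""
    tasks = []
    for part in expr.split(","):
        low, _, high = part.partition("-")
        tasks.extend(range(int(low), int(high or low) + 1))
    return tasks


def parse_job_specs(text):
    """Parse whitespace-separated job ID specifications.

    Returns a list of (job_id, tasks) tuples, where tasks is None (no array
    part), "*" (all array tasks), or a list of task IDs.
    """
    specs = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = _SPEC.match(text, pos)
        if match is None:
            raise ValueError(f"invalid job ID specification at offset {pos}")
        tasks = match.group(2)
        if tasks is not None and tasks != "*":
            tasks = _expand(tasks.strip("[]"))
        specs.append((int(match.group(1)), tasks))
        # Resume after the matched text, so anything following a closing
        # bracket is still parsed.
        pos = match.end()
    return specs
```

With this sketch, `parse_job_specs("123_[4-6] 234")` yields both the array specification and the trailing job 234, and `parse_job_specs("123_*")` marks the job as covering all array tasks.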
-
Morris Jette authored
Conflicts: NEWS
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Modify slurmctld's parsing of a job_id string for the job_signal and job_requeue calls to treat a job ID value of "#_*" as representing all tasks of job ID number "#". Previously this was treated as invalid input. Also set the last_job_update time so that if a pending job is killed, it is reported immediately by "squeue -i#" (previously it might keep reporting stale data).
-
Morris Jette authored
Trying to avoid having technical questions sent to "sales@schedmd.com"
-
Morris Jette authored
-
jette authored
This prevents the queued scheduling thread from starting if the main scheduling loop is still running.
-
Danny Auble authored
-
Danny Auble authored
This reverts commit f9ebf5ad. Conflicts: src/plugins/select/alps/basil_interface.c
-
Danny Auble authored
-
Danny Auble authored
before ending the job.
-
Danny Auble authored
will make it so the slurmctld will not signal the apid's in a batch job. Instead it relies on the rpc coming from the slurmctld to kill the job to end things correctly.
-