- 05 Dec, 2013 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
Add SLURM_CLUSTER_NAME to environment variables passed to PrologSlurmctld, Prolog, EpilogSlurmctld, and Epilog.
-
- 04 Dec, 2013 9 commits
-
-
Morris Jette authored
PrologSlurmctld, EpilogSlurmctld, MailProg, etc.
-
Morris Jette authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
container that never had a pid added to it. (The job ended before it began)
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
Previous logic never reopened the file, preventing proper functioning of logrotate.
-
- 03 Dec, 2013 21 commits
-
-
Morris Jette authored
Conflicts: src/slurmctld/job_mgr.c
-
Morris Jette authored
Use hash function to locate job records for improved performance.
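As a rough illustration of why a hash lookup helps here, the sketch below buckets job records by job ID so a lookup scans one short chain rather than the whole job list. This is not the slurmctld code: `JobTable`, `HASH_TABLE_SIZE`, and the modulo hash are hypothetical names and choices made for the example.

```python
# Hypothetical sketch: chained hash table keyed by job ID.
# The table size and the modulo hash are assumptions, not Slurm's values.
HASH_TABLE_SIZE = 8


class JobTable:
    def __init__(self, size=HASH_TABLE_SIZE):
        # One list (chain) of (job_id, record) pairs per bucket.
        self.buckets = [[] for _ in range(size)]

    def _index(self, job_id):
        return job_id % len(self.buckets)

    def add(self, job_id, record):
        self.buckets[self._index(job_id)].append((job_id, record))

    def find(self, job_id):
        # Only the chain for this job ID's bucket is scanned.
        for jid, record in self.buckets[self._index(job_id)]:
            if jid == job_id:
                return record
        return None
```

Two job IDs that land in the same bucket are still told apart, because `find` compares the full job ID within the chain.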
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
Change the partition write lock to a read lock, since a different mechanism handles hidden partitions when getting individual jobs.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This is a correction in the logic of commit 3f4b2d51 on launch failures.
-
jette authored
Make it work for more system types. Do not delete files if the build fails.
-
Danny Auble authored
here instead of cp please change this :).
-
Danny Auble authored
-
Danny Auble authored
-
David Gloe authored
-
David Gloe authored
-
Morris Jette authored
Conflicts: src/slurmctld/job_scheduler.c
-
Morris Jette authored
Correct logic returning remaining job dependencies in job information reported by scontrol and squeue. Eliminates vestigial descriptors with no job ID values (e.g. "afterany"). Previously, as dependencies were removed, the job ID values were removed from the strings, but not the descriptors. This change removes both. It also checks the full job ID to make sure we do not remove "afterany:1234" when job "123" completes.
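The behavior described above can be sketched as follows. This is a minimal illustration, not the slurmctld implementation: the function name `remove_dependency` is hypothetical, and the `descriptor:id[:id...]` clauses joined by commas are an assumed string layout based on the "afterany:1234" example in the message.

```python
def remove_dependency(dep_string, done_job_id):
    """Drop a completed job from a dependency string (hypothetical helper)."""
    done = str(done_job_id)
    kept = []
    for clause in dep_string.split(","):
        if not clause:
            continue
        descriptor, *job_ids = clause.split(":")
        # Compare whole job IDs, so "123" never matches "1234".
        remaining = [j for j in job_ids if j != done]
        if job_ids and not remaining:
            # Every job ID is gone: drop the descriptor as well.
            continue
        kept.append(":".join([descriptor] + remaining))
    return ",".join(kept)
```

Splitting on ":" and comparing whole tokens is what keeps "afterany:1234" intact when job 123 completes; a substring search for "123" would not.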
-
Danny Auble authored
-
Danny Auble authored
handle any job_fail calls after the fact since it will result in deadlock otherwise.
-
Danny Auble authored
-
Danny Auble authored
-
- 02 Dec, 2013 8 commits
-
-
Morris Jette authored
-
Morris Jette authored
Conflicts: NEWS doc/man/man5/cgroup.conf.5
-
Morris Jette authored
Add a check to make sure that the job completion RPC from a slurmstepd matches the node that the batch job is running on. This would not be the case for a job started on a node if that node's slurmd fails but the slurmstepd keeps running. The job could then be requeued and generate a completion RPC from both slurmstepd daemons (one per node). This logic ignores the job completion RPC from the node NOT currently running the batch job.
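A minimal sketch of the check described above, under stated assumptions: `JobRecord`, its `batch_host` field, and `accept_batch_completion` are hypothetical names for this example and do not reflect the actual slurmctld data structures or RPC handlers.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    job_id: int
    batch_host: str  # node currently running the batch script (assumed field)


def accept_batch_completion(job, reporting_node):
    """Return True only if the completion RPC comes from the current batch host."""
    return reporting_node == job.batch_host


# A job starts on node1, node1's slurmd fails, and the job is requeued on node2.
job = JobRecord(job_id=42, batch_host="node1")
job.batch_host = "node2"
stale = accept_batch_completion(job, "node1")    # old slurmstepd, ignored
current = accept_batch_completion(job, "node2")  # current batch host, accepted
```

Keying the decision on the node currently recorded for the batch job means the stale slurmstepd on the failed node cannot complete the requeued job.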
-
Morris Jette authored
-
David Bigagli authored
-
Morris Jette authored
Fix race condition on batch job termination that could result in a job exit code of 0xfffffffe if the slurmd on node zero registers its active jobs at the same time that slurmstepd is recording the job's exit code. Bug 535.
-
Danny Auble authored
-
Danny Auble authored
-