- 03 Apr, 2019 21 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
It was failing due to an Epilog, but could also fail when run in parallel with other jobs.
-
Morris Jette authored
This includes information about how to get a clean HWLOC report.
-
Morris Jette authored
Without this change I was able to fairly consistently cause "scontrol shutdown" to NOT cause the slurmd to exit:
1. Start slurmd and slurmctld
2. Immediately execute "scontrol reconfig" and "scontrol shutdown"
-
Morris Jette authored
Format changes only
-
Morris Jette authored
log message and comment format changes
-
Alejandro Sanchez authored
Bug 5851.
-
Danny Auble authored
# Conflicts: # slurm/slurm.h.in
-
Danny Auble authored
-
Alejandro Sanchez authored
This prevents rebuilding a job's dependency string when it has at least one invalid (never satisfied) dependency, no matter if such invalid dependency has already been purged (after MinJobAge) or not. This can be useful to track down the culprit invalid dependencies even after they are gone from ctld's in-memory job list. The flag is cleared upon a successful job dependency update or after another job in the dependency list has been satisfied if such list is composed with the '?' symbol (OR'ed). Bug 5851.
-
Alejandro Sanchez authored
Job dependencies separated by "?" (OR'ed) should make the dependent job independent as soon as any one of the dependencies is resolved as satisfied. Without this patch, if an invalid (non-satisfiable) dependency was resolved before a satisfiable one, then the dependent job would never become independent, even after the satisfiable one was eventually resolved. Bug 5851.
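The OR semantics described above can be sketched as follows. This is a minimal illustration, not the slurmctld code: the `DepState` enum, the `or_dependency_result` function, and the returned strings are all hypothetical names chosen for this example.

```python
from enum import Enum


class DepState(Enum):
    """Hypothetical per-dependency states used only in this sketch."""
    PENDING = "pending"      # not resolved yet
    SATISFIED = "satisfied"  # resolved and met
    INVALID = "invalid"      # resolved, can never be met


def or_dependency_result(states):
    """Resolve an OR'ed ('?'-separated) dependency list.

    A single satisfied dependency is enough, regardless of whether an
    invalid one was resolved earlier in the list.
    """
    if not states:
        return "independent"          # nothing to wait for
    if any(s is DepState.SATISFIED for s in states):
        return "independent"
    if all(s is DepState.INVALID for s in states):
        return "never_satisfied"      # every alternative is invalid
    return "dependent"                # still waiting on a pending one
```

With this ordering of checks, a list such as `[INVALID, SATISFIED]` resolves to independent, which is the case the commit message says was previously mishandled.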
-
Alejandro Sanchez authored
No functional change, just preparation for a following commit with an actual fix. Bug 5851.
-
Felip Moll authored
The response to the XCC raw command is always 16 bytes. If we don't get an answer of this size, we log it and return. Bug 6743
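The size check described above might look like the following sketch. Only the 16-byte length comes from the commit message; the function name, the constant name, and the choice to return `None` on a bad length are assumptions made for illustration, and the real plugin is written in C.

```python
import logging

# Size stated in the commit message; the constant name is hypothetical.
XCC_RESPONSE_LEN = 16


def check_xcc_response(payload):
    """Return the payload if it has the expected size, else log and return None."""
    if len(payload) != XCC_RESPONSE_LEN:
        logging.error(
            "unexpected XCC response size: got %d bytes, expected %d",
            len(payload), XCC_RESPONSE_LEN,
        )
        return None
    return payload
```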
-
Morris Jette authored
-
Morris Jette authored
If GRES configuration data is unavailable from gres.conf, then use the node's "Gres=" information in slurm.conf. This will eliminate or minimize the gres.conf file in many situations. Bug 6761
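The fallback described above can be sketched as below. This is an illustration under stated assumptions, not Slurm's implementation: `parse_gres` and `effective_gres` are hypothetical names, and the parser handles only a simplified `name[:type][:count]` form with plain integer counts.

```python
def parse_gres(spec):
    """Parse a simplified 'name[:type][:count]' list into {(name, type): count}."""
    result = {}
    for item in filter(None, (part.strip() for part in spec.split(","))):
        fields = item.split(":")
        name = fields[0]
        gres_type, count = None, 1
        if len(fields) == 2:
            # Second field is either a count ("gpu:2") or a type ("gpu:tesla").
            if fields[1].isdigit():
                count = int(fields[1])
            else:
                gres_type = fields[1]
        elif len(fields) >= 3:
            gres_type, count = fields[1], int(fields[2])
        result[(name, gres_type)] = count
    return result


def effective_gres(gres_conf_records, slurm_conf_gres):
    """Prefer records from gres.conf; fall back to the node's Gres= string."""
    if gres_conf_records:
        return dict(gres_conf_records)
    return parse_gres(slurm_conf_gres)
```

For example, `effective_gres({}, "gpu:tesla:2")` falls back to the node's "Gres=" value, while a non-empty first argument takes precedence.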
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
They were a bit too verbose for my taste
-
- 02 Apr, 2019 6 commits
-
-
Felip Moll authored
In 0e149092, not setting the variable when a job was not requesting any GRES was considered a bug. The CUDA API will use all devices if the variable is not set. If it is set to some unknown or empty value, it will use no devices. This variable should be used only for testing purposes; ConstrainDevices=yes in cgroup is recommended. Bug 6412
-
Felip Moll authored
GRES plugins will set up the environment for every GRES in the system, even if the job has not requested it. Bug 6412
-
Felip Moll authored
than one GRES of the same name but different type" This reverts f7fca7ba Bug 6412
-
Morris Jette authored
initial work needed for bug 6761 support
-
Morris Jette authored
comment format and change some log messages
-
Morris Jette authored
This problem was triggered with a configuration of PrologFlags=Alloc,Contain
-
- 01 Apr, 2019 2 commits
-
-
Morris Jette authored
This eliminates a slurmctld error message when a job shrinks to size zero. There is no need to re-compute the CPU count and the job_resources node_bitmap is empty. Logic works fine without this change if job size shrinks, but not to size zero. bug 6472
-
Morris Jette authored
When a job size was reset to zero, this error message was printed:
  slurm_allocation_lookup: Job/step already completing or completed
which may lead the user to believe the operation failed when it worked as planned. Now it prints this:
  To reset Slurm environment variables, execute
    For bash or sh shells:  . ./slurm_job_43565_resize.sh
    For csh shells:         source ./slurm_job_43565_resize.csh
Where the reset scripts contain zero node count information:
  export SLURM_NODELIST=""
  export SLURM_JOB_NODELIST=""
  export SLURM_NNODES=0
  export SLURM_JOB_NUM_NODES=0
  export SLURM_JOB_CPUS_PER_NODE=""
  unset SLURM_NPROCS
  unset SLURM_NTASKS
  unset SLURM_TASKS_PER_NODE
-
- 31 Mar, 2019 4 commits
-
-
Brian Christiansen authored
-
Brian Christiansen authored
Continuation of 2764f3fd Bug 6589
-
Brian Christiansen authored
Continuation of 9a243a1a Bug 6592
-
Brian Christiansen authored
-
- 30 Mar, 2019 1 commit
-
-
Morris Jette authored
Many comments were modified to follow the Linux kernel standard. Many log messages were using the old function name and now print __func__ instead. A few log messages lacked the function name, and those were added.
-
- 29 Mar, 2019 2 commits
-
-
Morris Jette authored
No change in any logic
-
Morris Jette authored
This adds logic to validate the count of GRES by device Type and not just Name and modifies the data structures as needed for consistency within slurmctld.
-
- 28 Mar, 2019 3 commits
-
-
Morris Jette authored
-
Broderick Gardner authored
Removed linear search, replaced with direct record references and a hashmap. This is faster and avoids potential collisions between assoc id's and user id's. Bug 4811
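The general idea of replacing a linear search with a hash lookup can be sketched as follows. The commit message does not say how collisions between association ids and user ids are avoided; keying on a `(kind, id)` pair is one possible approach and is an assumption here, as are the `RecordIndex` class and its method names.

```python
class RecordIndex:
    """Hypothetical index mapping (kind, id) to a record in O(1) average time."""

    def __init__(self):
        self._by_key = {}

    def add(self, kind, ident, record):
        # Including the kind in the key keeps an assoc id and a user id
        # with the same numeric value from overwriting each other.
        self._by_key[(kind, ident)] = record

    def find(self, kind, ident):
        # Returns None when no record is stored under this key.
        return self._by_key.get((kind, ident))
```

A lookup such as `find("assoc", 7)` then never returns the record stored under `("user", 7)`, and no list scan is needed.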
-
Broderick Gardner authored
Fixed existing usages as well. Bug 4811
-
- 27 Mar, 2019 1 commit
-
-
Morris Jette authored
Coverity CID 197448 bug 6303
-