- 20 Aug, 2018 40 commits
-
-
Morris Jette authored
Rather than reporting only GRES_IDX (index) info, also report GRES count info, since it can vary from node to node with cons_tres
-
Morris Jette authored
This fixes a couple of bugs related to allocating GRES when there is no associated topology, including adding support for the --tres-per-job option
-
Morris Jette authored
for current cons_tres testing and future use
-
Morris Jette authored
-
Morris Jette authored
Enforce configured default values for DefMemPerGPU and DefCPUPerGPU
-
Morris Jette authored
spread a job over multiple nodes if needed to satisfy mem-per-gpu specification
-
Morris Jette authored
-
Morris Jette authored
Force job to span nodes when appropriate
-
Morris Jette authored
The last bit of logic to talk with the GPUs is still needed
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
this includes a new regression test
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This fixes a few bugs in commit 0e4874e19490a24:
1. Round up the core count for a job as needed (i.e. if a job needs 3 CPUs per task and there are 2 CPUs per core, the job needs 2 cores rather than 1).
2. Fix some bad logic when the available core count on socket 0 is 0.
3. Set exit_code to 1 on an expect test failure (previously it was not set).
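The round-up in item 1 is a ceiling division. A minimal sketch follows, assuming a hypothetical helper name (`cores_needed`) that is not taken from the Slurm source:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the Slurm source): return the smallest
 * number of cores that provides at least `cpus` CPUs when every core
 * offers `cpus_per_core` CPUs (hardware threads).
 */
uint32_t cores_needed(uint32_t cpus, uint32_t cpus_per_core)
{
	if (cpus_per_core == 0)
		return 0;	/* assumption: treat an unset value as "no cores" */

	/* Ceiling division written to avoid overflow in (cpus + n - 1) */
	return (cpus / cpus_per_core) + ((cpus % cpus_per_core) ? 1 : 0);
}
```

With 3 CPUs per task and 2 CPUs per core, `cores_needed(3, 2)` returns 2, whereas plain integer division would return 1.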
-
Morris Jette authored
Correction to logic for explicit hostname specification on job submit. Bug introduced in commit 0e4874e19490a24fb54961ef89176a3e8f55952b.
-
Morris Jette authored
Also add a regression test for this scheduling logic. Bug 4584.
-
Morris Jette authored
Add that desired GPU count is actually allocated to a job based upon --gpus, --gpus-per-node, --gpus-per-socket, and --gpus-per-task options
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This bug exists with all select plugins. If a job has been allocated GRES, the GRES have either topology or type information, and the slurmctld daemon restarts while the job is running, then GRES underflow errors will be generated when the job ends. The problem is that slurmctld does not have GRES topology or type information available at restart time, so it cannot update the counters. The overhead of updating those counters at node registration time is high, so we just avoid generating the errors in this case. Note: this bug is not specific to cons_tres and exists in earlier versions of Slurm.
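The idea of tolerating an underflow instead of reporting it can be sketched as below. This is only an illustration under stated assumptions: the function name `gres_release` and the `counters_restored` flag are hypothetical and do not come from the Slurm source.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical sketch (names are assumptions, not Slurm's API).
 * Subtract `count` from an allocation counter when a job ends.
 * Returns true only if an underflow occurred AND it should be reported.
 * `counters_restored` is false when the controller restarted without the
 * topology/type detail needed to rebuild the counter, in which case an
 * underflow is expected and is silently clamped to zero.
 */
bool gres_release(uint64_t *alloc_cnt, uint64_t count, bool counters_restored)
{
	if (*alloc_cnt >= count) {
		*alloc_cnt -= count;	/* normal case: no underflow */
		return false;
	}

	*alloc_cnt = 0;			/* clamp rather than wrap around */
	return counters_restored;	/* report only if counters should be accurate */
}
```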
-
Morris Jette authored
-
Morris Jette authored
If the step does not explicitly specify a gres-per-node value, then the step will be allocated GRES identical to those allocated to the job
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
also relocate tres_mc create/destroy location so information can be accessed from additional locations and to reduce the overhead of creating it multiple times
-
Morris Jette authored
-
Morris Jette authored
it is not going to work in practice
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Modify gres_plugin_job_alloc() to allocate pre-selected GRES. Add fields to job GRES data structure to define selected GRES before job allocation time.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Michael Hinton authored
MySQL permits database names of up to 64 characters, but Slurm was truncating at 33 characters. If we exceed this limit, let the mysql_query fail and give the admin a chance to sort it out, rather than truncating and then failing to query against the un-truncated name later on. While here, correct the fatal() message. Bug 5586.
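To illustrate why silent truncation is harmful, here is a minimal sketch that detects it through the return value of `snprintf`. The helper name `copy_db_name` and the macro `MYSQL_DB_NAME_MAX` are assumptions made for this example. The commit itself takes a different route and lets mysql_query fail instead of checking up front.

```c
#include <stddef.h>
#include <stdio.h>

#define MYSQL_DB_NAME_MAX 64	/* MySQL's limit on database name length */

/*
 * Hypothetical helper (not from the Slurm source): copy `name` into `dst`.
 * Returns 0 on success, or -1 if the name did not fit and was truncated.
 * snprintf() returns the length the full string would have had, so a
 * value >= dst_size means the stored copy is shorter than the original.
 */
int copy_db_name(char *dst, size_t dst_size, const char *name)
{
	int n = snprintf(dst, dst_size, "%s", name);

	if (n < 0 || (size_t)n >= dst_size)
		return -1;	/* truncated: later lookups by full name would fail */
	return 0;
}
```

A 40-character name copied into a 33-byte buffer returns -1, while the same name fits in a buffer of `MYSQL_DB_NAME_MAX + 1` bytes.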
-