- 27 Mar, 2019 10 commits
-
-
Morris Jette authored
Remove reference to REQUEST_SIGNAL_PROCESS_GROUP in slurmstepd. It has been defunct since July 2013
-
Morris Jette authored
Sort both the expected and actual output for the GRES APIs so that record order is irrelevant. Depending upon the GRES plugins loaded (specifically gres/gpu plus gres/mps), the record order can vary, so the GRES records are sorted by File name to ensure they line up (the same position in both lists should refer to the same device file).
-
Alejandro Sanchez authored
-
Dominik Bartkiewicz authored
Bug 6750.
-
Danny Auble authored
-
Morris Jette authored
This logic could allocate to a job a GRES device with an availability count of zero.
-
Morris Jette authored
-
Morris Jette authored
This should only happen if there is flawed logic somewhere, but avoiding an abort is preferable.
-
Morris Jette authored
If the count of GPUs configured in slurm.conf and gres.conf differ and FastSchedule>=1, then the bitmap identifying the GPU allocation sent from slurmctld to slurmd will differ. Previously this resulted in CUDA_VISIBLE_DEVICES being set to NULL. Now it will be set correctly. Bug 6725.
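A minimal sketch of the mismatch described above, using hypothetical node and device names:
    # slurm.conf: the node advertises four GPUs
    FastSchedule=1
    GresTypes=gpu
    NodeName=node01 Gres=gpu:4 State=UNKNOWN
    # gres.conf on node01: only two device files are defined
    Name=gpu File=/dev/nvidia[0-1]
Previously a count mismatch like this left CUDA_VISIBLE_DEVICES set to NULL for the job; with this fix it is set correctly for the devices that were allocated.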
-
Morris Jette authored
If slurmd finds GRES with files and slurmctld can't use them (i.e. slurm.conf has a GRES count of 0), then avoid trying to create zero-length bitmaps in the GRES data structure. Bug 6725.
-
- 26 Mar, 2019 15 commits
-
-
Morris Jette authored
This makes the GRES bitmap size equal to the number of records for shared GRES (i.e. gres/mps); otherwise it is the GRES count (i.e. gres/gpu). Bug 6733.
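As a concrete illustration (counts are hypothetical): on a node with 2 GPUs and a gres/mps count of 100 per GPU, the gres/gpu bitmap has 2 bits (the GRES count), and the gres/mps bitmap now also has 2 bits (one per MPS record, i.e. one per underlying GPU) rather than 200.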
-
Morris Jette authored
If the device files for gres/gpu are out of order or grouped in an unordered fashion (e.g. "Name=gpu Files=/dev/nvidia[2,8,10]"), then split the gres/gpu records into one record per file and make sure the gres/mps records are in an identical order. This is required for matching gres/gpu and gres/mps records (one GPU can be allocated either as gres/gpu or as gres/mps, but not both, so we need to be able to match records in slurmctld).
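As a sketch of the case quoted above (device paths hypothetical), a gres.conf entry such as:
    Name=gpu Files=/dev/nvidia[2,8,10]
is now expanded into one gres/gpu record per device file (nvidia2, nvidia8 and nvidia10), with the corresponding gres/mps records kept in the same order so the two lists can be matched position by position in slurmctld.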
-
Morris Jette authored
Coverity CID 197447
-
Alejandro Sanchez authored
Bug 6710.
-
Marshall Garey authored
Bug 6590.
-
Morris Jette authored
Make some tests better able to work with CR_ONE_TASK_PER_CORE
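The configuration in question is the CR_ONE_TASK_PER_CORE flag of SelectTypeParameters in slurm.conf, e.g.:
    SelectTypeParameters=CR_Core,CR_ONE_TASK_PER_CORE
which allocates one task per core by default, a layout some tests did not previously account for.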
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
More testing required. This configuration is still disabled in select_cons_tres.c
-
Morris Jette authored
Add --ntasks-per-core option to execute line as needed
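For instance, a test's execute line might be extended along these lines (program name is a placeholder):
    srun --ntasks-per-core=1 -N1 -n2 ./prog
making the intended task-per-core placement explicit instead of relying on the site's default.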
-
Morris Jette authored
Without this change, test7.17 was failing
-
Morris Jette authored
Cosmetic change. No change in logic.
-
Morris Jette authored
Cosmetic change only.
-
Morris Jette authored
This can happen on node failure
-
Morris Jette authored
Use stored pointer rather than pointer to pointer for cleaner code. No change in logic.
-
- 25 Mar, 2019 9 commits
-
-
Broderick Gardner authored
Workaround involves specifying the index name when modifying an existing index. Bug 6303
-
Broderick Gardner authored
The correct use of correct_query and query is clarified here. This also saves the index name during parsing so correct_query can be set to the correct "drop <index name>" for a future query. Bug 6303
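A rough sketch of the kind of statement this enables, with hypothetical table and index names:
    ALTER TABLE some_table DROP INDEX old_index_name;
    ALTER TABLE some_table ADD INDEX old_index_name (col_a, col_b);
Saving the existing index name lets the drop half of the query be generated correctly before the index is recreated.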
-
Morris Jette authored
This relocates a variable in order to move some common code into one place rather than repeat it in several locations. No changes to functionality, but simpler/less code.
-
Albert Gil authored
Bug 6680
-
Morris Jette authored
The bitmap size should equal the GPU count, not the MPS count. Bug 6733.
-
Morris Jette authored
-
Morris Jette authored
No change in logic other than additional logs
-
Morris Jette authored
-
Felip Moll authored
Note that using this feature in non-flat networks is not supported since the sender address is set depending on the hostname resolution on each node. Bug 6007
-
- 22 Mar, 2019 6 commits
-
-
Brian Christiansen authored
topo_cnt == the number of GRES types or GRES with different topology. Bug 6725
-
Brian Christiansen authored
Bug 6725
-
Morris Jette authored
-
Alejandro Sanchez authored
-
Alejandro Sanchez authored
-
Marshall Garey authored
With SchedulerParameters=defer and Prolog scripts and/or SPANK plugins that take some time, jobs weren't starting within the 2 seconds that tests 2.18 and 2.19 expected, causing these tests to fail. Bug 6670.
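The setting involved is the defer flag in slurm.conf:
    SchedulerParameters=defer
With defer, jobs are not considered for scheduling immediately at submit time, so extra Prolog or SPANK latency can push the actual start beyond a fixed two-second wait in a test.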
-