Permit job with invalid QOS to run if QOS set by administrator
About a year ago I submitted a modification that you incorporated into SLURM 2.4, which allowed an admin to modify a job to use a QOS even though the user did not have access to that QOS. However, I must have tested it without accounting set to enforce QOS.

So, if an admin modifies a job to use a QOS the user doesn't have access to, the job is modified but ends up in a state of InvalidQOS. That is reasonable in itself, since it also handles the case where a user has their QOS removed. The problem is that even though the scheduler won't schedule the job, backfill still will.

One approach would be to fix backfill to be consistent with the scheduler (which should probably happen regardless), but my thought is to modify the scheduler to allow the QOS as long as it was set by an admin, since that was the intent of the modification to begin with. I believe it would only take a single-line change: adding a check on job_ptr->limit_set_qos to make sure the QOS was set by an admin:

	if (job_ptr->qos_id) {
		slurmdb_association_rec_t *assoc_ptr;
		assoc_ptr = (slurmdb_association_rec_t *)job_ptr->assoc_ptr;
		if (assoc_ptr &&
		    !bit_test(assoc_ptr->usage->valid_qos, job_ptr->qos_id) &&
		    !job_ptr->limit_set_qos) {
			info("sched: JobId=%u has invalid QOS",
			     job_ptr->job_id);
			xfree(job_ptr->state_desc);
			job_ptr->state_reason = FAIL_QOS;
			continue;
		} else if (job_ptr->state_reason == FAIL_QOS) {
			xfree(job_ptr->state_desc);
			job_ptr->state_reason = WAIT_NO_REASON;
		}
	}

Phil
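For reference, the decision made by the check above can be sketched as a small standalone predicate. This is only an illustration under stated assumptions: `struct job_sketch`, `enum reason_sketch`, and `skip_for_invalid_qos()` are hypothetical stand-ins, not SLURM types or functions, and the `qos_valid_for_assoc` flag replaces the real `bit_test()` call on `assoc_ptr->usage->valid_qos`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the job fields the check reads and writes.
 * These are NOT SLURM's real structures. */
enum reason_sketch { SKETCH_WAIT_NO_REASON, SKETCH_FAIL_QOS };

struct job_sketch {
	uint32_t qos_id;            /* 0 means no QOS requested */
	bool has_assoc;             /* stands in for assoc_ptr != NULL */
	bool qos_valid_for_assoc;   /* stands in for the bit_test() result */
	bool limit_set_qos;         /* true if the QOS was set by an admin */
	enum reason_sketch state_reason;
};

/* Returns true if the job should be skipped because its QOS is invalid.
 * A QOS set by an admin is accepted even when the association lacks it. */
static bool skip_for_invalid_qos(struct job_sketch *job)
{
	if (!job->qos_id)
		return false;

	if (job->has_assoc && !job->qos_valid_for_assoc &&
	    !job->limit_set_qos) {
		job->state_reason = SKETCH_FAIL_QOS;
		return true;
	}

	/* The QOS is acceptable now, so clear a stale invalid-QOS reason. */
	if (job->state_reason == SKETCH_FAIL_QOS)
		job->state_reason = SKETCH_WAIT_NO_REASON;
	return false;
}
```

If the scheduler and backfill both went through a single predicate of this kind, they could not disagree about whether an admin-set QOS is allowed. That is a possible design, not a description of the current code.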