Commit 4652e982 authored by Morris Jette

Correct job time limit for sched/backfill when job has QOS with NO_RESERVE flag

If sched/backfill starts a job with a QOS having NO_RESERVE and no job
time limit, start it with the partition time limit (or one year if the
partition has no time limit) rather than NO_VAL (140 year time limit).

If a standby job, which in this case has the NO_RESERVE flag set, is
submitted without a time limit, and is backfilled, it will get an
EndTime waaayyyy into the future.

JobId=99 Name=cmdll
   UserId=eckert(1043) GroupId=eckert(1043)
   Priority=12083 Account=sa QOS=standby
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
   RunTime=00:00:14 TimeLimit=12:00:00 TimeMin=N/A
   SubmitTime=2012-12-20T11:49:36 EligibleTime=2012-12-20T11:49:36
   StartTime=2012-12-20T11:49:44 EndTime=2149-01-26T18:16:00

So I looked at the code in /src/plugins/sched/backfill:

                if (job_ptr->start_time <= now) {
                        int rc = _start_job(job_ptr, resv_bitmap);
                        if (qos_ptr && (qos_ptr->flags & QOS_FLAG_NO_RESERVE)){
                                job_ptr->time_limit = orig_time_limit;
                                job_ptr->end_time = job_ptr->start_time +
                                                    (orig_time_limit * 60);

Using the debugger I found that if the job does not have a specified
time limit, the job_ptr->time_limit is equal to NO_VAL when it hits
this code.
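
The arithmetic behind the report, and the fallback described at the top of
this message, can be sketched as follows. This is a minimal illustration, not
the actual patch: the helper names (effective_time_limit, end_time_for,
wrapped_seconds) are hypothetical, and the NO_VAL and INFINITE values are
assumed to match Slurm's headers. Assuming the limit is held in a 32-bit
unsigned field, NO_VAL * 60 wraps to 4294967176 seconds, which is 49710 days
plus 06:26:16 and matches the gap between StartTime and EndTime in the
output above.

```c
#include <stdint.h>
#include <time.h>

/* Assumed to match Slurm's definitions; verify against your source tree. */
#define NO_VAL   0xfffffffeU  /* "value not set" marker */
#define INFINITE 0xffffffffU  /* "no limit" marker */

/* One year in minutes: the fallback named in the commit message. */
#define ONE_YEAR_MINUTES (365U * 24U * 60U)

/* Pre-fix arithmetic: minutes * 60 in 32 bits wraps around for NO_VAL. */
static uint32_t wrapped_seconds(uint32_t limit_minutes)
{
    return (uint32_t)(limit_minutes * 60U);
}

/*
 * Hypothetical helper: pick the limit (in minutes) to use when a job is
 * started. Prefer the job's own limit, then the partition's, then one year.
 */
static uint32_t effective_time_limit(uint32_t job_limit,
                                     uint32_t part_max_time)
{
    if (job_limit != NO_VAL)
        return job_limit;
    if (part_max_time != INFINITE && part_max_time != NO_VAL)
        return part_max_time;
    return ONE_YEAR_MINUTES;
}

/* End time in seconds, widened to time_t before multiplying. */
static time_t end_time_for(time_t start_time, uint32_t limit_minutes)
{
    return start_time + (time_t)limit_minutes * 60;
}
```

With a 720-minute partition limit, effective_time_limit(NO_VAL, 720) yields
720, so end_time_for() places EndTime 12 hours after StartTime. With no
partition limit it yields 525600 minutes (one year).
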
parent 4d5d3e9f