- 28 Oct, 2016 3 commits
-
Danny Auble authored
A job could be accounted for more than it should be in the _decay_thread inside the priority/multifactor plugin. Previously, the job's end_time_exp was not stored, and that value is what is used to determine whether the job had already been processed. In 16.05 we were able to mostly fix this, but the TRES numbers could still be accounted for multiple times. Since a pack change was needed to fix this, it had to wait until 17.02.
-
Danny Auble authored
-
Danny Auble authored
More time than should be allowed could be accounted for. This only happened for jobs in the completing state when the slurmctld was shut down. This will also be enhanced in 17.02, as the job's end_time_exp is not stored, which is needed to determine whether the job has already been through the decay_thread at end of job. Bug 3162
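The two fixes above hinge on persisting end_time_exp so the decay thread can tell whether a finished job was already charged. A minimal sketch of that guard, with hypothetical names and a simplified job struct (this is not Slurm's actual code):

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical job record: end_time_exp remembers the end time the
 * decay thread last processed; 0 means never processed. */
struct job {
	time_t end_time;     /* when the job finished */
	time_t end_time_exp; /* end time already seen by the decay pass */
};

/* Return true if the decay pass should still charge this finished job.
 * Persisting end_time_exp across a slurmctld restart is what prevents
 * the job from being accounted for more than once. */
static bool decay_should_process(struct job *j)
{
	if (j->end_time_exp == j->end_time)
		return false; /* already accounted for this end time */
	j->end_time_exp = j->end_time;
	return true;
}
```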
-
- 27 Oct, 2016 37 commits
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Bug 3139
-
Danny Auble authored
-
Danny Auble authored
# Conflicts:
#	META
-
Danny Auble authored
-
Danny Auble authored
issue with gang scheduling. Bug 3211
-
Brian Christiansen authored
MAX_BUF_SIZE is a uint32_t, so comparing size (an int) against it didn't make sense.
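The pitfall behind this fix is C's usual arithmetic conversions: comparing an int against a uint32_t converts the int to unsigned, so a negative value compares as a huge number. A small illustration (function names and the explicit fix are hypothetical, not Slurm's code):

```c
#include <stdint.h>

/* Buggy pattern: `size` is implicitly converted to uint32_t for the
 * comparison, so -1 becomes 4294967295 and "exceeds" any sane limit. */
int int_exceeds_u32(int size, uint32_t limit)
{
	return size > limit; /* -Wsign-compare warns about exactly this */
}

/* Safer pattern: handle negative sizes explicitly before comparing
 * in the unsigned domain. */
int int_exceeds_u32_fixed(int size, uint32_t limit)
{
	if (size < 0)
		return 0; /* a negative size is not "too large" */
	return (uint32_t)size > limit;
}
```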
-
Tim Wickberg authored
Unhook it from the build, and remove the relevant section from the slurm.spec file as well.
-
Brian Christiansen authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Brian Christiansen authored
-
Brian Christiansen authored
Federated submissions
-
Brian Christiansen authored
e.g. allocation failure: Unspecified error
-
Brian Christiansen authored
-
Brian Christiansen authored
get_next_job_id() was returning a local id, and then the fed_mgr was turning that into a fed job id. This was a problem because get_next_job_id() couldn't check whether an existing job already had the fed job id; it was only checking for the local job id. This was exposed in tests that did a reconfigure: the reconfigure loaded in an old job_id_sequence, so the next job got an id that was already in use.
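The fix above amounts to checking uniqueness of the id in the same (federated) form that jobs are actually stored under. A sketch under assumed conventions (the bit layout and function names are hypothetical, chosen only for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical encoding: cluster id in the upper bits, local id in the
 * lower bits. The exact layout is assumed for this example. */
#define FED_LOCAL_ID_BITS 26
static uint32_t fed_job_id(uint32_t cluster_id, uint32_t local_id)
{
	return (cluster_id << FED_LOCAL_ID_BITS) | local_id;
}

static bool id_in_use(const uint32_t *jobs, int njobs, uint32_t id)
{
	for (int i = 0; i < njobs; i++)
		if (jobs[i] == id)
			return true;
	return false;
}

/* Advance the sequence past any local id whose *federated* form already
 * exists, e.g. after a reconfigure reloaded an old job_id_sequence. */
static uint32_t next_free_local_id(const uint32_t *jobs, int njobs,
				   uint32_t cluster_id, uint32_t seq)
{
	while (id_in_use(jobs, njobs, fed_job_id(cluster_id, seq)))
		seq++;
	return seq;
}
```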
-
Brian Christiansen authored
The logic to talk to the correct compute nodes still needs to be implemented. It will come at a later date.
-
Brian Christiansen authored
-
Brian Christiansen authored
Will submit using federation submission logic. Scheduling logic to come.
-
Brian Christiansen authored
to make sure job ptr is accessed within locks.
-
Brian Christiansen authored
In prep for refactoring _slurm_rpc_submit_batch_job to make sure the job_ptr is accessed within locks.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
It was picking a higher weighted federation cluster over lower weighted ones because it had an earlier start time. This shouldn't happen because that's what the weights are for. e.g.
will_run_resp for fed1: start:2016-10-13T15:19:47 sys_usage:0.00 weight:2
will_run_resp for fed2: start:2016-10-13T15:19:48 sys_usage:0.00 weight:1
will_run_resp for fed3: start:2016-10-13T15:19:48 sys_usage:0.00 weight:1
Earliest cluster:fed1 time:1476393587 now:1476393588
Submitted federated job 67119254 to fed1(self)
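The intended ordering in the message above is weight-first, with start time only breaking ties. A minimal comparator sketch mirroring the fields in the log (struct and function names are assumed for illustration; lower weight is treated as preferred, as the log implies):

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical per-cluster will_run response. */
struct will_run_resp {
	time_t start;    /* earliest start time offered */
	uint32_t weight; /* lower weight is preferred */
};

/* Pick the better of two responses: weight decides first; the earlier
 * start time only breaks a tie between equal weights. */
static const struct will_run_resp *
pick_cluster(const struct will_run_resp *a, const struct will_run_resp *b)
{
	if (a->weight != b->weight)
		return (a->weight < b->weight) ? a : b;
	return (a->start <= b->start) ? a : b;
}
```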
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
fedorigin, fedoriginraw, fedsiblings, fedsiblingsraw
-
Brian Christiansen authored
-
Brian Christiansen authored
-