- 11 May, 2012 10 commits
-
-
Danny Auble authored
QOS, or any combination of those, the correct thing happens. If the job is using a QOS or partition that only works inside a reservation, deny the update if it would only remove the reservation.
-
Danny Auble authored
we will read the old reservation used, so if the job uses a partition or QOS that can only be used inside a reservation we won't fail.
-
Danny Auble authored
updating a job, so as to make sure the reservation is set correctly when doing the other checks
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
previous versions of SLURM
-
Danny Auble authored
-
Bill Brophy authored
Original Patch from Bill Brophy (Group Bull)
-
Bill Brophy authored
if a reservation is also requested in the job. Original Patch from Bill Brophy (Group Bull)
-
- 10 May, 2012 1 commit
-
-
Morris Jette authored
-
- 09 May, 2012 6 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Don Lipari authored
The symptom is that SLURM schedules lower priority jobs to run when higher priority, dependent jobs have their dependencies satisfied. This happens because dependent jobs still have a priority of 1 when the job queue is sorted in the schedule() function. The proposed fix forces jobs to have their priority updated when their dependencies are satisfied.
-
Danny Auble authored
-
Morris Jette authored
-
- 07 May, 2012 4 commits
-
-
Morris Jette authored
Job priority of 1 is no longer used as a special case in SLURM v2.4
-
Morris Jette authored
-
Morris Jette authored
-
Don Lipari authored
The commit 8b14f388 on Jan 19, 2011 is causing problems with Moab cluster-scheduled machines. In this case, Moab hands every submitted job off immediately to SLURM, where it gets a zero priority. Once Moab schedules the job, Moab raises the job's priority to 10,000,000 and the job runs.

When you happen to restart the slurmctld under such conditions, the sync_job_priorities() function runs, which attempts to raise job priorities into a higher range if they are getting too close to zero. The problem as I see it is that you include the "boost" for zero-priority jobs. Hence the problem we are seeing is that once the slurmctld is restarted, a bunch of zero-priority jobs are suddenly eligible. So there becomes a disconnect between the top-priority job Moab is trying to start and the top-priority job SLURM sees.

I believe the fix is simple:

    diff job_mgr.c~ job_mgr.c
    6328,6329c6328,6331
    < 	while ((job_ptr = (struct job_record *) list_next(job_iterator)))
    < 		job_ptr->priority += prio_boost;
    ---
    > 	while ((job_ptr = (struct job_record *) list_next(job_iterator))) {
    > 		if (job_ptr->priority)
    > 			job_ptr->priority += prio_boost;
    > 	}

Do you agree?
Don
-
- 04 May, 2012 4 commits
-
-
Nathan Yee authored
-
Danny Auble authored
developments.
-
Bjørn-Helge Mevik authored
from Bjørn-Helge Mevik
-
Danny Auble authored
-
- 03 May, 2012 5 commits
-
-
Morris Jette authored
-
Matthieu Hautreux authored
-
Matthieu Hautreux authored
Here is the way to reproduce it:

    [root@cuzco27 georgioy]# salloc -n64 -N4 --exclusive
    salloc: Granted job allocation 8
    [root@cuzco27 georgioy]# srun -r 0 -n 30 -N 2 sleep 300 &
    [root@cuzco27 georgioy]# srun -r 1 -n 40 -N 3 sleep 300 &
    [root@cuzco27 georgioy]# srun: error: slurm_receive_msg: Zero Bytes were transmitted or received
    srun: error: Unable to create job step: Zero Bytes were transmitted or received
-
Morris Jette authored
-
Danny Auble authored
honored correctly. I also put in notes where the values are not to be altered.
-
- 02 May, 2012 10 commits
-
-
Morris Jette authored
* Specify MinNodes via "scontrol update partition".
* Whenever the zero-node allocation ends, the frontend node is left in a COMPLETING state until "scontrol reconfigure" is issued (this doesn't appear to impact the performance of the frontend node, as other jobs can still be submitted, including other zero-node jobs).
-
Danny Auble authored
system of different size than actual hardware.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
handled.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Martin Perry authored
cpus in task/cgroup plugin
-