    Fix backfill scheduler race condition · d8b18ff8
    Morris Jette authored
    Fix backfill scheduler race condition that could leave an invalid pointer
        in the select/cons_res plugin. Bug introduced in 15.08.9, commit
        efd9d35e.
    
    The scenario is as follows:
    1. Backfill scheduler is running, then releases locks
    2. Main scheduling loop starts a job "A"
    3. Backfill scheduler resumes, finds job "A" in its queue and
       resets its partition pointer.
    4. Job "A" completes and tries to remove resource allocation record
       from select/cons_res data structure, but fails to find it because
       it is looking in the table for the wrong partition.
    5. Job "A" record gets purged from slurmctld
    6. Select/cons_res plugin attempts to operate on resource allocation
       data structure, finds a pointer into the now-purged data structure
       of job "A" and aborts or gets a SEGV.
    Bug 2603
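
The scenario above can be replayed in a small toy model. This is only an illustrative sketch, not Slurm code: the `Job` class, the `simulate` function, the partition names "debug" and "batch", and the dictionary-based allocation tables are all hypothetical stand-ins for the job record, its partition pointer, and the select/cons_res per-partition data. It shows only why a removal keyed on a changed partition pointer leaves a stale record behind.

```python
# Toy model of the race described above. All names here are hypothetical;
# none of them are Slurm's real identifiers or data structures.


class Job:
    def __init__(self, name, partition):
        self.name = name
        # Stands in for the job record's partition pointer.
        self.partition = partition


def simulate(backfill_resets_pointer):
    """Replay steps 2-6 and return the names of stale allocation records."""
    # One allocation table per partition (assumed layout for illustration).
    alloc_tables = {"debug": {}, "batch": {}}
    job = Job("A", "debug")

    # Step 2: the main scheduling loop starts the job in "debug".
    alloc_tables[job.partition][job.name] = job

    # Step 3: backfill resumes with a stale queue entry and resets the
    # job's partition pointer.
    if backfill_resets_pointer:
        job.partition = "batch"

    # Step 4: the job completes; removal looks in the table selected by the
    # job's current partition pointer, so it may miss the real record.
    alloc_tables[job.partition].pop(job.name, None)

    # Steps 5-6: once the job record is purged, any record still present
    # in a table would be a dangling reference.
    return sorted(name for table in alloc_tables.values() for name in table)
```

Calling `simulate(True)` leaves the record for job "A" in the "debug" table, while `simulate(False)` removes it cleanly. In the real C code the leftover entry is a pointer into freed memory rather than a live Python object, which is why the outcome there is an abort or SEGV.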