1. 22 Apr, 2011 13 commits
  2. 21 Apr, 2011 4 commits
  3. 20 Apr, 2011 6 commits
  4. 19 Apr, 2011 6 commits
  5. 18 Apr, 2011 6 commits
  6. 17 Apr, 2011 5 commits
    • adds a bunch of cosmetic changes from Gerrit. · 24a02ef9
      Moe Jette authored
    • job_submit/lua: expose priority-related job_record fields to lua interface · b22385b7
      Moe Jette authored
      This allows scripted modification of job records, by exposing the
       * job_ptr->direct_set_prio
       * job_ptr->priority
       * job_ptr->details->nice
      fields to the job_submit.lua script.
    • slurmctld: allow job_submit plugin to modify/set the priority/nice values · 98d7059f
      Moe Jette authored
      This allows the job_submit plugin to directly set priority values. If it
      assigns a priority value different from 0 and NO_VAL, the priority is marked
      as "fixed" via job_ptr->direct_set_prio.
      
      To enable this, the permission check for a directly set priority is now done
      before calling the job_submit plugin, which also allows the plugin to
      influence the job's nice value.
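The rule described above can be modelled in a few lines. This is a minimal sketch and not the slurmctld code: `JobRecord` and `apply_plugin_priority` are hypothetical names, and the numeric value of `NO_VAL` is an assumption that should be checked against the Slurm headers.

```python
from dataclasses import dataclass

# Assumed sentinel meaning "no value set"; verify against slurm.h.
NO_VAL = 0xFFFFFFFE


@dataclass
class JobRecord:
    """Hypothetical stand-in for the job_record fields named in the commit."""
    priority: int = NO_VAL
    direct_set_prio: bool = False


def apply_plugin_priority(job: JobRecord, plugin_priority: int) -> JobRecord:
    """Store the priority chosen by the plugin and flag it as fixed if it is real."""
    job.priority = plugin_priority
    # A priority of 0 or NO_VAL is not treated as a directly set ("fixed") priority.
    job.direct_set_prio = plugin_priority not in (0, NO_VAL)
    return job
```

In this model only a value other than 0 and NO_VAL sets the `direct_set_prio` flag, mirroring the condition stated in the commit message.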
    • job_submit: allow job_submit plugin to put job on hold · fecf4769
      Moe Jette authored
      This reorders the code of _job_create() so that the job_submit plugin can
      put a job on hold (by setting the job priority to 0). To prevent the user
      from releasing such jobs, jobs put on hold by the job_submit plugin use
      WAIT_HELD rather than WAIT_HELD_USER.
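The distinction between the two hold reasons can be illustrated as follows. This is a sketch under stated assumptions: the enum, `hold_reason`, and `user_may_release` are hypothetical helpers, and the rule that a user-requested hold maps to WAIT_HELD_USER is assumed here rather than taken from the commit.

```python
from enum import Enum
from typing import Optional


class HoldReason(Enum):
    """Hypothetical mirror of the two hold reasons named in the commit."""
    WAIT_HELD = "WAIT_HELD"
    WAIT_HELD_USER = "WAIT_HELD_USER"


def hold_reason(priority: int, set_by_plugin: bool) -> Optional[HoldReason]:
    """Return the hold reason for a job, or None if the job is not held."""
    if priority != 0:
        return None  # only a priority of 0 puts the job on hold
    # Holds placed by the job_submit plugin use WAIT_HELD.
    return HoldReason.WAIT_HELD if set_by_plugin else HoldReason.WAIT_HELD_USER


def user_may_release(reason: Optional[HoldReason]) -> bool:
    """In this model, a user can release only holds marked WAIT_HELD_USER."""
    return reason is HoldReason.WAIT_HELD_USER
```

Under this model a job held by the plugin stays held until someone with more than user privileges releases it, which is the effect the commit message describes.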
    • select/cray: unconditionally release reservations · 6d36b50c
      Moe Jette authored
      This increases robustness in releasing ALPS reservations. Previously
      the reservation was only released through
       * select_g_job_fini() for interactive (salloc) sessions;
       * batch_finish() by slurmstepd for batch sessions.
      
      That arrangement was a single point of failure for batch jobs, since a
      failure of batch_finish() would mean that the reservation could only be
      released much later, through the detection of orphaned ALPS reservations in
      basil_inventory().
      
      With this patch, the RELEASE method is called twice for batch jobs that
      terminate normally: first in job_complete(), and then in batch_finish(). The Basil
      1.2 design document by Ben Landsteiner (dated 15 Feb 2011) suggests in section
      3.3.5 repeated calls of RELEASE as one possible way of improving the response
      of the RELEASE method. There will be additional "entry not found" messages in
      the apschedMMDD logs, but (due to the preceding patch) not in the SLURM logs.
      
      For jobs that have to be terminated (e.g. job_timed_out, job_requeue, job_fail),
      this patch means that RELEASE is called much sooner, which is expected to
      improve efficiency.
      
      For interactive salloc sessions that are cancelled via scancel, there is no
      longer a warning message about the ALPS reservation that has already been
      released (the release happens first through select_p_job_signal and then
      through job_complete -> deallocate_nodes -> select_p_job_fini).
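Calling RELEASE twice is harmless only if the second call is treated as a no-op. The sketch below models that behaviour with a hypothetical `ReservationTable` class. It does not use the ALPS/Basil API, and the log text is illustrative apart from the "entry not found" wording quoted in the commit message.

```python
class ReservationTable:
    """Hypothetical model of a reservation store with a tolerant release."""

    def __init__(self) -> None:
        self._active = set()
        self.log = []  # collects messages for releases that found nothing

    def reserve(self, resv_id: int) -> None:
        """Record a reservation as active."""
        self._active.add(resv_id)

    def release(self, resv_id: int) -> bool:
        """Release a reservation; a repeated release is logged, not an error."""
        if resv_id in self._active:
            self._active.remove(resv_id)
            return True
        self.log.append(f"entry not found: {resv_id}")
        return False
```

With this shape, the first release (for example from the job completion path) frees the reservation, and a later release of the same reservation only adds a log entry.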