  1. 10 Mar, 2011 5 commits
  2. 09 Mar, 2011 8 commits
  3. 08 Mar, 2011 16 commits
  4. 07 Mar, 2011 6 commits
  5. 06 Mar, 2011 5 commits
    • salloc: disable --no-shell mode · ca906680
      Moe Jette authored
      Since "aprun" is used on Cray instead of srun, the --no-shell option does not
      make any difference: with or without this option, the ALPS reservation is made, 
      and since it is confirmed using the SID of the current shell, aprun will run 
      even if the BASIL_RESERVATION_ID is not set.
      
      NB: the patch aborts with an error message. If this is later changed to a
          warning that continues processing, opt.no_shell should be turned off at
          that point, since otherwise interactive mode (and thus job control) is
          disabled.
    • slurmctld: remove dead code · 7b5a5dee
      Moe Jette authored
      return_hostlist is not populated in validate_nodes_via_front_end,
      hence never printed out.
    • select/cray: typos and outdated comments · 10f20cfc
      Moe Jette authored
      This commit
       * removes outdated and no longer applicable comments regarding
         consecutive node numbering (dating from an earlier revision);
       * fixes a typo and clarifies condition on XT/SeaStar systems.
    • libalps: use proper type for timestamps · d089c7c9
      Moe Jette authored
      This fixes an inconsistency: time_t is not necessarily u32, so a separate
      routine is used to parse the absolute value into a proper time_t.
      
      Also tidied up code where possible.
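
The type issue can be illustrated with a minimal sketch. The helper name `parse_timestamp` is hypothetical and is not the routine from libalps; it only shows one way to parse an absolute seconds-since-epoch value into a `time_t` instead of a fixed 32-bit integer, under the assumption that the input is a plain decimal string.

```c
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/*
 * Hypothetical helper (not the libalps routine): parse a decimal
 * seconds-since-epoch string into a time_t.
 * Returns 0 on success, -1 on malformed, negative or out-of-range input.
 */
static int parse_timestamp(const char *str, time_t *out)
{
	char *end;
	long long val;

	if (str == NULL || out == NULL || *str == '\0')
		return -1;

	errno = 0;
	val = strtoll(str, &end, 10);
	/* Reject overflow, trailing garbage and negative values. */
	if (errno != 0 || *end != '\0' || val < 0)
		return -1;

	/* Reject values that do not survive the conversion to time_t. */
	if ((long long)(time_t)val != val)
		return -1;

	*out = (time_t)val;
	return 0;
}
```

The round-trip check keeps the sketch independent of the width of `time_t` on a given platform, which is the point of not assuming u32.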
    • select/cray: handling errors in do_basil_release() · 70869e06
      Moe Jette authored
      This reduces the amount of error text printed on failure of do_basil_release():
       * parameter failures are caught by the existing calls to error(),
       * internal (ALPS) errors are printed by basil_release(),
       * there is no need to return additional error information via errno,
       * functions calling select_g_job_fini() just interpret the error, but no
         further action is taken, hence it is not necessary to indicate failure
         more than once.
      
      The following shows how setting SLURM_ERROR/errno produces unnecessarily long error text:
      
       [2011-02-09T18:19:51] debug2: Processing RPC: REQUEST_CANCEL_JOB_STEP uid=21215
       [2011-02-09T18:19:51] error: PERMANENT ALPS BACKEND error: ALPS error: apsched: No entry for resId 286
       [2011-02-09T18:19:51] error: releasing ALPS resId 286 for JobId 2940 FAILED with -5
       [2011-02-09T18:19:51] error: select_g_job_fini(2940): No error
      
      With the patch, only
       [2011-02-09T18:19:51] error: PERMANENT ALPS BACKEND error: ALPS error: apsched: No entry for resId 286
      would be printed, which is sufficient to diagnose the problem (resId 286 had been
      terminated by ALPS internally, after not receiving a confirmation quickly enough).
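
The idea of reporting a failure exactly once can be sketched as follows. All function names here (`log_error`, `backend_release`, `release_reservation`, `job_fini`) are hypothetical stand-ins and not the SLURM or libalps functions, and the rule that only reservation 1 exists is an assumption made purely for illustration. The lowest layer prints the error; the layers above only pass the status along.

```c
#include <stdio.h>

/*
 * All names below are hypothetical stand-ins for illustration;
 * they are not the SLURM or libalps functions.
 */
static int error_lines;	/* number of error lines emitted so far */

static void log_error(const char *msg, unsigned int resv_id)
{
	fprintf(stderr, "error: %s (resId %u)\n", msg, resv_id);
	error_lines++;
}

/* Lowest layer: the only place where the backend failure is reported. */
static int backend_release(unsigned int resv_id)
{
	/* Assumption for the sketch: only reservation 1 exists. */
	if (resv_id != 1) {
		log_error("backend has no such reservation", resv_id);
		return -1;
	}
	return 0;
}

/* Middle layer: passes the status up without logging it again. */
static int release_reservation(unsigned int resv_id)
{
	return backend_release(resv_id);
}

/* Top layer: interprets the status, adds no further error text. */
static int job_fini(unsigned int resv_id)
{
	return release_reservation(resv_id) == 0 ? 0 : -1;
}
```

In this sketch a failed release produces a single error line no matter how many layers sit between the backend call and the caller.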