    salloc: add support for Cray · c036763e
    Moe Jette authored
    This adds support for execution of salloc on a local Cray system,
    disabling node sharing (still not supported on XT/XE).
    
    It further disables running salloc within salloc, as it leads to errors: since
    Cray uses process group / PAGG IDs for tracking its reservations, running
    salloc from within salloc invariably leads to an ALPS resource allocation error.
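
    A minimal sketch of how a nested invocation could be detected. It assumes
    detection via the SLURM_JOB_ID environment variable, which salloc exports
    to the shell it spawns; the helper name inside_existing_allocation() is
    hypothetical and the check in this commit may work differently.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not taken from the commit): report whether this
 * process appears to run inside an existing Slurm allocation.
 * Assumption: the parent salloc exported SLURM_JOB_ID into the environment.
 */
static bool inside_existing_allocation(void)
{
	const char *job_id = getenv("SLURM_JOB_ID");

	/* Treat an unset or empty variable as "no enclosing allocation". */
	return job_id != NULL && job_id[0] != '\0';
}
```

    A caller would refuse to start a second salloc on a Cray system when this
    returns true, rather than let ALPS fail later with an allocation error.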
    
    Thirdly, it disables Cray node allocation on non-Cray systems, since this
    requires that the host on which salloc spawns the shell process is capable
    of Cray task launch.
    
    If it is not, then the remote slurmctld will reserve the requested nodes, but
    the local host running salloc will neither be able to confirm the ALPS
    reservation (due to the absence of a local apbasil command), nor be able
    to run jobs on the compute nodes.
    
    To distinguish this case from general task launch (we use a frontend host where
    salloc could end up running jobs on different clusters, depending on the value
    exported via $SLURM_CONF), the following conditions are tested:
    
     * Cray build support has been enabled (HAVE_CRAY);
     * the loaded slurm.conf uses select/cray (required on Cray hosts);
     * the local host does not have support for apbasil (HAVE_NATIVE_CRAY undefined).
    
    Since the 'apbasil' command is only available on native Cray systems, this
    combination of conditions seems sufficient to prevent accidentally using
    salloc on a host which does not support it.
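
    The combined test can be sketched as follows. In the source tree, HAVE_CRAY
    and HAVE_NATIVE_CRAY are compile-time macros; this illustration models them
    as boolean parameters so the logic can be exercised directly. The function
    name salloc_unsupported_here() and its signature are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical illustration of the three conditions listed above.
 *   have_cray        - stands in for the HAVE_CRAY build macro
 *   have_native_cray - stands in for the HAVE_NATIVE_CRAY build macro
 *   select_type      - select plugin named by the loaded slurm.conf
 * Returns true when salloc should refuse to run on this host.
 */
static bool salloc_unsupported_here(bool have_cray, bool have_native_cray,
				    const char *select_type)
{
	bool uses_select_cray = select_type != NULL &&
				strcmp(select_type, "select/cray") == 0;

	/* Cray build, Cray configuration, but no local apbasil support. */
	return have_cray && uses_select_cray && !have_native_cray;
}
```

    For example, a frontend host built with Cray support but lacking apbasil
    would be rejected only while $SLURM_CONF points at a select/cray
    configuration; with any other select plugin the same binary still runs.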
    
    (For sbatch the case is different, since the job script runs on the remote host.)
    
    11_salloc.diff
    done with minor change for Cray emulation