- 15 Jul, 2011 4 commits
-
-
Morris Jette authored
If the srun wrapper is executed when there is no job allocation, then create an allocation using salloc and call the srun wrapper again so that we can configure memory limits in aprun's execute line. Without this change, aprun would lack the memory allocation information and the task launch would fail if the job were allocated less than the full node.
-
Morris Jette authored
Prevent duplicate arguments to aprun from the srun.pl wrapper. This could happen if the command line included "--alps" arguments plus other arguments generated by the normal srun options. For example: srun -t 5 --alps="-t300" a.out specifies the job time limit in two places.
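The de-duplication described above can be sketched as follows. This is a minimal illustration in Python, not the srun.pl code (which is Perl). The function name `merge_aprun_args`, the dictionary input, and the rule that options given through "--alps" override generated ones are all assumptions made for the example.

```python
import shlex


def merge_aprun_args(generated, alps_string):
    """Merge wrapper-generated aprun options with an explicit --alps string.

    generated:   dict mapping a flag (e.g. "-t") to its value or None.
    alps_string: raw text passed via --alps, e.g. "-t300".
    Assumption for this sketch: an explicit --alps option replaces a
    generated option with the same flag, so each flag appears only once.
    """
    merged = dict(generated)
    tokens = shlex.split(alps_string)
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("-") and not tok.startswith("--") and len(tok) > 2:
            # Short flag with attached value, e.g. "-t300"
            flag, value = tok[:2], tok[2:]
        elif i + 1 < len(tokens) and not tokens[i + 1].startswith("-"):
            # Flag followed by a separate value, e.g. "-N 2"
            flag, value = tok, tokens[i + 1]
            i += 1
        else:
            # Flag without a value
            flag, value = tok, None
        merged[flag] = value
        i += 1

    # Flatten back into an argument list, one entry per flag
    args = []
    for flag, value in merged.items():
        args.append(flag)
        if value is not None:
            args.append(value)
    return args
```

With this sketch, a generated "-t 300" combined with --alps="-t300" yields a single "-t 300" instead of two time-limit arguments.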
-
Danny Auble authored
-
Danny Auble authored
-
- 14 Jul, 2011 4 commits
-
-
Morris Jette authored
Set the SLURM_MEM_PER_CPU or SLURM_MEM_PER_NODE environment variable for both interactive (salloc) and batch jobs if the job has a memory limit. For Cray systems, also set the CRAY_AUTO_APRUN_OPTIONS environment variable with the memory limit.
-
Morris Jette authored
Clarify in the srun (aprun wrapper) which options apply to an existing job allocation or new allocation and which are not applicable to Cray computers.
-
Danny Auble authored
asking for less than 1 mb per PE.
-
Morris Jette authored
Correction to srun man page. Get SIGINT working when srun spawns salloc.
-
- 13 Jul, 2011 3 commits
-
-
Morris Jette authored
For front-end configurations (Cray and IBM BlueGene), bind each batch job to a unique CPU to limit the damage that a single job can cause. Previously, any single job could use all CPUs, causing problems for other jobs or system daemons. This addresses a problem reported by Steve Trofinoff, CSCS.
-
Morris Jette authored
There was a table added, but it all ran together without being put into a table or list. I changed it to an unordered list.
-
Morris Jette authored
-
- 12 Jul, 2011 9 commits
-
-
Danny Auble authored
enforce memory limits.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
man pages. Patch by Nancy Kritkausky, Bull.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
Bill Brophy, Bull.
-
Morris Jette authored
Note the job and partition state file formats have changed and RPCs with information for jobs and partitions have changed.
-
- 11 Jul, 2011 1 commit
-
-
Morris Jette authored
-
- 08 Jul, 2011 1 commit
-
-
Morris Jette authored
code in smap was referencing a bad symbol name. The relevant code was commented out, but is needed for BlueGene/Q.
-
- 07 Jul, 2011 5 commits
-
-
dannyauble authored
Typo fix to sample suspend and resume scripts.
-
Morris Jette authored
Make the init_wires() definition and use consistent for BGL, BGP, and BGQ systems. No argument is needed or used on any system.
-
Danny Auble authored
up correctly when starting.
-
Danny Auble authored
-
Danny Auble authored
link to "disclaimer.html"
-
- 06 Jul, 2011 4 commits
-
-
Morris Jette authored
Fix bug in generic resource tracking of gres associated with specific CPUs. Resources were being over-allocated.
-
Morris Jette authored
Remove vestigial (unused) variable from smap.
-
Don Lipari authored
-
Morris Jette authored
Fix memory buffering bug if an AllowGroups parameter of a partition has 100 or more users. Patch by Andriy Grytsenko (Massive Solutions Limited).
-
- 05 Jul, 2011 3 commits
-
-
Morris Jette authored
Add cgroup support for device files in both the task/cgroup plugin and generic resource (GRES) logic. Based upon a patch by Yiannis Georgiou.
-
Morris Jette authored
When suspending a job, wait 2 seconds instead of 1 second between sending SIGTSTP and SIGSTOP. Some MPI implementations were not stopping within the 1 second delay.
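The two-step suspend sequence can be sketched as below. This is an illustrative Python sketch for a POSIX system, not Slurm's C implementation. The function name `suspend_process` and the injectable `kill` and `sleep` parameters are assumptions added so the sequence can be exercised without real processes.

```python
import os
import signal
import time


def suspend_process(pid, delay=2.0, kill=os.kill, sleep=time.sleep):
    """Suspend a process in two steps.

    SIGTSTP is sent first because it can be caught, giving the process
    (for example an MPI runtime) a chance to stop cleanly. After the
    delay, SIGSTOP, which cannot be caught, stops it unconditionally.
    """
    kill(pid, signal.SIGTSTP)
    sleep(delay)
    kill(pid, signal.SIGSTOP)
```

The default delay of 2 seconds mirrors the value named in the commit message.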
-
Morris Jette authored
Add contribs/arrayrun tool providing support for job arrays. Contributed by Bjørn-Helge Mevik, University of Oslo. NOTE: Not currently packaged as RPM and manual file editing is required.
-
- 02 Jul, 2011 1 commit
-
-
Morris Jette authored
If a job needed to preempt other jobs to start and those jobs had not completed by the time of the next scheduling cycle, other jobs might be selected for preemption in that next cycle, resulting in more jobs being preempted than necessary.
-
- 01 Jul, 2011 3 commits
-
-
Morris Jette authored
Previous logic reported the run time as the current time minus the job start time, ignoring any suspended time.
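The difference between the two calculations can be sketched as follows. This is a hypothetical Python illustration. The function names and the idea of passing in a precomputed total of suspended seconds are assumptions, and the way Slurm itself tracks suspended time is not shown here.

```python
def run_time_ignoring_suspend(now, start_time):
    """Previous behaviour as described: wall-clock time since job start."""
    return now - start_time


def run_time_excluding_suspend(now, start_time, suspended_seconds):
    """Run time with the suspended interval subtracted.

    suspended_seconds is assumed to be the total time the job has spent
    suspended so far. The result is clamped at zero.
    """
    return max(0, (now - start_time) - suspended_seconds)
```

For a job that started at time 400, is observed at time 1000, and was suspended for 100 seconds, the first function reports 600 seconds and the second reports 500.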
-
Morris Jette authored
Export the symbol s_p_hashtbl_destroy as slurm_s_p_hashtbl_destroy in libslurm so that external programs can link to libslurm and reference select/cray without undefined symbols.
-
Morris Jette authored
If the gres/gpu plugin is used, then set the CUDA_VISIBLE_DEVICES environment variable to "NoDevFiles" if the gres.conf file has no device files identified for generic resources (GRES) of type GPU. Otherwise set the device file sequence number(s).
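The rule for choosing the variable's value can be sketched as below. This is a Python illustration only, not the gres/gpu plugin code. The function name, the list inputs, and the comma-separated output format for multiple sequence numbers are assumptions.

```python
def cuda_visible_value(device_files, allocated_indexes):
    """Return the value to export for the job.

    device_files:      GPU device files listed in gres.conf (may be empty).
    allocated_indexes: sequence numbers of the device files allocated to
                       the job (assumed to be zero-based positions).
    """
    if not device_files:
        # No device files configured, so no sequence numbers exist
        return "NoDevFiles"
    return ",".join(str(i) for i in allocated_indexes)
```

For example, with device files "/dev/nvidia0" and "/dev/nvidia1" configured and both allocated, the sketch returns "0,1"; with no device files configured it returns "NoDevFiles".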
-
- 30 Jun, 2011 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
Enhancements to sched/backfill performance with the select/cons_res plugin. Major improvements should be seen with large job counts. Based upon the bf_build_row_bitmaps_2.2.6.patch from Bjørn-Helge Mevik, University of Oslo.
-