- 22 Jun, 2012 1 commit
-
-
Danny Auble authored
-
- 21 Jun, 2012 4 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
The underlying problem is in the sched plugin logic in SLURM v2.4
-
- 20 Jun, 2012 4 commits
-
-
Danny Auble authored
but not node count the node count is correctly figured out.
-
Morris Jette authored
Without this fix, gang scheduling mode could start without creating a list, resulting in an assertion failure when jobs are submitted.
-
Morris Jette authored
This change permits a user to get a zero size allocation by specifying a task count of zero with no node count specification.
-
Morris Jette authored
-
- 18 Jun, 2012 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
packing the step layout structure.
-
Danny Auble authored
we must use a small block instead of a shared midplane block.
-
- 15 Jun, 2012 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
-
- 13 Jun, 2012 4 commits
-
-
Danny Auble authored
still messages we find when we poll but haven't given it back to the real time yet.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
- 12 Jun, 2012 3 commits
-
-
Danny Auble authored
-
Nathan Yee authored
-
Danny Auble authored
-
- 11 Jun, 2012 2 commits
-
-
Danny Auble authored
-
Martin Perry authored
-
- 07 Jun, 2012 1 commit
-
-
Danny Auble authored
-
- 05 Jun, 2012 4 commits
-
-
Phil Eckert authored
I was doing some checking to find out why the 2.4 branch and master branch of schedmd were not allowing held jobs to be modified. When attempting to do so, scontrol would return:
  slurm_update error: Requested partition configuration not available now
I did some debugging and found that it was caused by code added to the tail end of job_limits_check() in job_mgr.c. It had this addition:
  } else if (job_ptr->priority == 0) { /* user or administrator hold */
      fail_reason = WAIT_HELD;
  }
It causes all modifications done by scontrol on held jobs to fail.
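The failure reported in this commit message can be modelled with a small sketch. It is not the SLURM source: the struct, the enum, and both function names are hypothetical stand-ins, and the assumption that the update path rejects a job whenever the limits check returns any reason at all is inferred from the report, not confirmed by it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real SLURM reason codes. */
enum fail_reason_sketch {
    SKETCH_NO_REASON = 0,
    SKETCH_WAIT_HELD
};

/* Hypothetical, heavily reduced job record. */
struct job_sketch {
    uint32_t priority; /* 0 means user or administrator hold */
};

/* Models only the quoted tail of the limits check: a held job
 * (priority == 0) is reported with a "held" reason. */
enum fail_reason_sketch limits_check_sketch(const struct job_sketch *job_ptr)
{
    enum fail_reason_sketch fail_reason = SKETCH_NO_REASON;

    assert(job_ptr != NULL);
    if (job_ptr->priority == 0) {
        /* user or administrator hold */
        fail_reason = SKETCH_WAIT_HELD;
    }
    return fail_reason;
}

/* Assumed caller behaviour: any reason other than "none" causes the
 * update request to be refused. */
bool update_allowed_sketch(const struct job_sketch *job_ptr)
{
    return limits_check_sketch(job_ptr) == SKETCH_NO_REASON;
}
```

Under these assumptions, a job with priority 0 can never pass the check, so every update of a held job is refused, which matches the symptom described in the report.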
-
Don Lipari authored
I'd like to propose quieting down the job_mgr a tad. This is a refinement to: https://github.com/SchedMD/slurm/commit/30a986f4c600291876f4ec3e3949934512f2cba5
-
Danny Auble authored
a job kill timeout aren't always reported to the system. This is now handled by the runjob_mux plugin.
-
Danny Auble authored
-
- 04 Jun, 2012 1 commit
-
-
Rod Schultz authored
I'd like to add the following disclaimer to the documentation of the --mem option to the salloc/sbatch/srun commands. There is currently similar wording in the slurm.conf file, but I've received a bug report in which the memory limits were exceeded (until the next accounting poll). NOTE: Enforcement of memory limits currently requires enabling of accounting, which samples memory use on a periodic basis (data need not be stored, just collected). A task may exceed the memory limit until the next periodic accounting sample. Rod Schultz, Bull
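The effect described in the NOTE can be illustrated with a minimal sketch of sample-based enforcement. It is not SLURM code: the function name, the tick-based time model, and the choice to sample at tick 0 and every `sample_interval` ticks after that are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the first tick at which a periodic sampler would observe
 * memory use above the limit, or -1 if no sample ever sees it.
 * usage_mb[t] is the (hypothetical) memory use of a task at tick t;
 * samples are taken at ticks 0, sample_interval, 2*sample_interval, ... */
long first_detection_tick(const unsigned long *usage_mb, size_t n_ticks,
                          unsigned long limit_mb, size_t sample_interval)
{
    assert(usage_mb != NULL);
    assert(sample_interval > 0);

    for (size_t t = 0; t < n_ticks; t += sample_interval) {
        if (usage_mb[t] > limit_mb)
            return (long)t;
    }
    return -1;
}
```

With a longer sampling interval, the tick returned can be later than the tick at which usage first exceeded the limit, and a spike that falls entirely between two samples is never seen at all.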
-
- 01 Jun, 2012 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
sub-blocks.
-
Danny Auble authored
-
Danny Auble authored
to make a larger small block and are running with sub-blocks.
-
- 31 May, 2012 2 commits
-
-
Danny Auble authored
function didn't always work correctly.
-
Danny Auble authored
rerun autogen.sh
-
- 30 May, 2012 3 commits
-
-
Danny Auble authored
the next step in the allocation only uses part of the allocation it gets the correct cnodes.
-
Morris Jette authored
-
Andy Wettstein authored
In etc/init.d/slurm move check for scontrol after sourcing /etc/sysconfig/slurm. Patch from Andy Wettstein, University of Chicago.
-
- 29 May, 2012 1 commit
-
-
Don Lipari authored
-
- 25 May, 2012 1 commit
-
-
Morris Jette authored
According to man slurm.conf, the default for NodeAddr is NodeName: "By default, the NodeAddr will be identical in value to NodeName."
However, it seems the default is NodeHostname (when that differs from NodeName). With the following in slurmnodes.conf:
  Nodename=c0-0 NodeHostname=compute-0-0 ...
I get
  NodeName=c0-0 Arch=x86_64 CoresPerSocket=2
  CPUAlloc=0 CPUErr=0 CPUTot=4 Features=intel,rack0,hugemem
  Gres=(null)
  *** NodeAddr=compute-0-0 NodeHostName=compute-0-0 ***
  OS=Linux RealMemory=3949 Sockets=2
  State=IDLE ThreadsPerCore=1 TmpDisk=10000 Weight=1027
  BootTime=2012-05-08T15:07:08 SlurmdStartTime=2012-05-25T10:30:10
(This is with 2.4.0-0.pre4.)
(We are planning to use cx-y instead of compute-x-y (the rocks default) on our next cluster, to save some typing.)
--
Regards, Bjørn-Helge Mevik, dr. scient, Research Computing Services, University of Oslo
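The behaviour observed in this report can be written down as a fallback chain. The sketch below is an assumption based only on the output quoted in the message, not on the SLURM configuration parser; the function name and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Picks the address a node would be contacted on, following the
 * behaviour observed in the report: an explicit NodeAddr wins, then
 * NodeHostname, and only then NodeName. NULL or "" means "not set". */
const char *effective_node_addr(const char *node_name,
                                const char *node_hostname,
                                const char *node_addr)
{
    assert(node_name != NULL);

    if (node_addr != NULL && node_addr[0] != '\0')
        return node_addr;
    if (node_hostname != NULL && node_hostname[0] != '\0')
        return node_hostname;
    return node_name;
}
```

For the configuration quoted in the message (NodeName c0-0, NodeHostname compute-0-0, no NodeAddr), this chain yields compute-0-0, whereas the man page wording quoted there would suggest c0-0.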
-