- 05 Jun, 2012 4 commits
-
-
Phil Eckert authored
I was doing some checking to find out why the the 2.4 branch and master branch of schedmd was not allowing held jobs to be modified, when attempting to do so, scontrol would return: slurm_update error: Requested partition configuration not available now I did some debugging and found that it was caused by code added to the tail end of job_limits_check() in job_mgr.c. It had this addition: } else if (job_ptr->priority == 0) { /* user or administrator hold */ fail_reason = WAIT_HELD; } It is causes all modifications done by scontrol on held jobs, to fail.
-
Don Lipari authored
I'd like to propose quieting down the job_mgr a tad. This is a refinement to: https://github.com/SchedMD/slurm/commit/30a986f4c600291876f4ec3e3949934512f2cba5
-
Danny Auble authored
a job kill timeout aren't always reported to the system. This is now handled by the runjob_mux plugin.
-
Danny Auble authored
-
- 04 Jun, 2012 1 commit
-
-
Rod Schultz authored
I'd like to add the following disclaimer to the documentation of the --mem option to the salloc/sbatch/srun commands. There is currently similar wording in the slurm.conf file, but I've received a bug report in which the memory limits were exceeded (until the next accounting poll). NOTE: Enforcement of memory limits currently requires enabling of accounting, which samples memory use on a periodic basis (data need not be stored, just collected). A task may exceed the memory limit until the next periodic accounting sample. Rod Schultz, Bull
-
- 01 Jun, 2012 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
sub-blocks.
-
Danny Auble authored
-
Danny Auble authored
to make a larger small block and are running with sub-blocks.
-
- 31 May, 2012 2 commits
-
-
Danny Auble authored
function didn't always work correctly.
-
Danny Auble authored
rerun autogen.sh
-
- 30 May, 2012 3 commits
-
-
Danny Auble authored
the next step in the allocation only uses part of the allocation it gets the correct cnodes.
-
Morris Jette authored
-
Andy Wettstein authored
In etc/init.d/slurm move check for scontrol after sourcing /etc/sysconfig/slurm. Patch from Andy Wettstein, University of Chicago.
-
- 29 May, 2012 1 commit
-
-
Don Lipari authored
-
- 25 May, 2012 3 commits
-
-
Morris Jette authored
According to man slurm.conf, the default for NodeAddr is NodeName: "By default, the NodeAddr will be identical in value to NodeName." However, it seems the default is NodeHostname (when that differs from NodeName): With the following in slurmnodes.conf: Nodename=c0-0 NodeHostname=compute-0-0 ... I get NodeName=c0-0 Arch=x86_64 CoresPerSocket=2 CPUAlloc=0 CPUErr=0 CPUTot=4 Features=intel,rack0,hugemem Gres=(null) *** NodeAddr=compute-0-0 NodeHostName=compute-0-0 *** OS=Linux RealMemory=3949 Sockets=2 State=IDLE ThreadsPerCore=1 TmpDisk=10000 Weight=1027 BootTime=2012-05-08T15:07:08 SlurmdStartTime=2012-05-25T10:30:10 (This is with 2.4.0-0.pre4.) (We are planning to use cx-y instead of compute-x-y (the rocks default) on our next cluster, to save some typing.) -- Regards, Bjørn-Helge Mevik, dr. scient, Research Computing Services, University of Oslo
-
Rod Schultz authored
This change makes the code consistent with the documentation. Note that "bf_res=" will continue to be recognized for now. Patch from Rod Schultz, Bull.
-
Don Albert authored
I have implemented the changes as you suggested: using a "-dd" option to indicate that the display of the script is wanted, and setting both the "SHOW_DETAIL" and a new "SHOW_DETAIL2" flag. Since "scontrol" can be run interactively as well, I added a new "script" option to indicate that display of both the script and the details is wanted if the job is a batch job. Here are the man page updates for "man scontrol". For the "-d, --details" option: -d, --details Causes the show command to provide additional details where available. Repeating the option more than once (e.g., "-dd") will cause the show job command to also list the batch script, if the job was a batch job. For the interactive "details" option: details Causes the show command to provide additional details where available. Job information will include CPUs and NUMA memory allocated on each node. Note that on computers with hyperthreading enabled and SLURM configured to allocate cores, each listed CPU represents one physical core. Each hyperthread on that core can be allocated a separate task, so a job's CPU count and task count may differ. See the --cpu_bind and --mem_bind option descriptions in srun man pages for more information. The details option is currently only supported for the show job command. To also list the batch script for batch jobs, in addition to the details, use the script option described below instead of this option. And for the new interactive "script" option: script Causes the show job command to list the batch script for batch jobs in addition to the detail informa- tion described under the details option above. Attached are the patch file for the changes and a text file with the results of the tests I did to check out the changes. The patches are against SLURM 2.4.0-rc1. -Don Albert-
-
- 24 May, 2012 9 commits
-
-
Danny Auble authored
so acct_policy_job_runnable will always return true.
-
Danny Auble authored
Signed-off-by: Danny Auble <da@schedmd.com>
-
Danny Auble authored
compiling with --enable-debug
-
Jon Bringhurst authored
The purpose of this is so moab scripts and commands (such as 'checkjob') have consistent access to the SUBMITHOST variable.
-
Nathan Yee authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
- 23 May, 2012 11 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
-
Don Lipari authored
Phil and I independently found a chunk of duplicate code that can be eliminated with no change to the functionality. This is against the 2.4 branch. Don
-
Morris Jette authored
Conflicts: src/slurmctld/reservation.c
-
Danny Auble authored
isn't up at the time the slurmctld starts, not running the priority/multifactor plugin, and then the database is started up later.
-
Morris Jette authored
-
- 22 May, 2012 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
-