- 14 Oct, 2011 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
Cray - Fix for srun.pl parsing to avoid adding spaces between option and argument (e.g. "-N2" parsed properly without changing to "-N 2").
-
- 11 Oct, 2011 2 commits
-
-
Morris Jette authored
Cray: Add support for job reservations with node IDs that are not in numeric order. Fix for Bugzilla #5.
-
jette authored
When an operator or account coordinator holds his own job, make it a User Hold rather than an Administrator Hold by default.
-
- 07 Oct, 2011 1 commit
-
-
Morris Jette authored
Prevent slurmctld from crashing with a divide by zero when configured with MaxMemPerCPU=0.
-
- 06 Oct, 2011 1 commit
-
-
Morris Jette authored
Add a node state flag of CLOUD and save/restore NodeAddr and NodeHostName information for nodes with a flag of CLOUD. Major update to elastic computing document.
-
- 05 Oct, 2011 3 commits
-
-
Morris Jette authored
-
Danny Auble authored
block happens correctly now.
-
Morris Jette authored
Add the ability to update a node's NodeAddr and NodeHostName with scontrol. Also enable setting a node's state to "future" using scontrol.
-
- 04 Oct, 2011 3 commits
-
-
Morris Jette authored
Major re-write of the CPU Management User and Administrator Guide (web page) by Martin Perry, Bull.
-
Morris Jette authored
If a job cannot run due to QOS or association limits, then do not cancel the job, but leave it pending in a system held state (priority = 1). The job will run when its limits or the QOS/association limits change. Based upon a patch by Phil Eckert (LLNL).
-
Danny Auble authored
booted block. -- pass 1, more work needs to be done.
-
- 03 Oct, 2011 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
Prevent an association from being deleted if it has any jobs in a running, pending or suspended state. Previous code prevented this only for running jobs.
-
- 30 Sep, 2011 2 commits
-
-
Morris Jette authored
Fix bugs in sched/backfill with respect to QOS reservation support and job time limits. Patch from Alejandro Lucero Palau (Barcelona Supercomputer Center).
-
Morris Jette authored
Fix to GRES allocation logic when resources are associated with specific CPUs on a node. Patch from Steve Trofinoff, CSCS.
-
- 29 Sep, 2011 5 commits
-
-
Danny Auble authored
(i.e. 1-9,0 instead of 0-9). The bug would cause 'sacct -N nodename' to not give correct results on these systems.
-
Danny Auble authored
is in an error state, won't deny jobs.
-
Danny Auble authored
-
Danny Auble authored
restarts of the slurmctld.
-
Danny Auble authored
admin sets the state to error.
-
- 28 Sep, 2011 4 commits
-
-
Morris Jette authored
Change default value of StateSaveLocation configuration parameter from /tmp to /var/spool, where the files are less likely to be purged.
-
Morris Jette authored
Modify slurmdbd.conf parsing to accept DebugLevel strings (quiet, fatal, info, etc.) in addition to numeric values. The parsing of slurm.conf was modified in the same fashion for SlurmctldDebug and SlurmdDebug values. The output of sview and "scontrol show config" was also modified to report those values as strings rather than numeric values.
-
Morris Jette authored
Do not treat the absence of a gres.conf file as a fatal error on systems configured with GRES, but set GRES counts to zero. These counts can be altered by node_config_load() in the gres plugin.
-
Danny Auble authored
-
- 27 Sep, 2011 3 commits
-
-
Morris Jette authored
Add the ability to reboot all compute nodes after they become idle. The RebootProgram configuration parameter must be set and an authorized user must execute the command "scontrol reboot_nodes". Patch from Andriy Grytsenko (Massive Solutions Limited).
-
Morris Jette authored
Interpret a reservation with Nodes=ALL and a Partition specification as reserving all nodes within the specified partition rather than all nodes on the system. Based upon patch by Phil Eckert (LLNL).
-
Morris Jette authored
Add a flag record to event triggers and add support for a flag value of "PERM" for permanent triggers, which are only removed when the slurmctld daemon is cold-started or the trigger is explicitly removed.
-
- 26 Sep, 2011 2 commits
-
-
Danny Auble authored
-
Morris Jette authored
Many cosmetic modifications to eliminate warning messages from the GCC version 4.6 compiler, mostly due to unused variables.
-
- 20 Sep, 2011 3 commits
-
-
Morris Jette authored
Permit an administrator to change a job's QOS to any value without validating that the job's owner has permission to use that QOS. Based upon patch by Phil Eckert (LLNL).
-
Morris Jette authored
Modify advance reservation to accept multiple specific block sizes rather than a single node count. This is very important for BlueGene systems.
-
Morris Jette authored
Modify srun's SIGINT handling logic timer (two SIGINTs within one second) to be based on a microsecond rather than a second timer.
-
- 19 Sep, 2011 1 commit
-
-
Danny Auble authored
-
- 15 Sep, 2011 2 commits
-
-
Morris Jette authored
Avoid clearing a job's reason from JobHeldAdmin or JobHeldUser when it is otherwise updated using scontrol or sview commands. Patch based upon work by Phil Eckert (LLNL).
-
Morris Jette authored
Do not remove the backup slurmctld's pid file when it assumes control, only when it actually shuts down. Patch from Andriy Grytsenko (Massive Solutions Limited).
-
- 14 Sep, 2011 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
variable wasn't initialized in the job structure making it so that job wouldn't run.
-
Danny Auble authored
-
- 13 Sep, 2011 1 commit
-
-
Danny Auble authored
-