- 21 Apr, 2016 7 commits
-
-
Morris Jette authored
Some portions of tests 21.30 and 21.34 failed with accounting enabled and the priority/basic plugin. These changes disable those portions of the tests as needed, based upon the configuration.
-
Brian Christiansen authored
-
Brian Christiansen authored
The basic plugin doesn't do a decay, so it just needs to remove all of the allocated minutes.
-
Brian Christiansen authored
-
Morris Jette authored
This adds some additional logic to the commit made to version 15.08, as needed for operation with version 16.04. Specifically, once a persistent burst buffer is created in version 16.04, the create flag is cleared to prevent attempts at duplicate buffer creation. A new "use" persistent burst buffer flag is added for our needs (indicating that a DataWarp "paths" operation is required). The first commit is 905ac850.
-
Morris Jette authored
burst_buffer/cray - Don't call Datawarp "paths" function if script includes only create or destroy of persistent burst buffer. Some versions of Datawarp software return an error for such scripts, causing the job to be held. bug 2624
-
Morris Jette authored
No change in any logic or definitions
-
- 20 Apr, 2016 7 commits
-
-
Morris Jette authored
-
Morris Jette authored
Without these time limits and without time limits on the partitions, the group usage limits become huge values and make validation of some qos/association limit tests confusing
-
Brian Christiansen authored
Bug 2601
-
Brian Christiansen authored
When using NO_NHC, the step's job ptr would be nulled out before signalling the tasks.
-
Janne Blomqvist authored
I noticed that the CpuFreqDef config option was only partially implemented. The value was parsed, but then never used. So I took the liberty of re-purposing it to mean sort of the opposite, namely the frequency governor to use when running a job step in case the job doesn't explicitly provide any --cpu-freq option. I also changed the default of the CpuFreqGovernors option to be "ondemand,performance", since ondemand isn't available with the intel_pstate driver. Otherwise the patch should be relatively straightforward and only changes a few minor things here and there.
-
Tim Wickberg authored
-
Tim Wickberg authored
a) setpgrp() swapped for the equivalent setpgid(0, 0). b) Define _GNU_SOURCE to unmask the getline function declaration in stdio.h.
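For illustration only, a minimal sketch of both points, not the code from this commit. The helper names become_group_leader() and read_first_line() are hypothetical. The sketch assumes a glibc-style libc, where getline() is declared in stdio.h and the feature-test macro must be defined before the first include.

```c
#define _GNU_SOURCE            /* must precede every #include to take effect */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make the calling process the leader of a new
 * process group. setpgid(0, 0) is the portable spelling of the
 * no-argument setpgrp(). Returns 0 on success, -1 on error. */
static int become_group_leader(void)
{
    return setpgid(0, 0);
}

/* Hypothetical helper: read the first line of a stream with getline().
 * On success *line points to a malloc'd buffer the caller must free.
 * Returns the number of bytes read, or -1 on EOF or error. */
static ssize_t read_first_line(FILE *fp, char **line)
{
    size_t cap = 0;

    *line = NULL;
    return getline(line, &cap, fp);
}
```

setpgid(0, 0) fails with EPERM when the caller is a session leader, so callers should still check the return value.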
-
- 16 Apr, 2016 1 commit
-
-
Morris Jette authored
The test was sensitive to whether the batch step started before the job was requeued. Depending upon timing, the batch step accounting record either appeared in the accounting records or did not. A sleep has been added after the job enters RUNNING state to make sure the batch step starts and an accounting record is generated for it.
-
- 15 Apr, 2016 21 commits
-
-
Morris Jette authored
Include test ID in the account name to better identify where vestigial accounts come from.
-
Brian Christiansen authored
Coverity reported: CID 93013: Error handling issues (CHECKED_RETURN) "read(int, void *, size_t)" returns the number of bytes read, but it is ignored. umask() is also not thread-safe.
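As a generic illustration of what a CHECKED_RETURN finding on read() asks for (this is not the fix applied in this commit), a hypothetical read_full() helper that checks every return value, retries on EINTR, and copes with short reads might look like:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes from fd into buf.
 * Returns the number of bytes actually read (less than len only at
 * end of file), or -1 on a read error other than EINTR. */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;          /* real error, errno is set */
        }
        if (n == 0)
            break;              /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```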
-
Thomas Hamel authored
While waiting for the HealthCheckProgram to succeed, slurmd can be stopped. The previous behavior introduced a delay of up to 10 seconds between the shutdown request and the actual shutdown. This patch removes this delay.
-
Tim Wickberg authored
Intentionally leave the key value fixed, rather than initialize it from /dev/urandom as is commonly recommended. Slurm does not rely on the hash function for any cryptographic functionality, and randomness would make debugging harder if the hash key changed on each start.
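A rough sketch of the trade-off described here, using keyed FNV-1a as a stand-in rather than the hash function Slurm actually uses. The key value and the names DEMO_HASH_KEY, keyed_fnv1a() and demo_hash() are hypothetical. With a compile-time key, the same input hashes to the same value on every start, which keeps bucket layouts reproducible while debugging.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed key, chosen arbitrarily for this sketch. */
#define DEMO_HASH_KEY UINT64_C(0x0123456789abcdef)

/* FNV-1a (64-bit) with the key mixed into the offset basis.
 * Not cryptographic; shown only to illustrate a keyed hash. */
static uint64_t keyed_fnv1a(const void *data, size_t len, uint64_t key)
{
    const unsigned char *p = data;
    uint64_t h = UINT64_C(0xcbf29ce484222325) ^ key;  /* offset basis ^ key */

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= UINT64_C(0x100000001b3);                 /* FNV prime */
    }
    return h;
}

/* Hash with the fixed key: identical results across process restarts. */
static uint64_t demo_hash(const void *data, size_t len)
{
    return keyed_fnv1a(data, len, DEMO_HASH_KEY);
}
```

Seeding the key from /dev/urandom instead would change every hash value on each start, which is the debugging cost the commit message refers to.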
-
Brian Christiansen authored
Found by clang.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
The job submit logic is not prepared to deal with deadline scheduling. If a job is submitted with a deadline, defer its scheduling to the main scheduling loop or backfill scheduler, which has logic to manage deadlines.
-
Marlys Konhke authored
As part of the setup activity prior to invoking the CCM prologue on Cray native Slurm systems, the job prolog_running value is incremented and the job_state is OR'd with JOB_CONFIGURING. After the CCM prologue completes, these field changes are removed. That setup activity allows the CCM prologue to complete before the job launch continues. If the slurmctld is shut down or killed while a CCM prologue is executing, those two job field changes can't be removed since slurmctld is no longer there. Clearing those field values is now handled during job recovery within the select/cray plugin select_p_job_init() procedure. If a job being recovered came from a CCM defined partition and if either of those two field values is still set as above, then the CCM prologue is run again. The CCM prologue handles being called more than once. The above field changes are then removed after this rerun CCM prologue completes. The CCM epilogue is not affected.
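The bookkeeping described above can be sketched roughly as follows. This is a simplified model, not the select/cray code: the struct, the flag value DEMO_JOB_CONFIGURING and all function names are hypothetical stand-ins for the real job record fields.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_JOB_CONFIGURING 0x4000u   /* hypothetical flag bit */

/* Hypothetical, reduced job record. */
struct demo_job {
    uint32_t job_state;       /* base state plus flag bits */
    uint16_t prolog_running;  /* count of prologues in flight */
    bool     ccm_partition;   /* job came from a CCM defined partition */
};

/* Setup before the CCM prologue is invoked. */
static void ccm_prologue_begin(struct demo_job *job)
{
    job->prolog_running++;
    job->job_state |= DEMO_JOB_CONFIGURING;
}

/* Undo the setup once the CCM prologue has completed. */
static void ccm_prologue_end(struct demo_job *job)
{
    if (job->prolog_running > 0)
        job->prolog_running--;
    job->job_state &= ~DEMO_JOB_CONFIGURING;
}

/* During job recovery: rerun the prologue if either marker survived
 * a controller shutdown. */
static bool ccm_needs_rerun(const struct demo_job *job)
{
    return job->ccm_partition &&
           (job->prolog_running > 0 ||
            (job->job_state & DEMO_JOB_CONFIGURING) != 0);
}
```

In this model, recovery would call ccm_needs_rerun(), run the prologue again if it returns true, and then call ccm_prologue_end() to clear both markers.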
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Marlys Konhke authored
-
Danny Auble authored
-
Marlys Konhke authored
-
Morris Jette authored
If a job was submitted with a deadline and no time_limit or min_time, but the system has a QOS MaxWall, the job's time_limit would be set to the QOS limit. Since there is no min_time specified, the QOS MaxWall would be treated as both the min and max time limit for the job and potentially make the deadline impossible to satisfy. Now we set the min_time to 1 minute if there is a deadline, but no time_limit or min_time.
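The new rule can be sketched as below. The struct, the LIMIT_UNSET sentinel and the function name are hypothetical and only mirror the fields named in the message; limits are in minutes.

```c
#include <stdint.h>
#include <time.h>

#define LIMIT_UNSET UINT32_MAX   /* hypothetical "not specified" sentinel */

/* Hypothetical, reduced view of a job's time-related fields. */
struct job_limits {
    time_t   deadline;    /* 0 means no deadline was requested */
    uint32_t time_limit;  /* minutes, or LIMIT_UNSET */
    uint32_t min_time;    /* minutes, or LIMIT_UNSET */
};

/* If the job has a deadline but neither time_limit nor min_time, give
 * it a 1 minute min_time so that a QOS MaxWall applied later acts only
 * as an upper bound, not as both minimum and maximum. */
static void apply_deadline_default(struct job_limits *job)
{
    if (job->deadline != 0 &&
        job->time_limit == LIMIT_UNSET &&
        job->min_time == LIMIT_UNSET)
        job->min_time = 1;
}
```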
-
Morris Jette authored
Also make sure the job is cancelled at the end of the test
-
Morris Jette authored
-
Morris Jette authored
Add TopologyParam option of "TopoOptional" to optimize network topology only for jobs requesting it. bug 2567
-
- 14 Apr, 2016 4 commits
-
-
Tim Wickberg authored
Time out stalled transfers and clean up related data structures. Default to waiting five minutes since the last update. Hook onto the registration/ping message type to trigger cleanup in a minimally invasive manner. While here, restructure certain functions to use list_* functions rather than iterate on the structures.
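A simplified sketch of the timeout idea, assuming each transfer record carries a last-update timestamp. It uses a plain array instead of Slurm's list_* functions, and the struct and function names are hypothetical. A periodic trigger, such as the ping handler mentioned above, would call the purge routine with the current time.

```c
#include <stddef.h>
#include <time.h>

#define STALL_TIMEOUT_SECS (5 * 60)   /* five minutes since last update */

/* Hypothetical transfer record. */
struct transfer {
    int    id;
    time_t last_update;   /* time of the most recent activity */
};

/* Drop every transfer idle for at least STALL_TIMEOUT_SECS, compacting
 * the array in place. Returns the number of transfers kept. */
static size_t purge_stalled(struct transfer *xfers, size_t count, time_t now)
{
    size_t kept = 0;

    for (size_t i = 0; i < count; i++) {
        if (difftime(now, xfers[i].last_update) < STALL_TIMEOUT_SECS)
            xfers[kept++] = xfers[i];   /* still live, keep it */
    }
    return kept;
}
```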
-
Tim Wickberg authored
Otherwise --mail-type=ALL will send an unexpected stage_out message back to the user. Bug 2541.
-
Tim Wickberg authored
Otherwise --mail-type=ALL will send an unexpected stage_out message back to the user. Bug 2541.
-
Morris Jette authored
-