- 20 Apr, 2016 1 commit
-
-
Tim Wickberg authored
a) setpgrp() swapped for the equivalent setpgid(0, 0). b) Define _GNU_SOURCE to unmask the getline() function declaration in stdio.h.
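A minimal sketch of both changes, not the actual Slurm code: the helper names `become_group_leader` and `count_lines` are made up for illustration. It assumes a glibc-style libc where `getline()` is only declared once a feature test macro such as `_GNU_SOURCE` is defined ahead of the first include.

```c
#define _GNU_SOURCE /* must come before any #include to expose getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put the caller in its own process group.
 * setpgid(0, 0) is the POSIX spelling of the no-argument setpgrp(). */
int become_group_leader(void)
{
    return setpgid(0, 0);
}

/* Hypothetical helper: count the lines in a stream using getline().
 * Returns -1 if the stream is NULL. */
long count_lines(FILE *fp)
{
    char *line = NULL; /* getline() allocates and grows this buffer */
    size_t cap = 0;
    long n = 0;

    if (!fp)
        return -1;
    while (getline(&line, &cap, fp) != -1)
        n++;
    free(line);
    return n;
}
```

Using `setpgid(0, 0)` sidesteps the historical difference between the System V and BSD prototypes of `setpgrp()`.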
-
- 16 Apr, 2016 1 commit
-
-
Morris Jette authored
The test was sensitive to whether a batch step started before the job was requeued. Depending on timing, the batch step accounting record either appeared in the accounting records or did not. A sleep has been added after the job enters the RUNNING state to make sure the batch step starts and an accounting record is generated for it.
-
- 15 Apr, 2016 21 commits
-
-
Morris Jette authored
Include test ID in the account name to better identify where vestigial accounts come from.
-
Brian Christiansen authored
Coverity reported: CID 93013: Error handling issues (CHECKED_RETURN) "read(int, void *, size_t)" returns the number of bytes read, but it is ignored. umask() is also not thread-safe.
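The usual remedy for a CHECKED_RETURN finding on read() is to act on the return value. The following is a generic sketch of that pattern, not the patch itself; `read_full` is a hypothetical helper name.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read up to len bytes, retrying short reads.
 * Returns the number of bytes read (less than len only at EOF),
 * or -1 on error with errno set by read(). */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, (char *)buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal, retry */
            return -1;
        }
        if (n == 0)
            break; /* end of file */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

A caller can then compare the result against the size it expected and treat any mismatch as an error.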
-
Thomas Hamel authored
While waiting for the HealthCheckProgram to succeed, slurmd can be stopped. The previous behavior introduced a delay of up to 10 seconds between the shutdown request and the actual shutdown. This patch removes that delay.
-
Tim Wickberg authored
Intentionally leave the key value fixed, rather than initializing it from /dev/urandom as is commonly recommended. Slurm does not rely on the hash function for any cryptographic functionality, and randomness would make debugging harder if the hash key changed on each start.
-
Brian Christiansen authored
Found by clang.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
The job submit logic is not prepared to deal with deadline scheduling. If a job is submitted with a deadline, defer its scheduling to the main scheduling loop or backfill scheduler, which has logic to manage deadlines.
-
Marlys Konhke authored
As part of the setup activity prior to invoking the CCM prologue on Cray native Slurm systems, the job prolog_running value is incremented and the job_state is OR'd with JOB_CONFIGURING. After the CCM prologue completes, these field changes are removed. That setup activity allows the CCM prologue to complete before the job launch continues. If slurmctld is shut down or killed while a CCM prologue is executing, those two job field changes can't be removed since slurmctld is no longer there. Clearing those field values is now handled during job recovery within the select/cray plugin select_p_job_init() procedure. If a job being recovered came from a CCM defined partition and either of those two field values is still set as above, then the CCM prologue is run again. The CCM prologue handles being called more than once. The above field changes are then removed after this rerun CCM prologue completes. The CCM epilogue is not affected.
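The field bookkeeping described above can be sketched roughly as follows. This is illustrative only: the struct, the function names, and the flag value are assumptions, not the select/cray plugin's real definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag bit for illustration; see slurm.h for the real JOB_CONFIGURING. */
#define SKETCH_JOB_CONFIGURING 0x4000u

/* Hypothetical, cut-down job record. */
struct job_sketch {
    uint32_t job_state;      /* base state plus flag bits */
    uint16_t prolog_running; /* count of prologs in flight */
    bool ccm_partition;      /* job came from a CCM defined partition */
};

/* Setup before the CCM prologue is invoked. */
void ccm_prolog_begin(struct job_sketch *job)
{
    job->prolog_running++;
    job->job_state |= SKETCH_JOB_CONFIGURING;
}

/* Undo the setup once the CCM prologue completes. */
void ccm_prolog_end(struct job_sketch *job)
{
    if (job->prolog_running > 0)
        job->prolog_running--;
    job->job_state &= ~SKETCH_JOB_CONFIGURING;
}

/* At job recovery: rerun the prologue if either marker is still set. */
bool ccm_needs_rerun(const struct job_sketch *job)
{
    return job->ccm_partition &&
           (job->prolog_running > 0 ||
            (job->job_state & SKETCH_JOB_CONFIGURING) != 0);
}
```

Because the prologue tolerates being called more than once, the recovery path can rerun it and then call the same cleanup as the normal path.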
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Marlys Konhke authored
-
Danny Auble authored
-
Marlys Konhke authored
-
Morris Jette authored
If a job was submitted with a deadline and no time_limit or min_time, but the system has a QOS MaxWall, the job's time_limit would be set to the QOS limit. Since there is no min_time specified, the QOS MaxWall would be treated as both the minimum and maximum time limit for the job, potentially making the deadline impossible to satisfy. Now we set the min_time to 1 minute if there is a deadline but no time_limit or min_time.
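A rough sketch of the rule, under stated assumptions: the struct, the `TIME_UNSET` sentinel, and both function names are hypothetical stand-ins, and times are in minutes.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical sentinel meaning "the user did not specify this value". */
#define TIME_UNSET 0xfffffffeu

/* Hypothetical, cut-down view of a job's time constraints. */
struct job_limits_sketch {
    time_t deadline;     /* 0 means no deadline */
    uint32_t time_limit; /* maximum run time */
    uint32_t time_min;   /* minimum acceptable run time */
};

/* Deadline given but neither limit set: default the minimum to 1 minute. */
void apply_deadline_default(struct job_limits_sketch *job)
{
    if (job->deadline != 0 &&
        job->time_limit == TIME_UNSET &&
        job->time_min == TIME_UNSET)
        job->time_min = 1;
}

/* A QOS MaxWall fills in a missing time_limit. */
void apply_qos_max_wall(struct job_limits_sketch *job, uint32_t max_wall)
{
    if (job->time_limit == TIME_UNSET)
        job->time_limit = max_wall;
}
```

With the default applied first, the scheduler is left a window between 1 minute and the QOS MaxWall in which to fit the job before its deadline.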
-
Morris Jette authored
Also make sure the job is cancelled at the end of the test.
-
Morris Jette authored
-
Morris Jette authored
Add TopologyParam option of "TopoOptional" to optimize network topology only for jobs requesting it. bug 2567
-
- 14 Apr, 2016 17 commits
-
-
Tim Wickberg authored
Time out stalled transfers and clean up related data structures. Default to waiting five minutes since the last update. Hook onto the registration/ping message type to trigger cleanup in a minimally invasive manner. While here, restructure certain functions to use list_* functions rather than iterating over the structures.
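The timeout check might look something like the sketch below. It uses a plain array rather than Slurm's list_* API, and the struct, constant, and function names are assumptions made for illustration.

```c
#include <stddef.h>
#include <time.h>

/* Five minutes since the last update, per the default described above. */
#define STALL_TIMEOUT_SECS (5 * 60)

/* Hypothetical record for one in-flight transfer. */
struct transfer_sketch {
    int id;
    time_t last_update; /* when data last arrived */
};

/* Compact the array in place, dropping transfers that have been idle
 * longer than the timeout. Returns the number of entries kept. */
size_t purge_stalled(struct transfer_sketch *t, size_t count, time_t now)
{
    size_t kept = 0;

    for (size_t i = 0; i < count; i++) {
        if (difftime(now, t[i].last_update) <= STALL_TIMEOUT_SECS)
            t[kept++] = t[i];
    }
    return kept;
}
```

Calling such a routine from a message handler that already runs periodically, such as a ping, avoids adding a dedicated cleanup thread.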
-
Tim Wickberg authored
Otherwise --mail-type=ALL will send an unexpected stage_out message back to the user. Bug 2541.
-
Tim Wickberg authored
Otherwise --mail-type=ALL will send an unexpected stage_out message back to the user. Bug 2541.
-
Morris Jette authored
-
Janne Blomqvist authored
SipHash is a state-of-the-art keyed hash function that is performance competitive with the usual non-cryptographic hash functions. It's used as the default hash function backing hash tables in e.g. Perl, Python, and Rust. Here we initially use it for the gid cache hash table and in the common xhash implementation.
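For orientation, here is a compact sketch of SipHash-2-4 (two compression rounds per block, four finalization rounds) following the published algorithm. It is not the implementation added to Slurm; the function name `siphash24` and its signature are chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define ROTL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* One SipRound over the four state words v0..v3 in scope. */
#define SIPROUND do { \
    v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32); \
    v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2; \
    v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0; \
    v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32); \
} while (0)

/* Load 8 bytes as a little-endian 64-bit word. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* SipHash-2-4 of len bytes at data under a 128-bit key. */
uint64_t siphash24(const uint8_t key[16], const void *data, size_t len)
{
    const uint8_t *in = data;
    uint64_t k0 = load_le64(key), k1 = load_le64(key + 8);
    uint64_t v0 = k0 ^ 0x736f6d6570736575ULL;
    uint64_t v1 = k1 ^ 0x646f72616e646f6dULL;
    uint64_t v2 = k0 ^ 0x6c7967656e657261ULL;
    uint64_t v3 = k1 ^ 0x7465646279746573ULL;
    uint64_t b = (uint64_t)len << 56; /* final block carries the length */
    size_t full = len - (len % 8), i;

    for (i = 0; i < full; i += 8) { /* whole 8-byte blocks */
        uint64_t m = load_le64(in + i);
        v3 ^= m;
        SIPROUND; SIPROUND;
        v0 ^= m;
    }
    for (i = full; i < len; i++) /* trailing 0..7 bytes */
        b |= (uint64_t)in[i] << (8 * (i - full));
    v3 ^= b;
    SIPROUND; SIPROUND;
    v0 ^= b;

    v2 ^= 0xff; /* finalization */
    SIPROUND; SIPROUND; SIPROUND; SIPROUND;
    return v0 ^ v1 ^ v2 ^ v3;
}
```

With the reference key bytes 00 through 0f and the 15-byte message 00 through 0e, this should produce a129ca6149be45e5, the test value given in the SipHash specification. A hash table would typically reduce the 64-bit result modulo its bucket count.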
-
Jean-Philippe Aumasson authored
-
Danny Auble authored
sacctmgr list events
-
Tim Wickberg authored
step_ptr->job_ptr has already been dereferenced several times by this point, so the NULL check is unnecessary here.
-
Brian Christiansen authored
-
Morris Jette authored
Conflicts: NEWS src/plugins/accounting_storage/mysql/as_mysql_resv.c
-
Morris Jette authored
If a job fails stage in, set its reason to BurstBufferOperation with a string describing what happened. Previously the reason was set to AdminHeld on stage-in failure.
-
Brian Christiansen authored
For commits: f980c588 510abf23
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
MB for memory and bb.
-
Brian Christiansen authored
Bug 1783
-