- 30 Nov, 2016 3 commits
-
-
Morris Jette authored
cray/burst_buffer - Increase time to synchronize operations between threads from 5 to 60 seconds ("setup" operation time observed over 17 seconds). This should fix a race condition between a thread performing a buffer creation (setup) and a thread looking for unexpected buffers. If a buffer is found during the time window allowed for creation, its space will be counted twice: first by the status checking thread and second by the thread doing the creation. The deallocation only happens once, so the used space information can be left with an invalid value. Bug 3295.
-
Tim Wickberg authored
Never used and is uninitialized, making backtraces more confusing. Fix whitespace in bcast_parameters struct while here. No functional change.
-
Tim Wickberg authored
static variable means multiple active decompression streams will corrupt zlib's internal state, which can lead to a segfault. Bug 3299.
-
- 29 Nov, 2016 4 commits
-
-
Alejandro Sanchez authored
On a reconfig, the exc_node_bitmap is cleared but was then not rebuilt, because last_work_scan was declared as a local static variable in _do_power_work(). The fix is to make it global within the plugin and reinitialize it to 0 in _init_power_config(). Bug 3078.
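A minimal sketch of the pattern described above, not the plugin's actual code. All names ending in _sketch, the exc_bitmap_built flag, and the "rebuild when last_work_scan is 0" trigger are assumptions made for illustration. It shows why the timestamp has to live at file scope: the init routine can only reset it if it can see it.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* File scope rather than a local static, so the init routine can reset it.
 * Both variables are hypothetical stand-ins for the plugin's state. */
static time_t last_work_scan = 0;
static bool exc_bitmap_built = false;

/* Called at startup and again on every reconfig. */
void init_power_config_sketch(void)
{
	exc_bitmap_built = false;	/* reconfig clears the bitmap */
	last_work_scan = 0;		/* force a rebuild on the next pass */
}

bool exc_bitmap_ready_sketch(void)
{
	return exc_bitmap_built;
}

/* Returns true if this pass rebuilt the bitmap. */
bool do_power_work_sketch(time_t now)
{
	bool rebuilt = false;

	assert(now > 0);
	if (last_work_scan == 0) {
		exc_bitmap_built = true;	/* stand-in for the real rebuild */
		rebuilt = true;
	}
	last_work_scan = now;
	return rebuilt;
}
```

With a local static inside do_power_work_sketch(), the second call to init_power_config_sketch() could not reset the timestamp, and the bitmap would stay cleared after a reconfig.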
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
- 28 Nov, 2016 4 commits
-
-
Alejandro Sanchez authored
-
Tim Wickberg authored
-
Dominik Bartkiewicz authored
Bug 3267.
-
Dominik Bartkiewicz authored
Termination can race against step creation if, e.g., ill-behaved SPANK plugins are in use. Bug 3248.
-
- 23 Nov, 2016 1 commit
-
-
Danny Auble authored
-
- 22 Nov, 2016 5 commits
-
-
Morris Jette authored
sched/backfill plugin: Make malloc match data type (defined as uint32_t and allocated as int). No failures were observed, but if type "int" were smaller than "uint32_t", it could result in an invalid memory reference.
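One common way to keep an allocation tied to its declared type is to size it from the pointer itself. The sketch below uses plain calloc() and a hypothetical function name; it is not the backfill plugin's code, which uses Slurm's own allocation wrappers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate n zeroed uint32_t counters. */
uint32_t *alloc_counts_sketch(size_t n)
{
	uint32_t *counts;

	assert(n > 0);
	/* sizeof(*counts) follows the declared type; sizeof(int) would not. */
	counts = calloc(n, sizeof(*counts));
	return counts;	/* NULL on allocation failure */
}
```

If the element type is changed later, the allocation size changes with it and no separate edit is needed.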
-
Sergey Meirovich authored
Fix API call: slurm_job_cpus_allocated_str_on_node_id() and in turn slurm_job_cpus_allocated_str_on_node() to return correct results for any node other than the first. This was caused by missing logic to calculate the first bit that belongs to a particular node. The lookup was always starting from bit 0. Bug 3266.
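The offset calculation can be illustrated with a simplified model in which each node owns a contiguous run of bits, one per CPU. The function name and the flat cpus_per_node array are assumptions for this sketch. The real job resources structure is more involved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the first bit belonging to node_id,
 * assuming node i owns cpus_per_node[i] consecutive bits. */
int first_bit_for_node_sketch(const uint16_t *cpus_per_node,
			      int node_cnt, int node_id)
{
	int bit = 0;

	assert(node_id >= 0 && node_id < node_cnt);
	/* Skip the bits owned by every node that precedes node_id. */
	for (int i = 0; i < node_id; i++)
		bit += cpus_per_node[i];
	return bit;
}
```

Starting every lookup at bit 0 is only correct for node 0, which matches the symptom described above.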
-
Morris Jette authored
After one second of wall time, simulate the termination of all remaining running jobs in order to respond in a reasonable time frame. bug 3275
-
Morris Jette authored
Modify backfill algorithm to improve performance with large numbers of running jobs. Group running jobs that end in a "similar" time frame using a time window that grows exponentially rather than linearly. The original window sizes were (in minutes): 0, 1, 2, 3, 4, 5, 6, 7, ... The new window sizes are (in minutes): 0.5, 1, 2, 4, 8, 16, 32, ... This can dramatically reduce the number of instances where the very time-consuming "can the pending job run now" operation is executed, especially if there are 1000+ running jobs. Bug 3275.
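To give a rough feel for the difference, the sketch below generates both window sequences and counts how many windows are needed to span a given horizon. Treating the listed sizes as consecutive widths laid end to end is an assumption of this sketch, as are all function names. It is not the scheduler's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* New scheme: 0.5, 1, 2, 4, ... minutes, expressed in seconds. */
uint32_t exp_window_secs(unsigned i)
{
	if (i > 20)	/* clamp so the shift cannot overflow 32 bits */
		i = 20;
	return 30u << i;
}

/* Original scheme: 0, 1, 2, 3, ... minutes, expressed in seconds. */
uint32_t linear_window_secs(unsigned i)
{
	return 60u * i;
}

/* Number of exponential windows needed to cover horizon_secs. */
unsigned exp_windows_needed(uint32_t horizon_secs)
{
	uint64_t covered = 0;
	unsigned n = 0;

	assert(horizon_secs > 0);
	while (covered < horizon_secs)
		covered += exp_window_secs(n++);
	return n;
}

/* Number of linear windows needed to cover horizon_secs. */
unsigned linear_windows_needed(uint32_t horizon_secs)
{
	uint64_t covered = 0;
	unsigned n = 0;

	assert(horizon_secs > 0);
	while (covered < horizon_secs)
		covered += linear_window_secs(n++);
	return n;
}
```

Under these assumptions a one-hour horizon takes 7 exponential windows against 12 linear ones, and the gap widens as the horizon grows.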
-
Nicolas Joly authored
-
- 20 Nov, 2016 1 commit
-
-
Morris Jette authored
-
- 15 Nov, 2016 1 commit
-
-
Tim Wickberg authored
Prevent a scrollbar from appearing on the SchedMD logo in the top left.
-
- 14 Nov, 2016 5 commits
-
-
Morris Jette authored
If a node is booting for some job, don't allocate additional jobs to the node until the boot completes. Bug 3256.
-
Danny Auble authored
-
Danny Auble authored
doing an upgrade. It isn't advised. Do one then the other. Basically if you are using the mysql plugin make sure you add the cluster to the system as the mysql plugin doesn't do that explicitly. Bug 3131
-
Brian Christiansen authored
-
Danny Auble authored
and you wouldn't be able to read anything after the cut.
-
- 13 Nov, 2016 2 commits
-
-
Alejandro Sanchez authored
Found with valgrind. Bug 2846.
-
Danny Auble authored
-
- 12 Nov, 2016 1 commit
-
-
Danny Auble authored
-
- 11 Nov, 2016 13 commits
-
-
Morris Jette authored
Move where we set the configuration table bitmaps in order to support the backup slurmctld starting and recovering previously saved KNL mode information (which can necessitate rebuilding the node configuration table). bug 3241
-
Danny Auble authored
-
Tim Wickberg authored
Bug 3255.
-
Tim Wickberg authored
-
Morris Jette authored
-
Alejandro Sanchez authored
No functional change. Bug 3237.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
Page needs to display properly even if the CDN gets rid of this, or for machines with no internet connectivity. Fix a few minor style issues, and remove some unused javascript.
-
Tim Wickberg authored
Change header and footer over to new design, switch around the css files, and adjust the build system to match. Design by Grant Zabriskie.
-
Tim Wickberg authored
Stop bundling the file in docs (will be preserved on SchedMD site, and is in git), and unlink from build.
-
Tim Wickberg authored
-
Tim Wickberg authored
-