- 29 Mar, 2016 4 commits
-
-
Morris Jette authored
-
Danny Auble authored
launching a job; instead, the job will fail and drain the node if the environment isn't loaded normally. Bug 2546
-
Morris Jette authored
Some portions of the reservation test assume the socket/core/thread counts are homogeneous and fail otherwise. Bug 2597 added to address the root cause of the failure at a later time.
-
Brian Christiansen authored
Argument with 'nonnull' attribute passed null.
-
- 28 Mar, 2016 5 commits
-
-
Danny Auble authored
with the slurmdbd.
-
Danny Auble authored
Make the wait to return data only take effect after 500 nodes, and make it configurable based on the TcpTimeout value.
-
Morris Jette authored
-
Morris Jette authored
There was a subtle bug in how tasks were bound to CPUs which could result in an "infinite loop" error. The problem was that various socket/core/thread calculations were based upon the resources allocated to a step rather than all resources on the node, and rounding errors could occur. Consider for example a node with 2 sockets, 6 cores per socket and 2 threads per core. On the idle node, a job requesting 14 CPUs is submitted. That job would be allocated 4 cores on the first socket and 3 cores on the second socket. The old logic would get the number of sockets for the job at 2 and the number of cores at 7, then calculate the number of cores per socket at 7/2 or 3 (rounding down to an integer). The logic laying out tasks would bind the first 3 cores on each socket to the job, then not find any remaining cores, report the "infinite loop" error to the user, and run the job without one of the expected cores. The problem gets even worse when there are some allocated cores on a node. In a more extreme case, a job might be allocated 6 cores on one socket and 1 core on a second socket; in that case, 3 of that job's cores would be unused. Bug 2502
-
Morris Jette authored
This is a revision to commit 1ed38f26. The root problem is that a pthread is passed an argument which is a pointer to a variable on the stack. If that variable is overwritten, the signal number received will be garbage, and that bad signal number will be interpreted by srun to possibly abort the request.
-
- 26 Mar, 2016 5 commits
-
-
Morris Jette authored
-
Morris Jette authored
This fixes tests when a cluster's node name format includes nodes with numeric suffixes and those without.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
The previous commit obviously fixed a problem, but introduced a different set of problems. This will be pursued later, perhaps in version 16.05.
-
- 25 Mar, 2016 6 commits
-
-
Morris Jette authored
With some configurations and systems, errors of the following sort were occurring: "task/cgroup: task[1] infinite loop broken while trying to provision compute elements using block" and "task/cgroup: task[1] unable to set taskset '0x0'".
-
Nathan Yee authored
Bug 1706
-
Morris Jette authored
-
Nathan Yee authored
bug 2070
-
Morris Jette authored
-
Morris Jette authored
burst_buffer/cray - If the pre-run operation fails then don't issue duplicate job cancel/requeue unless the job is still in run state. Prevents jobs hung in COMPLETING state. bug 2587
-
- 24 Mar, 2016 4 commits
-
-
Morris Jette authored
Running "scontrol reconfig" releases resources for jobs waiting for the completion of Node Health Check so that other jobs can run. Cray says to always wait for NHC to complete, but in extreme cases that can be 2 hours, during which the entire resource allocation for a job may be unusable. Per advice from NERSC, the logic to release resources is unchanged, but logging is added here.
-
Danny Auble authored
isn't kept up to date in the cache.
-
Danny Auble authored
-
Danny Auble authored
as well.
-
- 23 Mar, 2016 16 commits
-
-
Morris Jette authored
Conflicts: src/plugins/select/cons_res/job_test.c
-
Morris Jette authored
Fix gang scheduling resource selection bug which could prevent multiple jobs from being allocated the same resources. Bug was introduced in 15.08.6, commit 44f491b8
-
Tim Wickberg authored
-
Tim Wickberg authored
Also ensure empty (0-length) files are handled properly. Remove a stray exit(1) call from _rpc_file_bcast() to avoid slurmd exiting on malformed data.
-
Danny Auble authored
-
Morris Jette authored
Here's how to reproduce on smd-server with 2 sockets, 6 cores per socket and 2 threads per core: just run the following command line 3 times in quick succession (all active at the same time): srun --cpus-per-task=4 -m block sleep 30. What was happening is the first job would be allocated cores 0+1 and the second job cores 2+3. The third job would test use of cores 0-3, then exit because the job only needs 4 CPUs, and the resulting core binding would include NO CPUs. The new logic tests that the core being considered for use actually has some resources available to the job before updating the counter which is being tested against the needed CPU counter.
-
Morris Jette authored
Specifically, add the HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM flag when loading configuration from the HWLOC library. Previous logic in task/cgroup did not do this, which was different behaviour from how slurmd gets configuration information. Here's the HWLOC documentation for HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM: "Detect the whole system, ignore reservations and offline settings. Gather all resources, even if some were disabled by the administrator. For instance, ignore Linux Cpusets and gather all processors and memory nodes, and ignore the fact that some resources may be offline." Without this flag, a bad core count was occasionally observed, which resulted in the logic laying out tasks wrong and generating an error: "task/cgroup: task[0] infinite loop broken while trying to provision compute elements using cyclic". Bug 2502
-
Danny Auble authored
-
Danny Auble authored
-
Tim Wickberg authored
With bcast split into its own directory -lz should not be required throughout. This reverts commit e7981406.
-
Tim Wickberg authored
Remove unused struct and macro from file_bcast.h. Free file_bcast_info_t to prevent leak.
-
Danny Auble authored
-
Danny Auble authored
-
Tim Wickberg authored
1) Add a new global file_bcast_list to store info on in-progress file transfers; cache the FD there rather than reopening the file for every block.
2) Restructure security mechanisms. The first block will fork() and open the file, and pass the FD back to the thread. The thread then registers this file transfer in the file_bcast_list. Split the fork() code into _file_bcast_register_file to keep _rpc_file_bcast readable.
3) Successive blocks are handled within the thread. Security is handled by matching the uid and file name to an existing file transfer.
TODO:
1) Write a transfer cleanup function to remove stalled transfers.
2) Use mmap for file output.
3) Allow for parallel block transfer. Current code assumes blocks will always arrive in order; out-of-order blocks will result in corrupted output. (sbcast currently prevents this by requiring each message to be ack'd before continuing, but at a likely severe performance penalty.)
4) Add stats on the receive side.
-
Tim Wickberg authored
This reverts commit 8c8c3407488fe3f0a552d2359ef5b487330ee8ba. Thread-only isn't portable, need to use fork() on first block to ensure file security and containers are handled correctly.
-
Tim Wickberg authored
-