1. 29 Mar, 2016 5 commits
  2. 28 Mar, 2016 5 commits
    • Danny Auble · 8ee976b4
    • When a stepd is about to shut down and send its response to srun · ea470f71
      Danny Auble authored
      Make the wait to return data take effect only beyond 500 nodes, and make
      the wait time configurable based on the TcpTimeout value.
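      A minimal sketch of the gating described above; the threshold constant
      and function names here are illustrative, not Slurm's actual
      identifiers:

        #include <stdbool.h>

        #define STEPD_WAIT_NODE_MIN 500    /* only wait beyond this node count */

        /* Decide whether stepd should delay its shutdown response to srun,
         * and for how long, based on step size and slurm.conf TcpTimeout. */
        static bool stepd_shutdown_wait(int node_cnt, int tcp_timeout,
                                        int *wait_secs)
        {
                if (node_cnt <= STEPD_WAIT_NODE_MIN)
                        return false;      /* small steps: respond at once */
                *wait_secs = tcp_timeout;  /* configurable via TcpTimeout */
                return true;
        }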
    • Merge branch 'slurm-15.08' · 2d70778b
      Morris Jette authored
    • task/cgroup - Fix task binding to CPUs bug · ddf6d9a4
      Morris Jette authored
      There was a subtle bug in how tasks were bound to CPUs which could result
      in an "infinite loop" error. The problem was that various socket/core/thread
      calculations were based upon the resources allocated to a step rather than
      all resources on the node, so rounding errors could occur. Consider for
      example a node with 2 sockets, 6 cores per socket and 2 threads per core.
      On the idle node, a job requesting 14 CPUs is submitted. That job would
      be allocated 4 cores on the first socket and 3 cores on the second socket.
      The old logic would get the number of sockets for the job at 2 and the
      number of cores at 7, then calculate the number of cores per socket at
      7/2 or 3 (rounding down to an integer). The logic laying out tasks
      would bind the first 3 cores on each socket to the job, then not find any
      remaining cores, report the "infinite loop" error to the user, and run
      the job without one of the expected cores. The problem gets even worse
      when there are already allocated cores on a node. In a more extreme case,
      a job might be allocated 6 cores on one socket and 1 core on a second
      socket. In that case, 3 of that job's cores would be unused.
      bug 2502
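      The rounding error is easy to reproduce in isolation; this self-contained
      example (not Slurm code) walks through the arithmetic from the message
      above:

        #include <stdio.h>

        int main(void)
        {
                /* Allocation from the example above: 4 cores on socket 0
                 * plus 3 cores on socket 1 of a 2-socket, 6-core-per-socket,
                 * 2-thread-per-core node. */
                int job_sockets = 2;
                int job_cores = 4 + 3;                          /* 7 cores */
                int cores_per_socket = job_cores / job_sockets; /* 7/2 == 3 */

                /* Binding 3 cores per socket covers only 6 of the 7
                 * allocated cores, so the layout logic came up one core
                 * short and reported the "infinite loop" error. */
                printf("derived cores per socket: %d\n", cores_per_socket);
                return 0;
        }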
    • Fix for srun signal handling threading problem · c8d36dba
      Morris Jette authored
      This is a revision to commit 1ed38f26
      The root problem is that a pthread is passed an argument which is
      a pointer to a variable on the stack. If that variable is over-written,
      the signal number received will be garbage, and that bad signal
      number will be interpreted by srun to possibly abort the request.
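      A generic illustration of that pthread pitfall and the usual fix,
      passing the signal number by value inside the pointer argument; this
      is not the actual srun code:

        #include <pthread.h>
        #include <stdint.h>
        #include <stdio.h>

        static void *_sig_thread(void *arg)
        {
                int signo = (int)(intptr_t)arg; /* value, not a pointer */
                printf("handling signal %d\n", signo);
                return NULL;
        }

        static void spawn_sig_thread(int signo)
        {
                pthread_t tid;

                /* Buggy pattern: passing &signo hands the thread a pointer
                 * into this stack frame, which may be overwritten before
                 * the thread runs. Casting the value through intptr_t
                 * avoids the dangling pointer entirely. */
                pthread_create(&tid, NULL, _sig_thread,
                               (void *)(intptr_t)signo);
                pthread_detach(tid);
        }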
  3. 26 Mar, 2016 5 commits
  4. 25 Mar, 2016 6 commits
  5. 24 Mar, 2016 4 commits
  6. 23 Mar, 2016 15 commits
    • Merge branch 'slurm-15.08' · 3028cfea
      Morris Jette authored
      Conflicts:
      	src/plugins/select/cons_res/job_test.c
    • gang scheduling bug fix · 5f1e78f6
      Morris Jette authored
      Fix gang scheduling resource selection bug which could prevent multiple
      jobs from being allocated the same resources. Bug was introduced in
      15.08.6, commit 44f491b8.
    • Tim Wickberg · 498624df
    • Cleanup Coverity errors from file_bcast work. · 54c9ac31
      Tim Wickberg authored
      Also ensure empty (0-length) files are handled properly.
      Remove a stray exit(1) call from _rpc_file_bcast() to avoid
      slurmd exiting on malformed data.
    • Danny Auble
    • task/cgroup: Fix for task binding anomaly · efa83a02
      Morris Jette authored
      Here's how to reproduce on smd-server, with 2 sockets, 6 cores per
      socket and 2 threads per core: just run the following command line
      3 times in quick succession (all active at the same time):
      srun --cpus-per-task=4 -m block sleep 30
      What was happening is the first job would be allocated cores 0+1,
      and the second job cores 2+3. The third job would test use of cores
      0-3, then stop because the job only needs 4 CPUs. The resulting core
      binding would include NO CPUs. The new logic tests that the core
      being considered for use actually has some resources available to
      the job before updating the counter which is being tested against
      the needed CPU counter.
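      A rough sketch of the kind of availability check described, assuming a
      simple per-core availability mask; identifiers here do not match the
      task/cgroup code:

        #include <stdbool.h>

        static int pick_cores(const bool *core_avail, int core_cnt,
                              int threads_per_core, int cpus_needed)
        {
                int cpus_found = 0, cores_used = 0;

                for (int c = 0; c < core_cnt && cpus_found < cpus_needed; c++) {
                        if (!core_avail[c])
                                continue;  /* the fix: skip cores with no
                                            * resources left for this job
                                            * before bumping the counter */
                        cpus_found += threads_per_core;
                        cores_used++;
                }
                return cores_used;
        }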
    • task/cgroup: Fix for task layout logic with disabled resources. · 6c14b969
      Morris Jette authored
      Specifically add the HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM flag when
      loading configuration from HWLOC library. Previous logic in
      task/cgroup did not do this, which was different behaviour from
      how slurmd gets configuration information. Here's the HWLOC
      documentation:
      HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM
      Detect the whole system, ignore reservations and offline settings.
      Gather all resources, even if some were disabled by the administrator.
      For instance, ignore Linux Cpusets and gather all processors and memory
      nodes, and ignore the fact that some resources may be offline.
      
      Without this flag, I would on rare occasions observe a bad core count,
      which resulted in the logic laying out tasks incorrectly and generating
      an error:
      task/cgroup: task[0] infinite loop broken while trying to provision compute elements using cyclic
      
      bug 2502
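      For reference, loading a topology with that flag looks like this with
      the hwloc 1.x API (a standalone example, not the task/cgroup code):

        #include <hwloc.h>
        #include <stdio.h>

        int main(void)
        {
                hwloc_topology_t topo;

                hwloc_topology_init(&topo);
                /* Gather every core, even ones hidden by cpusets or
                 * offlining, so counts match what slurmd reports. */
                hwloc_topology_set_flags(topo,
                                         HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM);
                hwloc_topology_load(topo);

                printf("cores visible: %d\n",
                       hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE));

                hwloc_topology_destroy(topo);
                return 0;
        }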
    • Danny Auble · e7f12058
    • Danny Auble · f1ef24e6
    • Revert "Fix expect tests that expect -lz for compilation." · 04c395f5
      Tim Wickberg authored
      With bcast split into its own directory, -lz should not be
      required throughout.
      
      This reverts commit e7981406.
    • Send file_size across as part of the RPC, will be needed for mmap. · ee826b96
      Tim Wickberg authored
      Remove unused struct and macro from file_bcast.h.
      
      Free file_bcast_info_t to prevent leak.
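      Knowing the total file size up front is what makes an mmap-based writer
      possible; a minimal sketch of that pattern follows (a hypothetical
      helper, not the eventual implementation):

        #include <string.h>
        #include <sys/mman.h>
        #include <sys/types.h>
        #include <unistd.h>

        /* Write one block at its offset in a destination file of known
         * size. A real implementation would ftruncate() and mmap() once
         * per file at registration time, not once per block. */
        static int write_block(int fd, off_t file_size, off_t offset,
                               const char *data, size_t len)
        {
                if (ftruncate(fd, file_size) < 0)
                        return -1;
                char *map = mmap(NULL, file_size, PROT_WRITE, MAP_SHARED,
                                 fd, 0);
                if (map == MAP_FAILED)
                        return -1;
                memcpy(map + offset, data, len); /* any offset works */
                return munmap(map, file_size);
        }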
    • Danny Auble · 7e2e8f88
    • Danny Auble · 2ab694cb
    • Restructure file_bcast mechanism to fork only on first block. · 7bac612c
      Tim Wickberg authored
      1) Add a new global file_bcast_list to store info on in-progress file
         transfers, cache FD there rather than reopening the file for every block.
      2) Restructure security mechanisms. First block will fork() and open the
         file, and pass the FD back to the thread (see the FD-passing sketch
         after this list). Thread then registers this file transfer in the
         file_bcast_list. Split fork() stuff into _file_bcast_register_file
         to keep _rpc_file_bcast readable.
      3) Successive blocks are handled within the thread. Security is handled by
         matching uid and file name to existing file transfer.
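      Passing an open FD from the forked child back to the thread is
      typically done with SCM_RIGHTS over a socketpair(2); a generic sketch
      of the sending side, not the Slurm code itself:

        #include <string.h>
        #include <sys/socket.h>
        #include <sys/uio.h>

        /* Send an open file descriptor across a UNIX-domain socket. */
        static int send_fd(int sock, int fd)
        {
                char byte = 0;
                union {                    /* aligned control buffer */
                        char buf[CMSG_SPACE(sizeof(int))];
                        struct cmsghdr align;
                } u;
                struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
                struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1,
                                      .msg_control = u.buf,
                                      .msg_controllen = sizeof(u.buf) };
                struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

                cmsg->cmsg_level = SOL_SOCKET;
                cmsg->cmsg_type = SCM_RIGHTS;
                cmsg->cmsg_len = CMSG_LEN(sizeof(int));
                memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));
                return (sendmsg(sock, &msg, 0) == 1) ? 0 : -1;
        }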
      
      TODO:
      
      1) Write transfer cleanup function to remove stalled transfers.
      2) Use mmap for file output.
      3) Allow for parallel block transfer. Current code assumes blocks will always
         arrive in order. Out of order blocks will result in corrupted output.
         (sbcast currently prevents this by requiring each message to be ack'd
         before continuing, but at a likely severe performance penalty.)
      4) Add stats on receive side.
    • Revert "Handle sbcast output within the RPC thread instead of fork()'ing." · 90206f27
      Tim Wickberg authored
      This reverts commit 8c8c3407488fe3f0a552d2359ef5b487330ee8ba.
      
      Thread-only isn't portable; we need to use fork() on the first block to
      ensure file security and containers are handled correctly.