Commits · 6ca60e71c027d28807e5ead17b548e405c229c87 · Manuel G. Marciani / ces_slurm_simulator

04 Sep, 2013 5 commits

Morris Jette authored Sep 04, 2013

For a system with hyperthreads and GRES bound to specific CPUs,
there was an error in assigning CPUs

6ca60e71

Fix for job requesting GRES on node that lacks any GRES · 775b6598

Morris Jette authored Sep 04, 2013

Previous logic would allocate job resources on such a node
rather than reject the job allocation on that node.

775b6598

Improve GRES support for CPU topology · 6f50943c

Morris Jette authored Sep 04, 2013

Previous logic would pick CPUs, then reject jobs that cannot match GRES
to the allocated CPUs. New logic first filters out CPUs that cannot use
the GRES, next picks CPUs for the job, and finally picks the GRES that
best match those CPUs.
bug 410

6f50943c
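
The three-step ordering described in 6f50943c can be sketched as below. This is a minimal illustration and not the select/cons_res code: the `gres_dev_t` type, the 64-bit CPU masks, and the function names `filter_cpus`, `pick_cpus`, and `pick_gres` are all hypothetical, and "best match" is assumed here to mean the largest overlap between a device's CPU mask and the picked CPUs.

```c
#include <stdint.h>

/* Hypothetical model: a GRES device is usable only from the CPUs in cpu_mask. */
typedef struct {
    uint64_t cpu_mask;
} gres_dev_t;

/* Count the set bits in x. */
static int popcount64(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Step 1: keep only the available CPUs that can use at least one GRES device. */
uint64_t filter_cpus(uint64_t avail_cpus, const gres_dev_t *devs, int ndevs)
{
    uint64_t usable = 0;
    for (int i = 0; i < ndevs; i++)
        usable |= devs[i].cpu_mask;
    return avail_cpus & usable;
}

/* Step 2: pick the `want` lowest-numbered CPUs from the filtered set.
 * Returns 0 if fewer than `want` CPUs are available. */
uint64_t pick_cpus(uint64_t cpus, int want)
{
    uint64_t picked = 0;
    for (int bit = 0; bit < 64 && want > 0; bit++) {
        uint64_t m = UINT64_C(1) << bit;
        if (cpus & m) {
            picked |= m;
            want--;
        }
    }
    return (want == 0) ? picked : 0;
}

/* Step 3: pick the device whose CPU mask overlaps the picked CPUs the most.
 * Returns the device index, or -1 if no device overlaps at all. */
int pick_gres(uint64_t picked, const gres_dev_t *devs, int ndevs)
{
    int best = -1, best_overlap = 0;
    for (int i = 0; i < ndevs; i++) {
        int overlap = popcount64(picked & devs[i].cpu_mask);
        if (overlap > best_overlap) {
            best = i;
            best_overlap = overlap;
        }
    }
    return best;
}
```

Filtering before picking means a job is never handed CPUs that cannot reach any GRES, which is the rejection case the previous ordering ran into.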

select/cons_res: minor improvement in logging · 0f8ab79f

Morris Jette authored Sep 04, 2013

0f8ab79f

Rename variable bitmap to node_bitmap · de988d47

Morris Jette authored Sep 03, 2013

No change in logic, just rename a variable for greater clarity

de988d47

03 Sep, 2013 3 commits
- jobcomp/filetxt - Streamline the code · 25f78058
  Morris Jette authored Sep 03, 2013
```
Eliminate extra variable and avoid changing the contents of a string
Minor refactoring of commit fa0af103
```
  25f78058
- Fix some possibly NULL pointers reported by CLANG · fa0af103
  Morris Jette authored Sep 03, 2013
```
None of these have ever occurred, but these changes will harden Slurm
bug 403
```
  fa0af103
- Corrections to SUG13 agenda · 2cfe6f2b
  Morris Jette authored Sep 03, 2013
  
  2cfe6f2b
30 Aug, 2013 2 commits
- Validate permissions of key directories at slurmctld startup · 368671b5
  Morris Jette authored Aug 29, 2013
```
Report anything that is world writable.
```
  368671b5
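
A world-writable check of the kind 368671b5 describes can be sketched with `stat()` and the `S_IWOTH` mode bit. This is an illustration only: the function name `is_world_writable` is hypothetical, and the sketch does not show which directories slurmctld actually inspects or how it reports them.

```c
#include <sys/stat.h>

/* Returns 1 if `path` is world writable, 0 if it is not,
 * and -1 if the path cannot be examined with stat(). */
int is_world_writable(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (st.st_mode & S_IWOTH) ? 1 : 0;
}
```

A caller could run this over each configured directory at startup and log every path for which it returns 1.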
- Quickstart admin guide: describe file permission better · 1434e79e
  Morris Jette authored Aug 29, 2013
```
Directories created for executables, libraries, etc. have permissions
based upon the umask, which may not be desired.
```
  1434e79e
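
The umask dependence noted in 1434e79e comes from the standard POSIX rule that a newly created file or directory gets the requested mode with the umask bits cleared. A small sketch of that arithmetic follows; the helper name `effective_mode` is hypothetical.

```c
#include <sys/types.h>

/* Permission bits a new directory ends up with when it is created
 * with mode `requested` under the process umask `mask`. */
mode_t effective_mode(mode_t requested, mode_t mask)
{
    return requested & ~mask & 07777;
}
```

For example, a directory requested as 0777 becomes 0755 under a umask of 0022 but 0775 under 0002, and stays 0777 under a umask of 0, which may not be the permissions an administrator wants.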
29 Aug, 2013 5 commits
- remove last comment, it is documented in job_mgr.c as this... · ab64e75b
  Danny Auble authored Aug 29, 2013
```
/* Current code (<= 2.1) has it so we start the new
 * job with the next step id.  This could be used
 * when restarting to figure out which step the
 * previous run of this job stopped on. */
```
  ab64e75b
- When a job is requeued, reset the step IDs back to 0. · 7e7edfca
  Danny Auble authored Aug 29, 2013
  
  7e7edfca
- Add Armin Größlinger to contributor/team list · 167278d6
  Morris Jette authored Aug 29, 2013
  
  167278d6
- Enforce --ntasks-per-socket=1 job option when allocating by socket · 58dd480a
  Magnus Jonsson authored Aug 29, 2013
```
See
https://groups.google.com/forum/#!topic/slurm-devel/j4izr0L4w8w
```
  58dd480a
- Add FAQ about HA database configuration · 914ff706
  Morris Jette authored Aug 28, 2013
  
  914ff706
28 Aug, 2013 11 commits
- better fix · 60a37b54
  Danny Auble authored Aug 28, 2013
  
  60a37b54
- run autogen.sh · ba991542
  Danny Auble authored Aug 28, 2013
  
  ba991542
- Add include to remove issue with warnings if using -Werror · 8a2d0006
  Danny Auble authored Aug 28, 2013
  
  8a2d0006
- Fix for invalid memory reference · caa69594
  Morris Jette authored Aug 28, 2013
```
due to multiple free calls caused by job arrays submitted to
multiple partitions. The root cause is the job priority array
of the original job being re-used by the subsequent job array
entries. A similar problem that could be induced by the user
specifying a job accounting frequency when submitting a job
array is also fixed.
bug 401
```
  caa69594
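
The failure mode in caa69594, one heap array shared by several job records and then freed once per record, can be sketched as follows. This is a simplified illustration, not the slurmctld job record: the `job_t` type and the functions `job_copy` and `job_free` are hypothetical. The sketch gives each copy its own array; the commit message does not say whether the actual fix copies the array or clears the pointer.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified job record. */
typedef struct {
    uint32_t *priority_array;  /* one priority per partition, owned by this job */
    size_t    part_cnt;        /* number of entries in priority_array */
} job_t;

/* Copy src into dst so that dst owns a separate priority array.
 * Returns 0 on success, -1 if memory allocation fails. */
int job_copy(job_t *dst, const job_t *src)
{
    *dst = *src;   /* shallow copy: the pointer is still shared here */

    if (src->priority_array && src->part_cnt > 0) {
        size_t bytes = src->part_cnt * sizeof(uint32_t);
        dst->priority_array = malloc(bytes);
        if (!dst->priority_array)
            return -1;
        memcpy(dst->priority_array, src->priority_array, bytes);
    } else {
        dst->priority_array = NULL;
    }
    return 0;
}

/* Release the array and clear the pointer so a second call is harmless. */
void job_free(job_t *job)
{
    free(job->priority_array);
    job->priority_array = NULL;
}
```

With only the shallow copy, freeing both records would call `free()` twice on the same pointer, which is the invalid memory reference described above.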
- Fix for one-time memory leak at slurmctld shutdown time · dad13eec
  Morris Jette authored Aug 28, 2013
  
  dad13eec
- Various minor possible bug fixes as reported by clang tool · 64654579
  Morris Jette authored Aug 28, 2013
  
  64654579
- Remove vestigial function · a70212e7
  Morris Jette authored Aug 28, 2013
  
  a70212e7
- Make sure GrpCPURunMins is added when creating a user, account or QOS with · 2806f6d9
  Danny Auble authored Aug 28, 2013
```
sacctmgr.
```
  2806f6d9
- sjstat - Add man page when generating rpms. · 976747c6
  Danny Auble authored Aug 28, 2013
  
  976747c6
- Minor problems reported by clang tool · 437e948f
  Morris Jette authored Aug 27, 2013
```
Some uninitialized variables, possible NULL pointers, etc.
None of these have been seen in practice that we know of, but
these changes will bulletproof the code
```
  437e948f
- Fix possible NULL pointer usage in plugstack logic · 53128015
  Morris Jette authored Aug 27, 2013
```
Never observed, but "clang" tool reports these as possible failures
```
  53128015
27 Aug, 2013 10 commits
- Initialize variable to avoid warning · e980408c
  Morris Jette authored Aug 27, 2013
  
  e980408c
- Initialize a variable in acct_gather_energy/rapl plugin · 00f082ec
  Morris Jette authored Aug 27, 2013
  
  00f082ec
- Fix for uninitialized variable in partition restoration logic · dc2dfe90
  Morris Jette authored Aug 27, 2013
  
  dc2dfe90
- Prevent possible NULL memory reference · 87ef833d
  Morris Jette authored Aug 27, 2013
  
  87ef833d
- Prevent possible NULL memory reference · 5cc672d1
  Morris Jette authored Aug 27, 2013
  
  5cc672d1
- Avoid possible NULL memory reference · 088b3dfb
  Morris Jette authored Aug 27, 2013
  
  088b3dfb
- Avoid possible invalid memory reference · acc23d01
  Morris Jette authored Aug 27, 2013
  
  acc23d01
- Initialize a couple of variables in reservation logic · 2dbdaa04
  Morris Jette authored Aug 27, 2013
  
  2dbdaa04
- Reservation with CoreCnt: Avoid possible invalid memory reference · e0541f93
  Morris Jette authored Aug 27, 2013
```
If a reservation create request included a CoreCnt value and more
nodes are required than configured, the logic in select/cons_res
could go off the end of the core_cnt array. This patch adds a
check for a zero value in the core_cnt array, which terminates
the user-specified array.
Back-port from master of commit 211c224b
```
  e0541f93
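
The zero-terminator check described in e0541f93 can be sketched as below. This is a minimal illustration, not the select/cons_res code: the function name `total_cores` and its signature are hypothetical, and the only behavior taken from the commit message is that a zero entry ends the user-specified `core_cnt` array.

```c
#include <stdint.h>

/* Sum the cores requested per node, visiting at most `node_cnt` entries.
 * A zero entry marks the end of the user-specified array, so the loop
 * stops there instead of reading past it. */
uint32_t total_cores(const uint32_t *core_cnt, int node_cnt)
{
    uint32_t total = 0;

    for (int i = 0; i < node_cnt; i++) {
        if (core_cnt[i] == 0)   /* terminator: no more user entries */
            break;
        total += core_cnt[i];
    }
    return total;
}
```

Without the zero check, a `node_cnt` larger than the number of entries supplied would index beyond the end of the array.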
- Minor changes to Slurm logging for licenses · 26615813
  Morris Jette authored Aug 27, 2013
  
  26615813
26 Aug, 2013 1 commit
- Add --enable-multiple-slurmd as an option in the slurm.spec file. · e5e98dd2
  Danny Auble authored Aug 26, 2013
  
  e5e98dd2
24 Aug, 2013 1 commit
- If running jobacct_gather/none fix issue on unpacking step completion. · 33ff8dbc
  Danny Auble authored Aug 23, 2013
  
  33ff8dbc
23 Aug, 2013 2 commits

Clarify equivalent sacct options in man page · 9cddcaf9
Morris Jette authored Aug 23, 2013

9cddcaf9

Correct value of min_nodes returned by loading job info · 98e24b0d

Morris Jette authored Aug 23, 2013

This is a correction of a bug introduced in commit
https://github.com/SchedMD/slurm/commit/ac44db862c8d1f460e55ad09017d058942ff6499
That commit eliminated the need of reading the node state information
from squeue for performance reasons (mostly for large parallel systems
in which the Prolog ran squeue, which generates a lot of simultaneous
RPCs, slowing down the job launch process). It also assumed 1 CPU per
node. If a pending job specified a node count of 1 and a task count
larger than one, squeue was reporting the node count of the job as
the same as the task count. This patch moves that same calculation
of a pending job's minimum node count into slurmctld, so squeue
still does not need to read the node information, but can report the
correct node count for pending jobs with minimal overhead.

98e24b0d