- 24 Aug, 2017 1 commit
-
-
Alejandro Sanchez authored
Calling bit_unfmt() with a zero bit_size() bitmap leads to a later call to bit_nclear() with start=0 and stop=-1, which triggers the ABRT. This scenario happened when cgroup.conf has ConstrainDevices=yes and task_cgroup_devices_create() tries to collect the GRES devices but gres_cpu_cnt=0, thus creating a zero-size p->cpus_bitmap = bit_alloc(gres_cpu_cnt); which is passed as an argument to bit_unfmt(). gres_cpu_cnt is 0 because gres.conf is defined like this:

Name=gpu Type=tesla File=/tmp/gres/tesla0 CPUs=0,1
Name=gpu Type=tesla File=/tmp/gres/tesla1 CPUs=0,1
Name=gpu Type=kepler File=/tmp/gres/kepler0 CPUs=2,3
Name=gpu Type=kepler File=/tmp/gres/kepler1 CPUs=2,3

but neither GresTypes nor a GRES option is set in the slurm.conf / node config definition. Bug 3974
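The failure mode described above (clearing the range 0..size-1 on a bitmap whose size is 0) can be illustrated with a minimal sketch. Everything here is hypothetical: toy_bitmap_t and the toy_* functions are stand-ins invented for illustration, not Slurm's bitstring API, and the guard shown is one possible way to avoid the bad range, not necessarily what the commit does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bitmap; NOT Slurm's bitstr_t layout. */
typedef struct {
	int64_t nbits;         /* number of valid bits, may be 0 */
	unsigned char *bytes;  /* backing storage */
} toy_bitmap_t;

toy_bitmap_t *toy_alloc(int64_t nbits)
{
	toy_bitmap_t *b = malloc(sizeof(*b));
	if (!b)
		return NULL;
	b->nbits = nbits;
	/* Always allocate at least one byte so bytes is never NULL. */
	b->bytes = calloc((size_t) (nbits / 8 + 1), 1);
	if (!b->bytes) {
		free(b);
		return NULL;
	}
	return b;
}

void toy_free(toy_bitmap_t *b)
{
	if (b) {
		free(b->bytes);
		free(b);
	}
}

/* Clear bits start..stop inclusive. An invalid range aborts here,
 * which mirrors the ABRT seen with start=0 and stop=-1. */
void toy_nclear(toy_bitmap_t *b, int64_t start, int64_t stop)
{
	assert(start >= 0 && stop >= start && stop < b->nbits);
	for (int64_t i = start; i <= stop; i++)
		b->bytes[i / 8] &= (unsigned char) ~(1u << (i % 8));
}

/* Guarded "clear everything": returns 1 if the bitmap is empty and
 * there was nothing to do, 0 if the bits were cleared. */
int toy_clear_all(toy_bitmap_t *b)
{
	if (b->nbits == 0)
		return 1;  /* would otherwise be toy_nclear(b, 0, -1) */
	toy_nclear(b, 0, b->nbits - 1);
	return 0;
}
```

The point of the sketch is only that "size - 1" is not a valid stop index when the size is 0, so the empty case has to be handled before the range is computed.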
-
- 23 Aug, 2017 1 commit
-
-
Alejandro Sanchez authored
Running slurmctld under valgrind while operating with jobcomp/elasticsearch reported the following bytes definitely lost:

==27403== 658 bytes in 1 blocks are definitely lost in loss record 301 of 342
==27403==    at 0x4C2FD4F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==27403==    by 0x2281B3: slurm_xrealloc (xmalloc.c:137)
==27403==    by 0x22856A: makespace (xstring.c:114)
==27403==    by 0x2285D0: _xstrcat (xstring.c:132)
==27403==    by 0x228CE0: _xstrfmtcat (xstring.c:291)
==27403==    by 0x83C5BCD: ???
==27403==    by 0x30A913: g_slurm_jobcomp_write (slurm_jobcomp.c:172)
==27403==    by 0x18D8FC: job_completion_logger (job_mgr.c:13652)

It turns out the buffer generated in slurm_jobcomp_log_record was xstrdup'ed to the corresponding job_node->serialized_job, but the originally generated buffer wasn't freed afterwards. The fix consists of changing the transfer so that instead of xstrdup'ing the char * we just assign the pointer and NULL the buffer. The job_node->serialized_job was already xfree'd properly later when the job was indexed. Discovered while working on Bug 4065.
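The ownership transfer described above (assign the pointer and NULL the source instead of duplicating the string) can be sketched in plain C. This is a minimal illustration under assumptions: toy_job_node_t and the toy_* helpers are hypothetical names, and malloc/free stand in for Slurm's xmalloc/xfree family.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the job node that keeps the record. */
typedef struct {
	char *serialized_job;
} toy_job_node_t;

/* Build a heap-allocated record; the caller owns the result. */
char *toy_make_record(const char *text)
{
	size_t len = strlen(text) + 1;
	char *buf = malloc(len);
	if (buf)
		memcpy(buf, text, len);
	return buf;
}

/* Transfer ownership: the node takes the pointer and the caller's
 * variable is set to NULL, so there is exactly one owner and no
 * second copy left behind to leak. */
void toy_take_buffer(toy_job_node_t *node, char **buffer)
{
	assert(node && buffer);
	node->serialized_job = *buffer;
	*buffer = NULL;
}

/* Release the record once it is no longer needed. */
void toy_node_release(toy_job_node_t *node)
{
	free(node->serialized_job);
	node->serialized_job = NULL;
}
```

With a duplicating copy, both the original buffer and the copy would need to be freed; moving the pointer leaves only the node's later release to do.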
-
- 22 Aug, 2017 7 commits
-
-
Alejandro Sanchez authored
Otherwise the resulting URL may be invalid. Update documentation while here as well. Bug 4065.
-
Tim Shaw authored
Otherwise a race between threads in _check_node_status leads to a crash. Bug 4093.
-
Tim Wickberg authored
Modification of commit c7e6d864. Bug 4095.
-
Philip Kovacs authored
Bug 4095
-
Philip Kovacs authored
Bug 4095
-
Morris Jette authored
-
Philip Kovacs authored
Bug 4094
-
- 21 Aug, 2017 2 commits
-
-
Morris Jette authored
Bug 4056
-
Alejandro Sanchez authored
Given a configuration with TopologyParam including the Dragonfly option, if a job requested a --switches count, the count timeout specified by either the job request or the max_switch_wait SchedulerParameters option was not respected. This was due to the leaf_switch_count variable not being incremented in the _eval_nodes_dfly() function when needed, as is done in _eval_nodes_topo(), the latter being an execution path that already waits correctly for the switch count timeout. Bug 4056
-
- 18 Aug, 2017 1 commit
-
-
Alejandro Sanchez authored
-
- 17 Aug, 2017 2 commits
-
-
Tim Wickberg authored
-
Morris Jette authored
Coverity CID 44649 Bug 4085
-
- 16 Aug, 2017 1 commit
-
-
Danny Auble authored
instead of local. Bug 3546
-
- 15 Aug, 2017 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
- 14 Aug, 2017 3 commits
-
-
Morris Jette authored
-
Danny Auble authored
This reverts commit 00a691b9.
-
Morris Jette authored
-
- 11 Aug, 2017 7 commits
-
-
Dominik Bartkiewicz authored
-
Dominik Bartkiewicz authored
-
Danny Auble authored
This will allow Dell's custom syscfg to work correctly. NOTE: Dell calls flat memory just memory. Bug 4034
-
Danny Auble authored
No code change, just moving existing code into a switch ready to handle multiple options. Bug 4034
-
Danny Auble authored
Add SystemType to knl_generic.conf for knl_generic in preparation for making KNL work on a Dell system. This is used to distinguish differences between vendors such as 'Dell'. Bug 4034
-
Danny Auble authored
Bug 4059
-
Dominik Bartkiewicz authored
-
- 10 Aug, 2017 1 commit
-
-
Danny Auble authored
-
- 08 Aug, 2017 1 commit
-
-
Tim Wickberg authored
-
- 07 Aug, 2017 4 commits
-
-
Jason Travis authored
Bug 4057.
-
Justin Lecher authored
Starting from glibc-2.25 the macros major and minor are only available from sys/sysmacros.h. This patch uses an autoconf macro to detect the location and includes the header accordingly. Bug 3982.
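A header selection of this kind might look like the sketch below. It assumes the autoconf macro in question is AC_HEADER_MAJOR, which defines MAJOR_IN_MKDEV or MAJOR_IN_SYSMACROS; the commit text does not name the macro, so treat that as an assumption. The toy_split_dev() helper and the __linux__ fallback are hypothetical additions so the snippet compiles without a configure step.

```c
#include <assert.h>
#include <sys/types.h>

/* MAJOR_IN_MKDEV / MAJOR_IN_SYSMACROS are what autoconf's
 * AC_HEADER_MAJOR defines (assumed to be the macro used). */
#if defined(MAJOR_IN_MKDEV)
#  include <sys/mkdev.h>
#elif defined(MAJOR_IN_SYSMACROS)
#  include <sys/sysmacros.h>
#elif defined(__linux__)
/* Fallback for this standalone sketch only (no config.h here). */
#  include <sys/sysmacros.h>
#endif

/* Hypothetical helper: split a device number into major and minor. */
void toy_split_dev(dev_t dev, unsigned int *maj, unsigned int *min)
{
	assert(maj && min);
	*maj = (unsigned int) major(dev);
	*min = (unsigned int) minor(dev);
}
```

Keeping the conditional includes in one place means callers can use major() and minor() without caring which header provides them on a given platform.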
-
Danny Auble authored
-
Dominik Bartkiewicz authored
Bug 4019
-
- 04 Aug, 2017 6 commits
-
-
Morris Jette authored
truncation of core specification and not reserving the specified cores. Fixes Coverity CID 45174 and 45175. Bug 4053
-
Danny Auble authored
-
Artem Polyakov authored
matter in production, but in testing it can. Bug 4051
-
Danny Auble authored
-
Marshall Garey authored
Fix mysql plugin to correctly return parent limits for all children. Bug 4050
-
Danny Auble authored
the tree. Bug 4050
-
- 02 Aug, 2017 1 commit
-
-
Marshall Garey authored
Would fail when trying to create the clustername file because the StateSaveLocation path didn't exist yet. Bug 3988
-