- 17 Sep, 2017 8 commits
-
-
Tim Wickberg authored
In slurmctld:
  - Handle the lookup, and send results across.
  - Periodically flush the cache.
In slurmd:
  - Load value sent as part of prolog into cache
  - Have task launch use the new call instead to fetch the jobid-cached values when available
  - Remove old job-specific records from the cache as part of the prolog.
  - Ignore the slowly increasing memory footprint from the cache, as this is what the old implementation did anyways. It'll be O(users in cluster) at worst, and flushes on a reconfigure.
  - Simplify the slurmstepd communication path. group_cache_lookup will always succeed.
Left to do:
  - Convert other lookup locations in slurmd to the new approach, and remove the compatibility shim.
Bug 3322.
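The slurmd side of this flow (load the gids sent with the prolog into a cache, fetch them by job at task launch, drop old job records, flush everything on a reconfigure) can be sketched roughly as below. This is only an illustration under stated assumptions, not Slurm's group cache: the `gid_cache_*` names, the fixed-size table, and the entry layout are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical sketch: names, sizes and layout are NOT taken from Slurm. */
#define GID_CACHE_SLOTS 64

typedef struct {
	int in_use;
	uint32_t job_id;
	uid_t uid;
	int ngids;
	gid_t *gids;	/* copy owned by the cache */
} gid_cache_entry_t;

static gid_cache_entry_t gid_cache[GID_CACHE_SLOTS];

/* Remove any record held for job_id (done before storing a fresh one). */
static void gid_cache_remove_job(uint32_t job_id)
{
	for (int i = 0; i < GID_CACHE_SLOTS; i++) {
		if (gid_cache[i].in_use && gid_cache[i].job_id == job_id) {
			free(gid_cache[i].gids);
			memset(&gid_cache[i], 0, sizeof(gid_cache[i]));
		}
	}
}

/* Store a copy of the gids received with the prolog.
 * Returns 0 on success, -1 if the table is full or allocation fails. */
static int gid_cache_store(uint32_t job_id, uid_t uid,
			   const gid_t *gids, int ngids)
{
	assert(ngids >= 0);
	gid_cache_remove_job(job_id);
	for (int i = 0; i < GID_CACHE_SLOTS; i++) {
		gid_t *copy = NULL;
		if (gid_cache[i].in_use)
			continue;
		if (ngids > 0) {
			copy = malloc((size_t) ngids * sizeof(gid_t));
			if (!copy)
				return -1;
			memcpy(copy, gids, (size_t) ngids * sizeof(gid_t));
		}
		gid_cache[i].in_use = 1;
		gid_cache[i].job_id = job_id;
		gid_cache[i].uid = uid;
		gid_cache[i].ngids = ngids;
		gid_cache[i].gids = copy;
		return 0;
	}
	return -1;
}

/* Fetch the cached gids for (job_id, uid).
 * Returns the gid count and sets *gids (still owned by the cache),
 * or returns -1 on a miss. */
static int gid_cache_lookup(uint32_t job_id, uid_t uid, const gid_t **gids)
{
	for (int i = 0; i < GID_CACHE_SLOTS; i++) {
		if (gid_cache[i].in_use && gid_cache[i].job_id == job_id &&
		    gid_cache[i].uid == uid) {
			*gids = gid_cache[i].gids;
			return gid_cache[i].ngids;
		}
	}
	return -1;
}

/* Drop every record, e.g. on a reconfigure. */
static void gid_cache_flush(void)
{
	for (int i = 0; i < GID_CACHE_SLOTS; i++)
		free(gid_cache[i].gids);
	memset(gid_cache, 0, sizeof(gid_cache));
}
```

A caller would store the gids when the prolog message arrives, call `gid_cache_lookup()` at task launch, and call `gid_cache_flush()` on reconfigure. The commit says group_cache_lookup always succeeds; this sketch instead reports a miss with -1 and leaves any fallback lookup to the caller.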
-
Tim Wickberg authored
Loosely based on src/slurmd/req.c, but with major alterations in preparation for use with PrologFlags=SendGIDs. Convert slurmd to use a compatibility shim in here for now.
-
Tim Wickberg authored
Bug 3322.
-
Tim Wickberg authored
The slurmd is always sending the ngid / gids over the pipe, so the setgroups() call is all that matters here. Move it out to _drop_privileges instead.
-
Tim Wickberg authored
Bug 3322.
-
Tim Wickberg authored
Patch was broken by a rebase before pushing. Fixing as a clean patch. This reverts commit b0838485.
-
Tim Wickberg authored
-
Tim Wickberg authored
Regression caught my slight oversight from 88d5a801.
-
- 16 Sep, 2017 5 commits
-
-
Tim Wickberg authored
-
Tim Wickberg authored
Bug 3322.
-
Tim Wickberg authored
Always false in all supported plugins, except for BGQ. Prior commit removed all calling paths.
-
Tim Wickberg authored
mpi_hook_client_single_task_per_node() is always false, except on BGQ systems. Move special handling code into an #ifdef so it can be readily identified and removed when BGQ support is retired.
-
Tim Wickberg authored
CI 177110.
-
- 15 Sep, 2017 27 commits
-
-
Tim Wickberg authored
-
Marshall Garey authored
Otherwise the step count comes back incorrectly. Bug 4157.
-
David Gloe authored
Bug 4147.
-
Tim Wickberg authored
Was used with Moab integration, which was removed in 17.11.
-
Tim Wickberg authored
Adding this cast trick prevents a "the address of 'id' will always evaluate as 'true'" warning message from GCC.
-
Danny Auble authored
-
Alejandro Sanchez authored
==9617== 42 bytes in 1 blocks are definitely lost in loss record 3,654 of 7,443
==9617==    at 0x4C2FB45: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9617==    by 0x1A4A81: slurm_xmalloc (xmalloc.c:84)
==9617==    by 0x1A505C: makespace (xstring.c:102)
==9617==    by 0x1A51A6: _xstrcat (xstring.c:133)
==9617==    by 0x1A58B6: _xstrfmtcat (xstring.c:292)
==9617==    by 0x3181D7: parse_resv_nodecnt (state_control.c:213)
==9617==    by 0x17611A: _set_resv_msg (resv_info.c:352)
==9617==    by 0x17650A: _admin_focus_out_resv (resv_info.c:472)
==9617== 136 bytes in 1 blocks are definitely lost in loss record 6,082 of 7,443
==9617==    at 0x4C2DB2F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==9617==    by 0x63FF80F: __vasprintf_chk (vasprintf_chk.c:80)
==9617==    by 0x5C3E0D8: g_vasprintf (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.5200.0)
==9617==    by 0x5C18BAC: g_strdup_vprintf (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.5200.0)
==9617==    by 0x5C18C68: g_strdup_printf (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.5200.0)
==9617==    by 0x175F7A: _set_resv_msg (resv_info.c:306)
==9617==    by 0x17650A: _admin_focus_out_resv (resv_info.c:472)
==9617==    by 0x4F6B92B: ??? (in /usr/lib/x86_64-linux-gnu/libgtk-x11-2.0.so.0.2400.31)
Bug 2329.
-
Danny Auble authored
-
Alejandro Sanchez authored
Bug 2329.
-
Alejandro Sanchez authored
This reverts commit 8ebe5f2ca0bd8b241273094f820f63f5d32a8752.
-
Alejandro Sanchez authored
Bug 2329
-
Alejandro Sanchez authored
Bug 2329
-
Alejandro Sanchez authored
Bug 2329
-
Alejandro Sanchez authored
-
Alejandro Sanchez authored
sview Watts parsing and display was introduced in commit a503a770. There were differences between scontrol and sview, so this commit moves the parsing and display of reservation Watts to a common place and makes sview and scontrol use these functions. The parse and display functions have been slightly refactored compared to the original ones. Bug 2329
-
Alejandro Sanchez authored
-
Alejandro Sanchez authored
Now that _parse_resv_tres is already available in src/common/state_control, we can enable an EDIT_TEXTBOX for the SORTID_TRES and call the function there. This is the original goal of the bug. Bug 2329
-
Alejandro Sanchez authored
And make src/scontrol/create_res.c use it. Bug 2329
-
Alejandro Sanchez authored
Bug 2329
-
Alejandro Sanchez authored
Bug 2329
-
Alejandro Sanchez authored
Bug 2329
-
Alejandro Sanchez authored
Testing with valgrind, I found some cases where memory leaked or the program got SIGSEGV. This was due to bad memory tracking of these resv_desc_msg_t members. Some example requests that now end up with the expected behavior and neither memory errors nor leaked memory:
scontrol create reservationname=test start=now end=now+1day users=alex ...
  - nodecnt=2
  - nodecnt=badvalue
  - nodecnt=2 tres=node=1
  - nodecnt=2 tres=node=badvalue
  - corecnt=1
  - corecnt=badvalue
  - corecnt=1 tres=cpu=2
  - corecnt=1 tres=cpu=badvalue
sview reservation creation/update.
Ideally we should use slurm_free_resv_desc_msg() instead of tracking each resv_desc_msg_t member separately, but this function unconditionally xfrees each member, even if it holds a value whose memory was not allocated, which ends in a SIGABRT.
Bug 2329
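The hazard described here, freeing a member that holds a value but was never heap-allocated, can be illustrated with a small ownership-tracking sketch. It is not Slurm code: `resv_sketch_t`, its fields, and `resv_sketch_release()` are hypothetical stand-ins for tracking each resv_desc_msg_t member separately, and plain `free()` stands in for xfree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a message whose members may or may not
 * point at heap memory. Not Slurm's resv_desc_msg_t. */
typedef struct {
	uint32_t *core_cnt;	/* may point at caller-owned storage */
	bool free_core_cnt;	/* true only if core_cnt was heap-allocated */
	char *tres_str;		/* may point at caller-owned storage */
	bool free_tres_str;	/* true only if tres_str was heap-allocated */
} resv_sketch_t;

/* Release only the members this message actually owns, then clear all
 * pointers so a second call is harmless. */
static void resv_sketch_release(resv_sketch_t *msg)
{
	assert(msg != NULL);
	if (msg->free_core_cnt)
		free(msg->core_cnt);
	if (msg->free_tres_str)
		free(msg->tres_str);
	msg->core_cnt = NULL;
	msg->tres_str = NULL;
	msg->free_core_cnt = false;
	msg->free_tres_str = false;
}
```

An unconditional free of `core_cnt` would be undefined behavior whenever it points at stack or static storage, which matches the kind of abort the commit message describes. The per-member flags trade some verbosity for safety.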
-
Alejandro Sanchez authored
resv_msg_ptr->core_cnt memory needed to be freed before returning, since it was previously allocated by xrealloc(). Analogous to commit 8063a08cbf2f.
-
Alejandro Sanchez authored
Bug 2329
-
Alejandro Sanchez authored
_parse_resv_core_cnt has been moved to src/common/state_control.[ch]. Furthermore, it has been split in two functions:
  - _parse_resv_core_cnt
  - _is_corecnt_supported
Bug 2329
-
Alejandro Sanchez authored
Bug 2329
-
Alejandro Sanchez authored
Bug was introduced in 22c9e25c. Stack trace:
Thread 1 (Thread 0x7ff64228c300 (LWP 6652)):
../sysdeps/unix/sysv/linux/raise.c:58
fmt=fmt@entry=0x7ff6409bb000 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
ptr=<optimized out>, str=0x7ff6409bb110 "double free or corruption (out)", action=3) at malloc.c:5048
have_lock=<optimized out>) at malloc.c:3904
malloc.c:2984
new_text=0x561a4f8b4d10 "asfasdf", column=21) at ../../../../slurm/src/sview/resv_info.c:331
event=0x561a4f8a8f10, resv_msg=0x561a4f8b1410) at ../../../../slurm/src/sview/resv_info.c:515
Bug 2329
-