- 04 Feb, 2016 7 commits
-
-
Morris Jette authored
If a user specifies KNL features in the configuration, strip them out so that only the information from capmc is used.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Move reading of knl.conf from src/common/read_config.c into src/plugins/node_features/knl_cray/node_features_knl_cray.c with minimal changes to logic
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
First phase of adding node_features plugin infrastructure (taken largely from knl plugin infrastructure). Added node_features/knl_cray plugin.
-
- 03 Feb, 2016 11 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Yu Watanabe authored
bug 2407
-
Yu Watanabe authored
bug 2408
-
Yu Watanabe authored
qalter and qrerun have the wrong end-of-line encoding: \r\n rather than \n. To fix it, run:

    sed -e 's/\r$//' -i contribs/torque/qalter.pl
    sed -e 's/\r$//' -i contribs/torque/qrerun.pl

Note that these two files have different permissions (644) compared to the other .pl files (755).
bug 2409
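The line-ending fix described in this message can be sketched as a small shell helper. This is an illustration only, not part of the commit: `strip_cr` is a hypothetical name, it uses `tr` instead of the `sed` commands from the message, and it removes every carriage return in the file rather than only those at line ends. The demo runs on a temporary file, so the real scripts are not touched.

```shell
#!/bin/sh
# strip_cr FILE: rewrite FILE in place without carriage returns.
# Writing back with cat (not mv) keeps the original file's permissions.
strip_cr() {
    tr -d '\r' < "$1" > "$1.nocr" && cat "$1.nocr" > "$1" && rm -f "$1.nocr"
}

# Build a throwaway file with CRLF line endings (hypothetical content).
demo=$(mktemp)
printf 'line one\r\nline two\r\n' > "$demo"

strip_cr "$demo"

# Report whether any carriage return is left in the file.
cr=$(printf '\r')
if grep -q "$cr" "$demo"; then
    echo "still contains CR"
else
    echo "clean"
fi

rm -f "$demo"
```

Since the message notes that the two files are mode 644 while the other .pl files are 755, a `chmod 755` on both would presumably follow, assuming 755 is the intended mode.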
-
Danny Auble authored
-
Morris Jette authored
-
Morris Jette authored
-
Yu Watanabe authored
-
Danny Auble authored
-
- 02 Feb, 2016 15 commits
-
-
Morris Jette authored
Change some power capping messages from debug3 to debug5 if power capping has nothing to do.
-
Morris Jette authored
-
Morris Jette authored
Make a job's --test-only start time and resources function properly with node active/available feature information (i.e. use active features if possible and fall back to whole nodes and available features; the whole-node flag is needed to ensure we can reboot the node and change its active features).
-
Tim Wickberg authored
Use PRIu64 instead of %ld for uint64_t types (libsh5util_old), %zu instead of PRIu64 for size_t.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Yair Yarom authored
Add macros to fix compatibility for IPv6, deal with lack of POLLRDHUP on FreeBSD. Handle difference in cpu affinity APIs in gres.c.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
$((10#$SLURM_API_MAJOR)) is bash-specific. Replace it with the portable ${SLURM_API_MAJOR#0}, which accomplishes the same thing. The first form forces bash to treat the value as base 10 even with a leading zero; the second, portable form strips a leading zero.
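The portable expansion can be tried in isolation with a short POSIX shell sketch. The function name `strip_leading_zero` and the sample values are hypothetical and chosen only for illustration; they are not taken from the Slurm build scripts.

```shell
#!/bin/sh
# strip_leading_zero VALUE: print VALUE with at most one leading "0" removed.
# ${1#0} deletes the shortest prefix matching the pattern "0", i.e. a single zero.
strip_leading_zero() {
    printf '%s\n' "${1#0}"
}

# Hypothetical two-digit version component with a leading zero.
SLURM_API_MAJOR=08
major=$(strip_leading_zero "$SLURM_API_MAJOR")

# The stripped value is safe in POSIX arithmetic, where a leading 0 would
# otherwise mark the number as octal (and 08 is not a valid octal number).
next=$((major + 1))
echo "major=$major next=$next"
```

Note that the expansion removes only one zero, so it matches the bash `10#` form for values with at most one leading zero, which is the case the commit message describes.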
-
Tim Wickberg authored
Also remove checks for sys/termios.h from build system. Slurm directly includes the POSIX-required <termios.h> already, and the one use of this conditional is being removed here. Fixes one of several build errors on FreeBSD.
-
Morris Jette authored
-
Morris Jette authored
This fixes a bug introduced in commit a801d264. Whole node resource allocations with the REPLACE option were not working. Detected by test3.14 failure.
-
Didier GAZEN authored
Support AuthInfo in slurmdbd.conf that is different from the value in slurm.conf.

There is a possible bug in the slurm_get_auth_info function (src/common/slurm_protocol_api.c) that can cause the slurmdbd daemon to look for the AuthInfo parameter in slurm.conf instead of slurmdbd.conf when the auth/munge authentication method is used (AuthType=auth/munge).

Here is the slurmdbd log revealing the problem (debug5() prints were added to the sources):

    slurmdbd: slurmdbd version 15.08.7 started
    slurmdbd: debug2: running rollup at Tue Feb 02 14:20:14 2016
    slurmdbd: debug5: in ../../../src/slurmdbd/slurmdbd.c, _send_slurmctld_register_req (line 690)
    slurmdbd: debug5: in ../../../src/common/slurm_protocol_api.c, slurm_send_node_msg (line 3601)
    slurmdbd: debug5: in ../../../../../src/plugins/auth/munge/auth_munge.c, slurm_auth_create (line 217)
    slurmdbd: debug5: in ../../../src/common/slurm_protocol_api.c, slurm_get_auth_ttl (line 1732)
    slurmdbd: debug5: Entering ../../../src/common/slurm_protocol_api.c, slurm_get_auth_info
    slurmdbd: debug: Reading slurm.conf file: /usr/local/slurm-15-08-7-1/etc/slurm.conf
    slurmdbd: error: s_p_parse_file: unable to status file /usr/local/slurm-15-08-7-1/etc/slurm.conf: No such file or directory, retrying in 1sec up to 60sec
    ...

Then 60 seconds later, the auth_info value returned by slurm_get_auth_info is NULL:

    slurmdbd: debug5: Leaving ../../../src/common/slurm_protocol_api.c, slurm_get_auth_info, auth_info=(null)

and slurmdbd continues without crashing, but I am not sure it is in a safe state.

When applying this patch:

    diff --git a/src/common/slurm_protocol_api.c b/src/common/slurm_protocol_api.c
    index c5db879..be1dab6 100644
    --- a/src/common/slurm_protocol_api.c
    +++ b/src/common/slurm_protocol_api.c
    @@ -1703,9 +1703,13 @@ extern char *slurm_get_auth_info(void)
             char *auth_info;
             slurm_ctl_conf_t *conf;
     
    -        conf = slurm_conf_lock();
    -        auth_info = xstrdup(conf->authinfo);
    -        slurm_conf_unlock();
    +        if (slurmdbd_conf) {
    +                auth_info = xstrdup(slurmdbd_conf->auth_info);
    +        } else {
    +                conf = slurm_conf_lock();
    +                auth_info = xstrdup(conf->authinfo);
    +                slurm_conf_unlock();
    +        }
     
             return auth_info;
     }

the auth_info value is now valid and consistent with the slurmdbd.conf setting:

    slurmdbd: slurmdbd version 15.08.7 started
    slurmdbd: debug2: running rollup at Tue Feb 02 14:47:37 2016
    slurmdbd: debug5: in ../../../src/slurmdbd/slurmdbd.c, _send_slurmctld_register_req (line 690)
    slurmdbd: debug5: in ../../../src/common/slurm_protocol_api.c, slurm_send_node_msg (line 3600)
    slurmdbd: debug5: in ../../../../../src/plugins/auth/munge/auth_munge.c, slurm_auth_create (line 217)
    slurmdbd: debug5: in ../../../src/common/slurm_protocol_api.c, slurm_get_auth_ttl (line 1731)
    slurmdbd: debug5: Entering ../../../src/common/slurm_protocol_api.c, slurm_get_auth_info
    slurmdbd: debug5: Leaving ../../../src/common/slurm_protocol_api.c, slurm_get_auth_info, auth_info=socket=/var/run/munge/munge_dbd.socket.2
-
Morris Jette authored
Reserve node weight value of INFINITE for nodes which require reboot. Avoid scheduling on nodes requiring reboot that are not IDLE (more work needed for backfill and will_run RPC).
-
- 01 Feb, 2016 7 commits
-
-
Tim Wickberg authored
parse_time - strlcpy not strncpy
slurm_protocol_api - set tree width to 1 as a default, 0 leads coverity to warn about potential div/0
pmi2/setup.c - avoid strncpy entirely with a small rearrangement
-
David Gloe authored
contribs/cray/csm/slurmconfgen_smw.py - avoid erroneously including repurposed compute nodes in the list of nodes to start slurmd.
-
Danny Auble authored
-
Morris Jette authored
Added support for node features with or without counts
-
Tim Wickberg authored
$((10#$SLURM_API_MAJOR)) is bash-specific. Replace it with the portable ${SLURM_API_MAJOR#0}, which accomplishes the same thing. The first form forces bash to treat the value as base 10 even with a leading zero; the second, portable form strips a leading zero.
-
Tim Wickberg authored
-
Morris Jette authored
Only do if job reboot requested
-