- 30 Jul, 2018 2 commits
-
-
Tim Wickberg authored
-
Tim Wickberg authored
-
- 28 Jul, 2018 6 commits
-
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
- 27 Jul, 2018 12 commits
-
-
Danny Auble authored
Bug 5468. This is a backport of commit cefc9ec1.
-
Danny Auble authored
-
Felip Moll authored
Bug 4918
-
Tim Wickberg authored
Collapse error messages onto one line where they weren't previously, and fix up argument indentation while here.
find . -name '*\.[ch]' -exec sed -i s/error\(\[\ \]*\"sched:\ /sched_error\(\"/ {} \;
-
Tim Wickberg authored
These calls will replace all calls of the form
error("sched: message...")
with
sched_error("message...")
This allows the _log_msg function to stop calling xstrncmp() on every log message while holding log_lock when SchedLogLevel > 0. That behavior has effectively removed all concurrency from slurmctld when SchedLogLevel is enabled, and prevented larger-scale systems from being able to start up. See bug 3746 for an example of this.
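The difference between the two calling styles can be sketched as follows. This is a minimal illustration, not Slurm's logging code: the names route_by_prefix, route_error, route_sched_error, and the routed_msg struct are hypothetical, and locking is left out entirely. It only shows why a dedicated entry point removes the per-message string comparison.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical destinations; not Slurm's real types. */
enum log_dest { DEST_MAIN, DEST_SCHED };

struct routed_msg {
	enum log_dest dest;	/* which log receives the message */
	const char *text;	/* message body without any routing prefix */
};

/* Old pattern: every message is inspected for the "sched: " prefix. */
struct routed_msg route_by_prefix(const char *msg)
{
	static const char prefix[] = "sched: ";
	struct routed_msg out = { DEST_MAIN, msg };

	assert(msg != NULL);
	if (!strncmp(msg, prefix, sizeof(prefix) - 1)) {
		out.dest = DEST_SCHED;
		out.text = msg + sizeof(prefix) - 1;
	}
	return out;
}

/* New pattern: the entry point itself names the destination,
 * so no string comparison is needed on the hot path. */
struct routed_msg route_error(const char *msg)
{
	struct routed_msg out = { DEST_MAIN, msg };
	return out;
}

struct routed_msg route_sched_error(const char *msg)
{
	struct routed_msg out = { DEST_SCHED, msg };
	return out;
}
```

In this sketch, route_by_prefix("sched: no nodes") and route_sched_error("no nodes") end up with the same destination and text, but only the first has to scan the string to get there.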
-
Tim Wickberg authored
Update all calling locations and, as this is a static function, rename it from log_msg to _log_msg while here.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Dominik Bartkiewicz authored
scheduling cycle. Primarily caused by EnforcePartLimits=ALL. Bug 5452
-
Tim Wickberg authored
-
Tim Wickberg authored
This requires all calls to slurm_addto_char_list() to change to slurm_addto_char_list_with_case().
-
Tim Wickberg authored
Rather than replace all existing references, keep the existing function name and signature, and break off the body into slurm_addto_char_list_with_case() with the lower_case_normalization option set to true. Locations that need to disable this case normalization will thus have access to this same function, just by the longer name with the extra boolean.
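The wrapper arrangement described here can be sketched as below. This is an illustration under simplifying assumptions: add_name and add_name_with_case are hypothetical stand-ins that copy a single name into a caller-supplied buffer, whereas the real functions operate on Slurm list types that are not reproduced here.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical long-form function: copy name into out, lower-casing it
 * only when lower_case_normalization is true. Returns the number of
 * characters written (excluding the terminator).
 */
int add_name_with_case(char *out, size_t out_len, const char *name,
		       bool lower_case_normalization)
{
	size_t i;

	assert(out != NULL && name != NULL && out_len > 0);
	for (i = 0; name[i] && (i + 1 < out_len); i++) {
		unsigned char c = (unsigned char) name[i];
		out[i] = lower_case_normalization ? (char) tolower(c)
						  : (char) c;
	}
	out[i] = '\0';
	return (int) i;
}

/*
 * Existing name and signature are kept: callers that want the old
 * behavior need no changes, since normalization defaults to true.
 */
int add_name(char *out, size_t out_len, const char *name)
{
	return add_name_with_case(out, out_len, name, true);
}
```

Existing callers keep calling the short name, and only the few locations that must preserve case switch to the longer name with the extra boolean set to false.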
-
- 25 Jul, 2018 8 commits
-
-
Tim Wickberg authored
fdopen returns NULL on failure. It is impossible for (FILE *) to be less than zero, so this check would have always succeeded even in the event of a failure. CID 187087.
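A minimal sketch of the corrected check, assuming a hypothetical helper named write_via_stream (not taken from the Slurm source). The point is only that fdopen() failure has to be detected by comparing against NULL.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: wrap fd in a stream, write text, and close it.
 * Returns 0 on success and -1 on failure. If fdopen() fails, the
 * caller still owns fd.
 */
int write_via_stream(int fd, const char *text)
{
	FILE *fp;

	assert(text != NULL);
	fp = fdopen(fd, "w");
	if (!fp)	/* fdopen() reports failure with NULL, never < 0 */
		return -1;

	if (fputs(text, fp) == EOF) {
		fclose(fp);
		return -1;
	}
	/* fclose() flushes the stream and also closes the underlying fd */
	return (fclose(fp) == 0) ? 0 : -1;
}
```

Passing an invalid descriptor such as -1 makes fdopen() return NULL, which the helper reports as -1 instead of continuing with a bad stream.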
-
Tim Wickberg authored
-
Tim Wickberg authored
CID 187179.
-
Tim Wickberg authored
CID 187085.
-
Marshall Garey authored
not find any jobs. The actual syntax for that sacctmgr command is
sacctmgr show runawayjobs [clustername]
but sometimes people mistakenly do
sacctmgr show runaway jobs
in which case it would look for runaway jobs on a cluster named "jobs" and print the message "Runaway Jobs: No runaway jobs found" because the clustername was wrong. This patch changes that message to say "Runaway Jobs: No runaway jobs found on cluster jobs" so that people know they made a mistake in syntax.
-
Dominik Bartkiewicz authored
Bug 5098
-
Tim Wickberg authored
pam_slurm_adopt is way better; please use that instead.
-
Tim Wickberg authored
This fixes the previously mentioned double-locking issue. These locks were originally introduced in dd6cdddb. Commit b298df5d then moved the tres setup over to _load_job_state instead, and these are no longer required here. Bug 5469.
-
- 24 Jul, 2018 7 commits
-
-
Brian Christiansen authored
-
Tim Wickberg authored
assoc_mgr_clear_used_info() already manages its own locks, so wait to lock until after that's been called. Bug 5469.
-
Tim Wickberg authored
Leads to deadlock in 'scontrol reconfigure':
0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
1 0x00007f61b1880801 in __GI_abort () at abort.c:79
2 0x00007f61b1fe2bd7 in __xassert_failed (expr=expr@entry=0x7f61b1ff2989 "_store_locks(locks)", file=file@entry=0x7f61b1ff2e80 "../../../../slurm/src/common/assoc_mgr.c", line=line@entry=2114, func=func@entry=0x7f61b1ff4bf8 <__func__.17903> "assoc_mgr_lock") at ../../../../slurm/src/common/xassert.c:57
3 0x00007f61b1ead017 in assoc_mgr_lock (locks=locks@entry=0x7f61ad6da610) at ../../../../slurm/src/common/assoc_mgr.c:2114
4 0x00005632675e4509 in _adjust_limit_usage (type=type@entry=0, job_ptr=job_ptr@entry=0x7f6190015000) at ../../../../slurm/src/slurmctld/acct_policy.c:719
5 0x00005632675e4b49 in acct_policy_add_job_submit (job_ptr=job_ptr@entry=0x7f6190015000) at ../../../../slurm/src/slurmctld/acct_policy.c:2515
6 0x00005632676767c7 in _restore_job_dependencies () at ../../../../slurm/src/slurmctld/read_config.c:2612
7 read_slurm_conf (recover=recover@entry=1, reconfig=reconfig@entry=true) at ../../../../slurm/src/slurmctld/read_config.c:1310
8 0x0000563267668d0f in _slurm_rpc_reconfigure_controller (msg=msg@entry=0x7f61ad6dade0) at ../../../../slurm/src/slurmctld/proc_req.c:3645
9 0x000056326766f646 in slurmctld_req (msg=0x7f61ad6dade0, arg=0x7f619c000dc0) at ../../../../slurm/src/slurmctld/proc_req.c:425
10 0x00005632675efbc9 in _service_connection (arg=<optimized out>) at ../../../../slurm/src/slurmctld/controller.c:1285
11 0x00007f61b1c386db in start_thread (arg=0x7f61ad6db700) at pthread_create.c:463
12 0x00007f61b196188f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Bug 5469. This reverts commit 4b7ad3b6.
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Tim Wickberg authored
-
Broderick Gardner authored
Bug 5248.
-
- 23 Jul, 2018 5 commits
-
-
Tim Wickberg authored
CID 187110.
-
Tim Wickberg authored
Avoid a double-locking issue by moving the locks previously required by assoc_mgr_clear_used_info() up in the calling path to _restore_job_dependencies(). Add commented-out lock annotations to use later.
-
Tim Wickberg authored
Nothing ever checks this return code anyways.
-
Tim Wickberg authored
This also cleans up several locations that could try to repeatedly call close(). See prior commit for further details on why that is best avoided.
-
Tim Wickberg authored
Quoting part of the close() man page:
"Retrying the close() after a failure return is the wrong thing to do, since this may cause a reused file descriptor from another thread to be closed. This can occur because the Linux kernel always releases the file descriptor early in the close operation, freeing it for reuse; the steps that may return an error, such as flushing data to the filesystem or device, occur only later in the close operation."
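The close-once rule from the man page can be sketched as below. The helper name close_once is hypothetical and is not the function this commit touches. It illustrates one way to make a repeated close attempt harmless: the descriptor is invalidated after the first call, whatever the result.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Hypothetical helper: close *fd at most once. The descriptor is marked
 * invalid even if close() fails, because on Linux it has already been
 * released and may have been reused by another thread.
 * Returns the result of close(), or 0 if *fd was already invalid.
 */
int close_once(int *fd)
{
	int rc;

	assert(fd != NULL);
	if (*fd < 0)
		return 0;	/* nothing to close */

	rc = close(*fd);
	*fd = -1;		/* never retry, regardless of rc */
	return rc;
}
```

A second call on the same variable is then a no-op, so an error path that tries to close again cannot hit a descriptor that another thread has since opened.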
-