- 19 Apr, 2017 1 commit
-
-
Morris Jette authored
-
- 18 Apr, 2017 8 commits
-
-
Brian Christiansen authored
As of 6eec8022, a cluster's recv connection is destroyed when the cluster itself is destroyed. The problem this exposed: when a remote cluster is removed from the federation, the controller calls slurmdb_destroy_federation_rec(), which destroys the clusters in the list. The persistent recv thread and the cluster's recv both point to the same connection, so once the controller removed the persistent recv connection, the recv thread was left pointing to invalid memory.
-
Morris Jette authored
-
Danny Auble authored
-
Morris Jette authored
-
Tim Wickberg authored
-
Morris Jette authored
-
Danny Auble authored
end at the same time. Bug 3604.
-
Morris Jette authored
-
- 17 Apr, 2017 3 commits
-
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
- 15 Apr, 2017 18 commits
-
-
Morris Jette authored
-
Morris Jette authored
Modify sreport to report all jobs in federation by default. Also add --local option.
-
Morris Jette authored
Modify sacct to accept "--cluster all" option (in addition to the old "--cluster -1", which is still accepted). This makes sacct behave more like the other commands.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Modify sacct to report all jobs in federation by default. Also add --local option.
-
Morris Jette authored
-
Morris Jette authored
This RPC is still needed for version 17.02 commands executed against a version 17.11 slurmctld daemon.
-
Tim Wickberg authored
-
Morris Jette authored
perl api.
-
Morris Jette authored
This changed in 17.11: if the size was 0 we would return 0, which breaks the Perl API. Bug 3644.
-
Morris Jette authored
jobs have a priority that is lower than the selected job. Previous logic would permit other jobs with equal priority (no jobs with higher priority). Bug 3650
-
Tim Wickberg authored
-
Dominik Bartkiewicz authored
circumstances.
-
Tim Shaw authored
array element per line. Bug 3573
-
Bill Brophy authored
If the depend_list is NULL or has zero elements, the string should be cleared as well. Bug 3651.
-
Thomas Opfer authored
The field needs to have its own copy, otherwise the pointer will become invalid when xfree()'d by a separate array task. Bug 3665.
-
Alejandro Sanchez authored
So that it is the same max length as in src/common/env.c. Used for explicitly laying out tasks on large CPU count nodes (e.g., KNL). Bug 3675.
-
- 14 Apr, 2017 10 commits
-
-
Brian Christiansen authored
For use in federated environments.
-
Brian Christiansen authored
-
Brian Christiansen authored
-
Brian Christiansen authored
in any case.
-
Dong Ahn authored
As specified in the MPIR debug interface (https://www.mpi-forum.org/docs/mpir-specification-10-11-2010.pdf), the presence of the MPIR_partial_attach_ok symbol should inform the debugger that the initial startup synchronization is implemented in such a way that the tool need not attach to nor continue MPI processes that the user is not interested in controlling. To implement this, SLURM chose to send SIGCONT to those processes that are not attached by the debugger. However, the old code does not reliably detect whether a process is being traced by the debugger, and this has led to various side effects. On some systems (e.g., TOSS2), the old code sends SIGCONT to all of the target processes, including those attached by the debugger. On newer systems (e.g., TOSS3), it does not send SIGCONT to the target processes at all. One apparent reason for this undefined behavior is the use of CLONE_PTRACE. @grondo found no documentation indicating that CLONE_PTRACE covers the case where the process is being attached by a debugger. More importantly, the code compares clone(2) flags against proc(5) process flags, which are not the same: task->flags is defined in terms of the PF_* flags from the kernel source include/linux/sched.h. This patch fixes these problems by replacing the old detection logic with a check based on the TracerPid field in /proc/<pid>/status. From proc(5): "TracerPid: PID of process tracing this process (0 if not being traced)."
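The TracerPid check described above can be illustrated with a minimal Python sketch. This is not the code from the patch (which is C inside Slurm); the function names `parse_tracer_pid` and `is_traced` are hypothetical, and treating a missing TracerPid field as "not traced" is an assumption made here for simplicity.

```python
from pathlib import Path


def parse_tracer_pid(status_text: str) -> int:
    """Return the TracerPid value found in the text of a /proc/<pid>/status file.

    Returns 0 when the field is absent (an assumption of this sketch),
    which matches the proc(5) meaning of "not being traced".
    """
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "TracerPid":
            return int(value.strip())
    return 0


def is_traced(pid: int) -> bool:
    """True if another process is currently tracing `pid` (Linux only)."""
    status = Path(f"/proc/{pid}/status").read_text()
    return parse_tracer_pid(status) != 0
```

Under this scheme, a launcher would send SIGCONT only to the tasks for which `is_traced(pid)` is false, leaving debugger-attached tasks alone.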
-
Thomas Opfer authored
Improve job scheduling sort: after sorting by priority, we now sort by submit time and then by job id. Previously submit time was not considered. This handles the case where the job_ids have rolled over or we are doing federation scheduling. Bug 3524.
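The three-level ordering described above can be sketched as a composite sort key. This is only an illustration in Python, not Slurm's C comparator; the `Job` class, its field names, and the choice that a larger priority value sorts first are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    priority: int
    submit_time: int  # seconds since the epoch


def sort_key(job: Job):
    # Higher priority first (assumed), then earlier submit time,
    # then lower job id as the final tie-breaker.
    return (-job.priority, job.submit_time, job.job_id)


def schedule_order(jobs):
    """Return the jobs in the order the scheduler would consider them."""
    return sorted(jobs, key=sort_key)
```

With submit time ahead of job id in the key, a job submitted earlier still sorts first even if its id is numerically larger than that of a later job, which is the situation after ids roll over.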
-
Danny Auble authored
-
Morris Jette authored
All problems introduced in the course of changing un/pack logic required for removing pack jobs logic
-
Morris Jette authored
-
Morris Jette authored
bug 926
-