- 15 Jan, 2013 5 commits
-
-
Morris Jette authored
-
Morris Jette authored
Conflicts: src/slurmctld/acct_policy.c
-
Matthieu Hautreux authored
QoS limit enforcement on the controller side is based on a list of used_limits per user. When a user has not yet been added to the list, which is common when the controller is restarted and the user has no running jobs, the current logic skips some of the per-user limit checks and lets the submission succeed. However, if one of these limits is zero, the check should fail: no job should be submitted at all, since any submission would necessarily exceed the limit. This patch ensures that zero-valued limits are enforced correctly even when a user is not yet present in the per-user used_limits list.
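The zero-limit case described in this message can be sketched in C as follows. This is only an illustration, not the code in src/slurmctld/acct_policy.c: the `used_limits_t` struct, the `NO_LIMIT` sentinel, and the `submit_allowed()` function are hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel meaning "no limit configured". */
#define NO_LIMIT UINT32_MAX

/* Hypothetical per-user usage record. A NULL pointer stands for a user
 * who is not yet in the used_limits list (e.g. after a restart). */
typedef struct {
	uint32_t submit_jobs;	/* jobs the user currently has in the system */
} used_limits_t;

/* Return true if one more job may be submitted under max_submit_jobs. */
bool submit_allowed(const used_limits_t *used, uint32_t max_submit_jobs)
{
	/* Treat a missing record as zero usage instead of skipping the check. */
	uint32_t in_use = used ? used->submit_jobs : 0;

	if (max_submit_jobs == NO_LIMIT)
		return true;

	/* With a limit of 0 this is false even when in_use is 0. */
	return in_use < max_submit_jobs;
}
```

The design point is that an absent record maps to a usage of zero and still goes through the comparison, so a limit of 0 rejects the submission while any positive limit accepts it.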
-
David Bigagli authored
Add PriorityFlags value of "TICKET_BASED".
-
Morris Jette authored
-
- 14 Jan, 2013 12 commits
-
-
jette authored
-
Hongjia Cao authored
On job step launch failure, the function "slurm_step_launch_wait_finish()" is called twice in launch/slurm, which causes srun to abort:
srun: error: Task launch for 22495.0 failed on node cn6: Job credential expired
srun: error: Application launch failed: Job credential expired
srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
cn5 cn4 cn7
srun: error: Timed out waiting for job step to complete
srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
srun: error: Timed out waiting for job step to complete
srun: bitstring.c:174: bit_test: Assertion `(b) != ((void *)0)' failed.
Aborted (core dumped)
The attached patch (version 2.5.1) fixes the abort. However, the messages "Job step aborted: Waiting up to 2 seconds for job step to finish." and "Timed out waiting for job step to complete" will still be printed twice.
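The message does not say how the patch avoids the crash on the second call. One common way to make a wait-and-cleanup routine safe to call twice is a "finished" flag, sketched below. The `step_ctx_t` type and `step_wait_finish()` function are hypothetical stand-ins, not Slurm's actual step launch API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a step launch context. */
typedef struct {
	bool finished;		/* set once the wait/cleanup path has run */
	int  wait_calls;	/* counts how often the real work was done */
} step_ctx_t;

/* Wait for the step to finish; safe to call more than once. */
void step_wait_finish(step_ctx_t *ctx)
{
	if (ctx == NULL || ctx->finished)
		return;		/* nothing to do, or already done */

	ctx->wait_calls++;	/* placeholder for the real wait and teardown */
	ctx->finished = true;
}
```

With such a guard, a second call returns immediately instead of touching state that the first call has already released.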
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Correction to CPU allocation count logic for cores without hyperthreading.
-
Hongjia Cao authored
When jobs launched directly with srun end abnormally, each node emits a step-killed message (slurmd[cn123]: *** 1234.0 KILLED AT ... WITH SIGNAL 9 ***) and/or a task-exit message (srun: error: task[0-1]: Terminated). For large-scale jobs these messages become tedious and bury other error messages. The two attached patches (for slurm-2.5.1) introduce two environment variables to control the output of such messages: SLURM_STEP_KILLED_MSG_NODE_ID: if set, only the specified node prints the step-killed message; SLURM_SRUN_REDUCE_TASK_EXIT_MSG: if set and non-zero, successive task exit messages with the same exit code are printed only once.
-
Hongjia Cao authored
When jobs launched directly with srun end abnormally, each node emits a step-killed message (slurmd[cn123]: *** 1234.0 KILLED AT ... WITH SIGNAL 9 ***) and/or a task-exit message (srun: error: task[0-1]: Terminated). For large-scale jobs these messages become tedious and bury other error messages. The two attached patches (for slurm-2.5.1) introduce two environment variables to control the output of such messages: SLURM_STEP_KILLED_MSG_NODE_ID: if set, only the specified node prints the step-killed message; SLURM_SRUN_REDUCE_TASK_EXIT_MSG: if set and non-zero, successive task exit messages with the same exit code are printed only once.
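The behavior described for SLURM_SRUN_REDUCE_TASK_EXIT_MSG can be sketched as below. This is not the patch itself: the helper names and the `exit_msg_state_t` type are hypothetical, and parsing the variable with `atoi()` is an assumption about what "set and non-zero" means.

```c
#include <stdbool.h>
#include <stdlib.h>

/* True if the variable is set to a non-zero integer
 * (assumed interpretation of "set and non-zero"). */
bool reduce_task_exit_msg_enabled(void)
{
	const char *val = getenv("SLURM_SRUN_REDUCE_TASK_EXIT_MSG");
	return val != NULL && atoi(val) != 0;
}

/* Hypothetical state remembering the last exit code that was reported. */
typedef struct {
	bool have_last;
	int  last_code;
} exit_msg_state_t;

/* Decide whether a task exit message should be printed.
 * When reduce is true, a message repeating the previous exit code
 * is suppressed; a different code is always printed. */
bool should_print_exit_msg(exit_msg_state_t *st, int exit_code, bool reduce)
{
	if (reduce && st->have_last && st->last_code == exit_code)
		return false;

	st->have_last = true;
	st->last_code = exit_code;
	return true;
}
```

A caller would evaluate `reduce_task_exit_msg_enabled()` once at startup and pass the result as `reduce` for every task exit event, so only runs of identical successive exit codes are collapsed.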
-
Morris Jette authored
-
Morris Jette authored
-
Yair Yarom authored
-
Morris Jette authored
-
Morris Jette authored
-
- 11 Jan, 2013 10 commits
-
-
https://github.com/SchedMD/slurm
jette authored
-
jette authored
User root and SlurmUser do not need a valid sbcast credential.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
This can be useful for testing purposes
-
Morris Jette authored
-
jette authored
-
jette authored
-
Morris Jette authored
-
Morris Jette authored
-
- 10 Jan, 2013 13 commits
-
-
jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Used to specify the communication protocol to be used for ALPS/BASIL.
-
Morris Jette authored
-
Morris Jette authored
-
jette authored
-
Danny Auble authored
-
jette authored
-
jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-