- 09 Apr, 2014 2 commits
-
-
Morris Jette authored
Rather than immediately invoking the scheduling logic on every event that could enable a new job to start, queue its execution. This permits faster execution of some operations, such as modifying large counts of jobs, by executing the scheduling logic less frequently, but still in a timely fashion.
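To illustrate the queued approach, here is a minimal sketch in C, assuming hypothetical names (queue_schedule, sched_agent, run_scheduler) rather than SLURM's actual code: triggering events only set a flag, and a background thread runs the scheduler at most once per interval.

    #include <stdbool.h>
    #include <pthread.h>
    #include <time.h>

    void run_scheduler(void);  /* hypothetical scheduler entry point */

    static pthread_mutex_t sched_lock = PTHREAD_MUTEX_INITIALIZER;
    static bool sched_pending = false;

    /* Called on each event that could enable a new job to start. */
    void queue_schedule(void)
    {
        pthread_mutex_lock(&sched_lock);
        sched_pending = true;
        pthread_mutex_unlock(&sched_lock);
    }

    /* Background thread: run the scheduler at most once per second,
     * no matter how many events were queued in the meantime. */
    void *sched_agent(void *arg)
    {
        (void) arg;
        while (true) {
            struct timespec ts = { .tv_sec = 1, .tv_nsec = 0 };
            nanosleep(&ts, NULL);
            pthread_mutex_lock(&sched_lock);
            bool run_now = sched_pending;
            sched_pending = false;
            pthread_mutex_unlock(&sched_lock);
            if (run_now)
                run_scheduler();
        }
        return NULL;
    }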
-
Danny Auble authored
With multiple partitions, the output from sinfo -o "%D %F" would produce unexpected results that were hardly ever correct.
-
- 08 Apr, 2014 5 commits
-
-
Morris Jette authored
-
Morris Jette authored
Fix logic bugs in the SchedulerParameters option max_rpc_cnt: scheduling of job arrays would be delayed and backfill scheduling would be disabled unless max_rpc_cnt > 0.
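A sketch of the corrected guard, with illustrative variable names rather than SLURM's actual code: scheduling is deferred only when throttling is enabled (max_rpc_cnt > 0) and the pending RPC count has reached the limit.

    #include <stdbool.h>

    extern int max_rpc_cnt;     /* SchedulerParameters: 0 means disabled */
    extern int active_rpc_cnt;  /* RPCs currently being processed */

    /* Defer scheduling only when throttling is enabled AND the
     * pending RPC count has reached the configured limit. */
    static bool defer_sched(void)
    {
        return (max_rpc_cnt > 0) && (active_rpc_cnt >= max_rpc_cnt);
    }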
-
Danny Auble authored
-
Danny Auble authored
on Mixed state.
-
David Bigagli authored
-
- 07 Apr, 2014 4 commits
-
-
Danny Auble authored
This changes the behavior of license_update, whose current behavior doubles license counts. This is OK because the only place it is used expects the counts to be zeroed out afterwards.
-
Morris Jette authored
-
Danny Auble authored
in it.

Signed-off-by: Danny Auble <da@schedmd.com>
-
Danny Auble authored
-
- 05 Apr, 2014 1 commit
-
-
Morris Jette authored
Disables job scheduling when there are too many pending RPCs
-
- 04 Apr, 2014 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
This also reverts commits 8cff3b08 and ced2fa3f.
-
Danny Auble authored
-
- 03 Apr, 2014 5 commits
-
-
Danny Auble authored
new associations were added since it was started.
-
Morris Jette authored
Permit user root to propagate resource limits higher than the hard limit slurmd has on that compute node (i.e. raise both current and maximum limits). bug 674
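For context, a minimal sketch of the underlying POSIX rule: setrlimit(2) lets any process lower its limits, but raising rlim_max above the current hard limit requires privilege, so only limits propagated by root can raise both values. The function name is illustrative.

    #include <stdio.h>
    #include <sys/resource.h>

    /* Raise both the soft (current) and hard (maximum) limits for
     * open files. Raising rlim_max above the existing hard limit
     * fails with EPERM unless the caller is privileged (root). */
    int raise_nofile_limit(rlim_t soft, rlim_t hard)
    {
        struct rlimit rl = { .rlim_cur = soft, .rlim_max = hard };
        if (setrlimit(RLIMIT_NOFILE, &rl)) {
            perror("setrlimit");
            return -1;
        }
        return 0;
    }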
-
Morris Jette authored
Permit multiple batch job submissions to be made for each run of the scheduler logic if the job submissions occur at nearly the same time. bug 616
-
Morris Jette authored
If a job step's network value is set by poe, either by executing poe directly or by srun launching poe, that value was not being propagated to the job step creation RPC, and the network was not being set up for the proper protocol (e.g. mpi, lapi, pami, etc.). The previous logic would only work if the srun execute line explicitly set the protocol using the --network option.
-
Morris Jette authored
Permit multiple batch job submissions to be made for each run of the scheduler logic if the job submissions occur at nearly the same time. bug 616
-
- 02 Apr, 2014 2 commits
-
-
David Bigagli authored
-
Morris Jette authored
If a job step's network value is set by poe, either by executing poe directly or by srun launching poe, that value was not being propagated to the job step creation RPC, and the network was not being set up for the proper protocol (e.g. mpi, lapi, pami, etc.). The previous logic would only work if the srun execute line explicitly set the protocol using the --network option.
-
- 31 Mar, 2014 2 commits
-
-
David Bigagli authored
-
Marcin Stolarek authored
Prevent preemption of jobs in a partition where PreemptMode=off
-
- 28 Mar, 2014 3 commits
-
-
Unknown authored
Define hwloc_const_bitmap_t as a typedef of hwloc_const_cpuset_t. The build fails in task_cgroup_cpuset.c when using an older hwloc (v1.0.2) due to the missing definition of hwloc_const_bitmap_t.
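A sketch of the compatibility shim this describes; the version guard below is an assumption for illustration (the bitmap API, including hwloc_const_bitmap_t, appeared in hwloc 1.1):

    #include <hwloc.h>

    /* Older hwloc (e.g. v1.0.2) has hwloc_const_cpuset_t but not
     * hwloc_const_bitmap_t, so define the latter as a typedef of
     * the former. */
    #if HWLOC_API_VERSION < 0x00010100
    typedef hwloc_const_cpuset_t hwloc_const_bitmap_t;
    #endif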
-
Danny Auble authored
-
Danny Auble authored
-
- 27 Mar, 2014 3 commits
-
-
David Bigagli authored
This reverts commit 084787c0.

Conflicts:
	NEWS
	contribs/pmi2/pmi2_api.c
	src/plugins/mpi/pmi2/mpi_pmi2.c
-
Morris Jette authored
-
Franco Broi authored
Add support for job std_in, std_out and std_err fields in the Perl API.
-
- 26 Mar, 2014 2 commits
-
-
Morris Jette authored
-
David Bigagli authored
processes.
-
- 25 Mar, 2014 2 commits
-
-
Morris Jette authored
Modify hostlist expressions to accept more than two numeric ranges (e.g. "row[1-3]rack[0-8]slot[0-63]")
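A small, self-contained sketch using SLURM's public hostlist API to expand such an expression, assuming the slurm headers and library are installed:

    #include <stdio.h>
    #include <stdlib.h>
    #include <slurm/slurm.h>

    int main(void)
    {
        /* Three numeric ranges in one expression, as now supported. */
        hostlist_t hl = slurm_hostlist_create("row[1-2]rack[0-1]slot[0-3]");
        char *host;
        while ((host = slurm_hostlist_shift(hl))) {
            printf("%s\n", host);  /* row1rack0slot0, row1rack0slot1, ... */
            free(host);
        }
        slurm_hostlist_destroy(hl);
        return 0;
    }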
-
Danny Auble authored
-
- 24 Mar, 2014 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
The previous logic would typically do a list search to find job array elements. This commit adds two hash tables for job arrays: the first is keyed on the "base" job ID, which is common to all tasks; the second is keyed on the sum of the "base" job ID and the task ID within the array. This will substantially improve performance for handling dependencies with job arrays.
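A simplified sketch of the two-table scheme, with illustrative names and sizing rather than SLURM's actual structures:

    #define HASH_TABLE_SIZE 1000

    struct job_record {
        unsigned int array_job_id;   /* "base" job ID shared by all tasks */
        unsigned int array_task_id;  /* this task's index in the array */
        struct job_record *next;     /* collision chain */
    };

    /* Table 1: keyed on the base job ID; finds the array's tasks. */
    static struct job_record *hash_by_base[HASH_TABLE_SIZE];
    /* Table 2: keyed on base job ID + task ID; finds one task. */
    static struct job_record *hash_by_task[HASH_TABLE_SIZE];

    static inline unsigned int base_index(unsigned int array_job_id)
    {
        return array_job_id % HASH_TABLE_SIZE;
    }

    static inline unsigned int task_index(unsigned int array_job_id,
                                          unsigned int array_task_id)
    {
        return (array_job_id + array_task_id) % HASH_TABLE_SIZE;
    }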
-
Morris Jette authored
When slurmctld restarted, it would not recover dependencies on job array elements and would just discard the dependency. This corrects the parsing problem so the dependency is recovered. The old code would print a message like this and discard it: slurmctld: error: Invalid dependencies discarded for job 51: afterany:47_*
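A sketch of parsing such a token; the function name and return convention are illustrative only. "afterany:47_*" means "after any task of job array 47 completes", and the fix amounts to accepting the "_*" suffix instead of rejecting the whole specification.

    #include <stdlib.h>
    #include <string.h>

    /* Parse "afterany:<jobid>[_*]". Returns 0 on success, with
     * *whole_array set when the dependency names every task of
     * the array (the "_*" suffix). */
    static int parse_afterany(const char *tok, long *job_id, int *whole_array)
    {
        if (strncmp(tok, "afterany:", 9) != 0)
            return -1;
        char *end;
        *job_id = strtol(tok + 9, &end, 10);
        if (end == tok + 9)
            return -1;  /* no job ID present */
        *whole_array = (end[0] == '_' && end[1] == '*');
        return 0;
    }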
-
- 22 Mar, 2014 1 commit
-
-
Morris Jette authored
When adding or removing columns for most data types (jobs, partitions, nodes, etc.), an abort is generated on some system types. This appears to be because, when the displayed columns change, the address of "model" changes on some systems while on others it does not (like my laptops). This fix explicitly sets last_model to NULL when the columns are changed rather than relying upon the data structure's address to change.
-
- 21 Mar, 2014 1 commit
-
-
Danny Auble authored
be set up for 1-node jobs. Here are some of the reasons from IBM:
1. PE expects it.
2. For failover: if there was some challenge or difficulty with the shared-memory method of data transfer, the protocol stack might want to go through the adapter instead.
3. For flexibility: the protocol stack might want to be able to transfer data using some variable combination of shared memory and adapter-based communication.
4. Possibly most important, for overall performance: bandwidth or efficiency (BW per CPU cycle) might be better using the adapter resources. (An obvious case is large messages: it might require far fewer CPU cycles to program the DMA engines on the adapter to move data between tasks than to depend on the CPU to move the data with loads and stores or page re-mapping -- and a DMA engine might actually move the data more quickly if it's well integrated with the memory system, as it is in the P775 case.)
-