- 31 Mar, 2014 1 commit
-
-
Marcin Stolarek authored
Prevent preemption of jobs in partition where PreemptMode=off
-
- 26 Mar, 2014 1 commit
-
-
David Bigagli authored
processes.
-
- 25 Mar, 2014 1 commit
-
-
Danny Auble authored
-
- 24 Mar, 2014 1 commit
-
-
Morris Jette authored
When slurmctld restarted, it would not recover dependencies on job array elements and would just discard the dependency. This corrects the parsing problem to recover the dependency. The old code would print a message like this and discard it: slurmctld: error: Invalid dependencies discarded for job 51: afterany:47_*
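For illustration only, the kind of parsing this commit describes can be sketched as below. This is not the slurmctld code (which is written in C); the function name `parse_dependency`, the regular expression, and the subset of dependency types are assumptions made for the example. The point is that a term such as afterany:47_* has to be accepted as "every element of job array 47" rather than rejected as invalid.

```python
import re

# Hypothetical pattern for one dependency term such as "afterany:47_*".
# The list of type names is a subset chosen for illustration.
_DEP_RE = re.compile(r"^(after|afterany|afterok|afternotok):(\d+)(?:_(\d+|\*))?$")


def parse_dependency(term):
    """Return (dep_type, job_id, array_index) or raise ValueError.

    array_index is None for a plain job, an int for a single array
    element, and the string "*" when the term names the whole array.
    """
    match = _DEP_RE.match(term)
    if match is None:
        raise ValueError(f"Invalid dependency: {term}")
    dep_type, job_id, index = match.groups()
    if index is not None and index != "*":
        index = int(index)
    return dep_type, int(job_id), index
```

With this sketch, parse_dependency("afterany:47_*") yields the type, the job ID 47 and the wildcard marker instead of raising an error.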
-
- 21 Mar, 2014 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
be set up for 1 node jobs. Here are some of the reasons from IBM... 1. PE expects it. 2. For failover, if there was some challenge or difficulty with the shared-memory method of data transfer, the protocol stack might want to go through the adapter instead. 3. For flexibility, the protocol stack might want to be able to transfer data using some variable combination of shared memory and adapter-based communication, and 4. Possibly most important, for overall performance, it might be that bandwidth or efficiency (BW per CPU cycles) might be better using the adapter resources. (An obvious case is for large messages, it might require a lot fewer CPU cycles to program the DMA engines on the adapter to move data between tasks, rather than depend on the CPU to move the data with loads and stores, or page re-mapping -- and a DMA engine might actually move the data more quickly, if it's well integrated with the memory system, as it is in the P775 case.)
-
- 20 Mar, 2014 2 commits
-
-
Danny Auble authored
than you really have.
-
Danny Auble authored
doesn't get chopped off.
-
- 19 Mar, 2014 2 commits
-
-
David Bigagli authored
-
Gennaro Oliva authored
a minus sign for options was intended.
-
- 18 Mar, 2014 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
Some of these were resulting in the state of a job not being updated correctly to tools like sview.
-
Danny Auble authored
in waiting reason ReqNodeNotAvail.
-
- 17 Mar, 2014 1 commit
-
-
Danny Auble authored
-
- 15 Mar, 2014 2 commits
-
-
Morris Jette authored
-
Morris Jette authored
Add support for job array options in the qsub command, in #PBS options for sbatch scripts and set the appropriate environment variables in the spank_pbs plugin (PBS_ARRAY_ID and PBS_ARRAY_INDEX). Note that Torque uses the "-t" option and PBS Pro uses the "-J" option.
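As a rough sketch of the option translation described here, the snippet below scans a script for #PBS lines and maps either the Torque "-t" or the PBS Pro "-J" option to an sbatch-style "--array" option. It is not the code from the qsub wrapper or the spank_pbs plugin; the function name `pbs_array_option` is hypothetical, the range specification is assumed to pass through unchanged, and setting PBS_ARRAY_ID and PBS_ARRAY_INDEX is not covered.

```python
import shlex


def pbs_array_option(script_text):
    """Return an sbatch-style "--array=<spec>" string, or None if absent.

    Only lines starting with "#PBS" are inspected. Both "-t" (Torque)
    and "-J" (PBS Pro) are treated as the job array option, and the
    value is copied unchanged (an assumption made for this sketch).
    """
    for line in script_text.splitlines():
        if not line.startswith("#PBS"):
            continue
        tokens = shlex.split(line[len("#PBS"):])
        for i, token in enumerate(tokens):
            if token in ("-t", "-J") and i + 1 < len(tokens):
                return "--array=" + tokens[i + 1]
    return None
```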
-
- 14 Mar, 2014 3 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
slurm.conf. Rebooting daemons after adding nodes to the slurm.conf is highly recommended.
-
- 12 Mar, 2014 1 commit
-
-
Danny Auble authored
-
- 11 Mar, 2014 6 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
Rather than continuously retrying a step create for suspended jobs, add a sleep with exponential backoff
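A minimal sketch of the retry pattern named in this commit, assuming a doubling delay with an upper bound. The function name `retry_with_backoff`, the default delays and the retry limit are illustrative choices, not values taken from the Slurm source.

```python
import time


def retry_with_backoff(attempt, initial_delay=1.0, max_delay=60.0,
                       max_tries=8, sleep=time.sleep):
    """Call attempt() until it returns True, sleeping longer each time.

    The delay doubles after every failed attempt and is capped at
    max_delay. Returns False if all max_tries attempts fail.
    """
    delay = initial_delay
    for tries_left in range(max_tries - 1, -1, -1):
        if attempt():
            return True
        if tries_left:
            # Back off before the next try instead of retrying at once.
            sleep(delay)
            delay = min(delay * 2, max_delay)
    return False
```

The `sleep` parameter exists only so the waiting can be replaced in a test; a caller would normally leave it at its default.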
-
Morris Jette authored
If a job is suspended, log the step create failure using debug rather than info in slurmctld
-
- 10 Mar, 2014 1 commit
-
-
Morris Jette authored
The test for NRT_NULL_MAGIC failed to capture some problems if the pointer to the structure was NULL. This is an amendment to commit 2a55aa0b
-
- 08 Mar, 2014 2 commits
-
-
Danny Auble authored
-
Danny Auble authored
Perhaps should also look into doing this for nodeinfo and libstate
-
- 07 Mar, 2014 6 commits
-
-
Danny Auble authored
-
Danny Auble authored
this would cause pmd's to hang.
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
- 06 Mar, 2014 4 commits
-
-
David Bigagli authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
code.
-