- 25 Mar, 2014 18 commits
-
-
Danny Auble authored
This reverts commit 9a2d863c.
-
Danny Auble authored
-
Don Lipari authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Morris Jette authored
Update the configuration file builder web page for Slurm version 14.03, mostly to support native Cray systems.
-
Hongjia Cao authored
Fix the problem where an allocated but drained node is shown as "mixed" by sinfo.
-
Morris Jette authored
Add test for triple bracketed expression
-
Morris Jette authored
Modify hostlist expressions to accept more than two numeric ranges (e.g. "row[1-3]rack[0-8]slot[0-63]")
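Slurm's hostlist code is written in C; as an illustration only, the expansion that this commit enables (more than two bracketed numeric ranges in one expression) can be sketched in Python. The function name `expand_hostlist` and its structure are hypothetical, not Slurm's actual API:

```python
# Illustrative sketch (not Slurm's implementation) of expanding a
# hostlist expression containing several numeric ranges, e.g.
# "row[1-3]rack[0-8]slot[0-63]".
import itertools
import re

def expand_hostlist(expr):
    """Expand every "[a-b,c]" range in expr and return all host names."""
    # re.split with one capture group alternates literal text (even
    # indices) with bracketed range bodies (odd indices).
    parts = re.split(r"\[([^\]]+)\]", expr)
    choices = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            choices.append([part])          # literal text between brackets
        else:
            values = []
            for item in part.split(","):
                if "-" in item:
                    lo, hi = item.split("-")
                    values.extend(str(v) for v in range(int(lo), int(hi) + 1))
                else:
                    values.append(item)
            choices.append(values)
    # The Cartesian product over all groups yields each concrete name.
    return ["".join(combo) for combo in itertools.product(*choices)]

print(expand_hostlist("a[1-2]b[1,2]"))
# → ['a1b1', 'a1b2', 'a2b1', 'a2b2']
```

With three ranges, `expand_hostlist("row[1-3]rack[0-8]slot[0-63]")` yields 3 × 9 × 64 = 1728 host names.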
-
Danny Auble authored
-
jette authored
See bug 662
-
Morris Jette authored
If a hostlist expression contained a separator after two open brackets and one close bracket, this resulted in bad parsing.
Before:
$ scontrol show hostnames a[1-2]b[1,2]
a[1-2]b[1]
2]
After:
$ scontrol show hostnames a[1-2]b[1,2]
a1b1
a1b2
a2b1
a2b2
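The core of a fix like this is that a comma only separates hostlist expressions when it appears at bracket depth zero; a comma inside "[...]" belongs to a range. A minimal Python sketch of that idea (illustrative only, not Slurm's C parser):

```python
# Hypothetical sketch: split a hostlist string at top-level commas only,
# tracking bracket depth so commas inside "[...]" are left alone.
def split_hostlist(expr):
    """Split "a[1-2]b[1,2],c[3,4]" into its top-level expressions."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(expr):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(expr[start:i])     # separator between expressions
            start = i + 1
    parts.append(expr[start:])
    return parts

print(split_hostlist("a[1-2]b[1,2],c[3,4]"))
# → ['a[1-2]b[1,2]', 'c[3,4]']
```

A naive `expr.split(",")` would instead cut inside the second bracket pair, producing the kind of mangled output shown in the "Before" example above.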
-
Morris Jette authored
-
- 24 Mar, 2014 17 commits
-
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
-
Danny Auble authored
able to run the exact number of cpu minutes in the limit.
-
Morris Jette authored
When poe is invoked (under srun) as user root, it generates a cryptic error message. I've added a clear error message describing the problem:
error: POE will not run as user root
Rather than just:
ERROR: 0031-620 pm_SSM_write failed in sending the user/environment for taskid 0
-
Morris Jette authored
Previous logic would typically do list search to find job array elements. This commit adds two hash tables for job arrays. The first is based upon the "base" job ID which is common to all tasks. The second hash table is based upon the sum of the "base" job ID plus the task ID in the array. This will substantially improve performance for handling dependencies with job arrays.
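slurmctld is written in C; purely as an illustration of the two-hash-table scheme described above, here is a Python sketch. The class and method names (`JobArrayIndex`, `add_task`, etc.) are hypothetical, not slurmctld's:

```python
# Illustrative sketch of indexing job array tasks two ways: by the
# "base" job ID shared by all tasks, and by base ID + task ID, so both
# whole-array and single-task lookups avoid a list search.
class JobArrayIndex:
    def __init__(self):
        self.by_base_id = {}   # base job ID -> list of all array tasks
        self.by_job_id = {}    # base ID + task ID -> one task record

    def add_task(self, base_id, task_id, record):
        self.by_base_id.setdefault(base_id, []).append(record)
        self.by_job_id[base_id + task_id] = record

    def tasks_of(self, base_id):
        """Every element of one array, e.g. for an afterany:47_* test."""
        return self.by_base_id.get(base_id, [])

    def lookup(self, base_id, task_id):
        """One element, e.g. dependency afterany:47_3, in O(1)."""
        return self.by_job_id.get(base_id + task_id)

idx = JobArrayIndex()
idx.add_task(47, 0, "47_0")
idx.add_task(47, 3, "47_3")
print(idx.lookup(47, 3))   # → 47_3
```

The point of keeping both tables is that dependency checks come in both flavors: "after every task of array 47" hits the first table, "after task 3 of array 47" hits the second.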
-
Morris Jette authored
-
Morris Jette authored
When slurmctld restarted, it would not recover dependencies on job array elements and would simply discard the dependency. This corrects the parsing problem so the dependency is recovered. The old code would print a message like this and discard the dependency:
slurmctld: error: Invalid dependencies discarded for job 51: afterany:47_*
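A dependency spec like "afterany:47_*" names either one task of a job array ("47_3") or every task ("47_*"). As a rough Python illustration of parsing that form (the function and return shape are hypothetical, not slurmctld's internals):

```python
# Illustrative sketch of parsing a job-array dependency string such as
# "afterany:47_*" (whole array) or "afterany:47_3" (one task).
def parse_dependency(spec):
    dep_type, _, job = spec.partition(":")
    if "_" in job:
        base, _, task = job.partition("_")
        # "*" selects every task of the array; represent that as None.
        task = None if task == "*" else int(task)
        return (dep_type, int(base), task)
    return (dep_type, int(job), None)      # plain (non-array) job ID

print(parse_dependency("afterany:47_*"))   # → ('afterany', 47, None)
print(parse_dependency("afterany:47_3"))   # → ('afterany', 47, 3)
```

The bug described above amounts to the restart path rejecting the "47_*" form that the running daemon had accepted, so the recovered state failed this kind of parse.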
-
- 22 Mar, 2014 1 commit
-
-
Morris Jette authored
When adding or removing columns for most data types (jobs, partitions, nodes, etc.), an abort is generated on some system types. This appears to be because, when the displayed columns change, the address of "model" changes on some systems but not on others (like my laptops). This fix explicitly sets last_model to NULL when the columns are changed, rather than relying upon the data structure's address changing.
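The underlying idea is general: a cache keyed on an object's address is only valid if the address is guaranteed to change whenever the object's shape changes, so it is safer to invalidate explicitly. A generic Python sketch of that pattern (not sview's GTK code; all names are illustrative):

```python
# Illustrative sketch: invalidate a cached model reference explicitly on
# column changes instead of relying on the model's address changing.
class View:
    def __init__(self):
        self.last_model = None
        self.columns = []

    def set_columns(self, columns):
        self.columns = columns
        self.last_model = None      # explicit reset; never trust id() reuse

    def render(self, model):
        """Return True when the display must be rebuilt."""
        rebuilt = model is not self.last_model
        self.last_model = model
        return rebuilt
```

With the explicit reset, a column change forces a rebuild even if the allocator hands back a model at the same address, which is the failure mode the commit describes.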
-
- 21 Mar, 2014 4 commits
-
-
Danny Auble authored
-
Danny Auble authored
be set up for 1-node jobs. Here are some of the reasons from IBM:
1. PE expects it.
2. For failover: if there were some challenge or difficulty with the shared-memory method of data transfer, the protocol stack might want to go through the adapter instead.
3. For flexibility: the protocol stack might want to be able to transfer data using some variable combination of shared memory and adapter-based communication.
4. Possibly most important, for overall performance: bandwidth or efficiency (BW per CPU cycle) might be better using the adapter resources. (An obvious case is large messages: it might require far fewer CPU cycles to program the DMA engines on the adapter to move data between tasks than to depend on the CPU to move the data with loads and stores or page re-mapping, and a DMA engine might actually move the data more quickly if it is well integrated with the memory system, as it is in the P775 case.)
-
Danny Auble authored
-
Danny Auble authored
-