- 14 Apr, 2015 2 commits
-
-
David Bigagli authored
-
Morris Jette authored
job_submit/lua: Enable reading and writing job environment variables. For example: if (job_desc.environment.LANGUAGE == "en_US") then ...
-
- 13 Apr, 2015 1 commit
-
-
Morris Jette authored
The error was triggered when the logic that collects accounting information had started but not yet completed at the time of testing whether to start a node ping RPC. In other words, the logic wasn't testing for the completion of one RPC against the start of that same RPC, but against the start of a different RPC. This change moves all of the timeout logic into the ping_nodes.c module, where we can make sure that the timing of different RPCs does not get confused. bug 1190
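As an illustration only, the idea of timing each RPC against its own start can be sketched as below. This is not the code in ping_nodes.c; the `rpc_kind` enum, the `rpc_timer` struct and all function names are hypothetical and chosen for this sketch.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical RPC kinds; the names are illustrative, not Slurm's. */
enum rpc_kind { RPC_PING, RPC_ACCT, RPC_KIND_COUNT };

/* One start time and one pending flag per RPC kind, so the timing of
 * one kind can never be compared against the start of another. */
struct rpc_timer {
    time_t started[RPC_KIND_COUNT];
    bool pending[RPC_KIND_COUNT];
};

void rpc_timer_init(struct rpc_timer *t)
{
    int k;
    for (k = 0; k < RPC_KIND_COUNT; k++) {
        t->started[k] = 0;
        t->pending[k] = false;
    }
}

void rpc_start(struct rpc_timer *t, enum rpc_kind k, time_t now)
{
    t->started[k] = now;
    t->pending[k] = true;
}

void rpc_done(struct rpc_timer *t, enum rpc_kind k)
{
    t->pending[k] = false;
}

/* True only if an RPC of this same kind is still pending past the limit. */
bool rpc_timed_out(const struct rpc_timer *t, enum rpc_kind k,
                   time_t now, time_t limit)
{
    return t->pending[k] && (now - t->started[k]) > limit;
}
```

With this layout, a slow accounting RPC cannot make a ping RPC look timed out, because each check reads only the entry for its own kind.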
-
- 11 Apr, 2015 3 commits
-
-
Morris Jette authored
slurmd was logging: "Error reading step 2031886.0 memory limits" (with various job/step IDs) because it was treating SLURM_SUCCESS, rather than SLURM_ERROR, as an error. bug 1589
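A minimal sketch of this kind of inverted check follows. It is not the slurmd source: the two helper functions are hypothetical, and the return codes are redefined locally only so the example compiles on its own.

```c
#include <stdbool.h>

/* Local stand-ins for Slurm's return codes, defined here so the
 * sketch is self-contained. */
#define SLURM_SUCCESS 0
#define SLURM_ERROR  (-1)

/* Hypothetical inverted check: reports the successful result as a failure. */
bool is_failure_inverted(int rc)
{
    return rc == SLURM_SUCCESS;
}

/* Hypothetical corrected check: only SLURM_ERROR counts as a failure. */
bool is_failure_fixed(int rc)
{
    return rc == SLURM_ERROR;
}
```

With the inverted form, every successful read would produce the error message, which matches the symptom of the message appearing for many different job/step IDs.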
-
Danny Auble authored
-
Morris Jette authored
Disable changes to GRES count while jobs are running on the node. Previous logic could clear GRES count data and then generate a counter underflow when attempting to deallocate the job's resources (including future job terminations when computing when and where pending jobs will start as part of backfill scheduling logic). bug 1589
-
- 09 Apr, 2015 2 commits
-
-
Danny Auble authored
if many steps/jobs finish at once.
-
Morris Jette authored
* Add "--thread-spec" option to salloc, sbatch and srun commands. This is the count of threads reserved for system use per node.
* Add ability for scontrol to get/set job ThreadSpec
* sview: Add job ThreadSpec field to get/set
* Modify select/cons_res to manage job allocations while leaving specialized threads for system use.
* core_spec plugins: Fix system task binding logic and logging
* Modify squeue output for thread_spec values
* task/affinity and cgroup: Enhanced task binding logic
-
- 08 Apr, 2015 1 commit
-
-
David Bigagli authored
-
- 07 Apr, 2015 8 commits
-
-
Brian Christiansen authored
Bug 1587
-
Danny Auble authored
-
Morris Jette authored
switch/cray: If CR_PACK_NODES is configured, then set the environment variable "PMI_CRAY_NO_SMP_ENV=1" bug 1585
-
Danny Auble authored
-
Danny Auble authored
gating RPCs. Signed-off-by: Danny Auble <da@schedmd.com>
-
Danny Auble authored
-
Danny Auble authored
added back into the system.
-
Danny Auble authored
-
- 06 Apr, 2015 3 commits
-
-
Morris Jette authored
* Add "TopologyParam" configuration parameter. Optional value of "dragonfly" is supported.
* select/linear: Add dragonfly topology support
* select/cons_res: Add dragonfly topology support
-
Brian Christiansen authored
Bug 1578
-
David Bigagli authored
-
- 03 Apr, 2015 1 commit
-
-
Morris Jette authored
-
- 02 Apr, 2015 3 commits
-
-
David Bigagli authored
-
David Bigagli authored
-
Samuel Senoner authored
-
- 01 Apr, 2015 6 commits
-
-
Brian Christiansen authored
Bug 1550
-
David Bigagli authored
-
David Bigagli authored
their own jobs.
-
David Bigagli authored
-
Brian Christiansen authored
Bug 1469
-
David Bigagli authored
-
- 31 Mar, 2015 4 commits
-
-
Morris Jette authored
SPANK naming changes: For environment variables set using the spank_job_control_setenv() function, the values were available in the slurm_spank_job_prolog() and slurm_spank_job_epilog() functions using getenv where the name was given a prefix of "SPANK_". That prefix has been removed for consistency with the environment variables available in the Prolog and Epilog scripts. bug 1570
-
Morris Jette authored
Requests by a normal user to reset a job's priority (even to lower it) will result in an error saying to change the job's nice value instead.
-
Morris Jette authored
A non-administrator change to job priority will not be persistent except for holding the job. Users wanting to change a job's priority on a persistent basis should reset its "nice" value. Without this change a user might lower a job's priority, but Slurm would then not lower it further based upon fair-share changes.
-
Morris Jette authored
Increase the MAX_PACK_MEM_LEN define to 1GB to avoid PMI2 failure when fencing with a large number of ranks.
-
- 30 Mar, 2015 1 commit
-
-
David Bigagli authored
-
- 27 Mar, 2015 3 commits
-
-
Morris Jette authored
Verify that all plugin version numbers are identical to the component attempting to load them. Without this verification, the plugin can reference Slurm functions in the caller which differ (e.g. the underlying function's arguments could have changed between Slurm versions). NOTE: All plugins (except SPANK) must be built against the identical version of Slurm in order to be used by any Slurm command or daemon. This should eliminate some very difficult to diagnose problems due to use of old plugins.
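The check described above can be sketched roughly as follows. This is an assumption-laden illustration, not Slurm's plugin loader: `make_version`, its bit layout and `plugin_version_ok` are hypothetical names invented for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packing of major.minor.micro into a single number. */
uint32_t make_version(uint32_t major, uint32_t minor, uint32_t micro)
{
    return (major << 16) | (minor << 8) | micro;
}

/* Accept a plugin only when its version is identical to the caller's.
 * A strict equality test (rather than "at least as new") reflects the
 * requirement that plugins be built against the identical version. */
bool plugin_version_ok(uint32_t caller_version, uint32_t plugin_version)
{
    return caller_version == plugin_version;
}
```

Under this sketch a plugin built for a different release is rejected at load time instead of failing later in a hard-to-diagnose way.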
-
Brian Christiansen authored
Bug 1469. Return values from void functions are unknown and were causing list_for_each to short-circuit processing of the job list.
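To show why the callback's return value matters, here is a small sketch with a hypothetical iterator, `for_each_item`, standing in for list_for_each. It assumes the convention that a negative return from the callback stops the iteration; the array-based list and all names are made up for this example.

```c
#include <stddef.h>

/* Callback type: a negative return value asks the iterator to stop. */
typedef int (*for_each_fn)(void *item, void *arg);

/* Hypothetical iterator over an array of items. Returns the number of
 * items on which the callback was invoked. */
int for_each_item(void **items, size_t n, for_each_fn f, void *arg)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (f(items[i], arg) < 0)
            return (int)(i + 1);    /* stopped early at this item */
    }
    return (int)n;
}

/* Well-behaved callback: returns an explicit non-negative value, so
 * every item is processed. */
int count_item(void *item, void *arg)
{
    (void)item;
    (*(int *)arg)++;
    return 0;
}

/* Callback that deliberately stops the iteration after the first item. */
int stop_at_first(void *item, void *arg)
{
    (void)item;
    (*(int *)arg)++;
    return -1;
}
```

A callback declared `void` but used through an `int`-returning function pointer leaves the value seen by the iterator undefined, so it may look negative and end the loop early; returning an explicit value as in `count_item` avoids that.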
-
David Bigagli authored
-
- 26 Mar, 2015 2 commits
-
-
David Bigagli authored
-
Morris Jette authored
Fix for misleading job submit failure errors sent to users. The previous error could report why specific nodes could not be used (e.g. too little memory) even though other nodes were usable but had been rejected for a different reason. bug 1537
-