- 14 Aug, 2014 5 commits
-
-
Morris Jette authored
Job array dependency based upon state is now dependent upon the state of the array as a whole (e.g. afterok requires ALL tasks to complete successfully, afternotok is true if ANY task does not complete successfully, and after requires all tasks to at least be started).
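The whole-array dependency semantics above can be sketched as follows. This is an illustrative model only, not Slurm's actual code: the `struct task` record and the `dep_*` helper names are hypothetical, and the real Slurm data structures differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-task record; the real Slurm structures differ. */
struct task {
	bool started;	/* task has begun execution */
	bool finished;	/* task has terminated */
	bool exit_ok;	/* task completed successfully */
};

/* afterok: satisfied only if ALL tasks completed successfully. */
static bool dep_afterok(const struct task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!t[i].finished || !t[i].exit_ok)
			return false;
	return true;
}

/* afternotok: satisfied if ANY task did not complete successfully. */
static bool dep_afternotok(const struct task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (t[i].finished && !t[i].exit_ok)
			return true;
	return false;
}

/* after: satisfied once ALL tasks have at least started. */
static bool dep_after(const struct task *t, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!t[i].started)
			return false;
	return true;
}
```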
-
Morris Jette authored
Add functions to test the state of a job array, for example test_job_array_pending() returns true if ANY task of a job array is pending and test_job_array_completed() returns true if ALL tasks of a job array are completed.
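The ANY/ALL distinction between those two tests can be sketched like this. A minimal model only: the `task_state_t` enum is hypothetical (the real Slurm state codes differ), and only the function names come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task-state codes; the real Slurm enums differ. */
typedef enum { TASK_PENDING, TASK_RUNNING, TASK_COMPLETED } task_state_t;

/* True if ANY task of the array is pending. */
static bool test_job_array_pending(const task_state_t *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i] == TASK_PENDING)
			return true;
	return false;
}

/* True only if ALL tasks of the array are completed. */
static bool test_job_array_completed(const task_state_t *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tasks[i] != TASK_COMPLETED)
			return false;
	return true;
}
```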
-
David Bigagli authored
-
David Bigagli authored
-
Jacob Jenson authored
-
- 13 Aug, 2014 11 commits
-
-
David Bigagli authored
spaces.
-
Yannis Georgiou authored
-
Morris Jette authored
-
Morris Jette authored
Count of running array tasks is reset on reconfig.
-
David Bigagli authored
-
Morris Jette authored
-
Morris Jette authored
Recommend use of job arrays and multiple job steps per job.
-
Morris Jette authored
Cray needs task_g_post_step() to be called before resetting the CPU frequency. We also need to reset the CPU frequency before notifying srun of task completion. Logic reorganized to satisfy this requirement; see bug 1011.
-
Morris Jette authored
sched/backfill - Set the expected start time of a job submitted to multiple partitions to the earliest start time on any of those partitions. Previous logic would set the time to that of the last partition tested.
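The fix amounts to taking a minimum over partitions instead of keeping the last value tested. A sketch only, not Slurm's backfill code; the `earliest_start` helper and its parameters are hypothetical.

```c
#include <stddef.h>
#include <time.h>

/* Sketch: the job's expected start time is the earliest start time
 * found on any of the partitions it was submitted to, rather than
 * whichever partition happened to be tested last. */
static time_t earliest_start(const time_t *part_start, size_t nparts)
{
	time_t best = part_start[0];
	for (size_t i = 1; i < nparts; i++)
		if (part_start[i] < best)
			best = part_start[i];
	return best;
}
```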
-
Morris Jette authored
This issue was addressed differently, so the original 1-second sleep can be restored for easier performance comparison with Slurm version 14.03.
-
Morris Jette authored
-
- 12 Aug, 2014 14 commits
-
-
David Bigagli authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Previously the job would only run in the first listed partition.
-
Morris Jette authored
Fix gang scheduling for jobs submitted to multiple partitions. Previous logic assumed the job's "partition" field contained a single partition name, that of the partition in which the job is running. That was recently changed in order to support requeued jobs, which we want to be runnable in all of their valid partitions.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
Backport of commit 9b4f3634.
-
Morris Jette authored
Previously the --partition option would not work with the new --priority option for jobs submitted to multiple partitions.
-
- 11 Aug, 2014 3 commits
-
-
David Bigagli authored
-
Morris Jette authored
-
Morris Jette authored
Added the squeue -P/--priority option, which displays pending jobs in the same order as used by the Slurm scheduler even when jobs are submitted to multiple partitions (a job is reported once per usable partition).
-
- 08 Aug, 2014 7 commits
-
-
Morris Jette authored
Modify job array logic to properly support the MaxJobCount configuration parameter with the new job array data structure.
-
Morris Jette authored
This part handles the decrement portion of the logic.
-
Thomas Cadeaux authored
-
Morris Jette authored
Rename the function local variable "job_count" in order to avoid confusion with the global variable by the same name. No change to logic.
-
Morris Jette authored
-
Morris Jette authored
-
Morris Jette authored
NFS file system delays were causing the test for file existence to periodically fail. Adding an "ls" call syncs the file system and fixes the problem.
-