1. 09 Sep, 2014 1 commit
  2. 08 Sep, 2014 2 commits
  3. 05 Sep, 2014 3 commits
  4. 04 Sep, 2014 8 commits
  5. 03 Sep, 2014 8 commits
    • David Bigagli authored · dc7a1fca
      When creating job arrays, the job specification files for each element
      are hard links to the first element's specification files. If the
      controller fails to make the links, the files are copied instead.
    • Danny Auble authored · 2f8f1ebd
      BLUEGENE - Fix backfill issue with backfilling jobs on blocks already
      reserved for higher priority jobs.
    • Danny Auble authored · e79a3c16
    • Andrew Elwell authored · 656158d0
      Typo correction: s/apbasil/apkill/
    • Nathan Yee authored · ab1c065a
      test suite bug fixes
      
      I just ran the test suite for slurm 14.04.7, and have a few suggestions
      and bugfixes:
      
      Test 1.35 fails on our system (probably because we limit memory with
      cgroups).  Changing job_mem_opt from "--mem-per-cpu=64" to
      "--mem-per-cpu=192" in line 61 fixes the problem for us.
      
      Test 1.84 fails to recognise node names like "something1-2", ending up
      with node names "something1" instead.  Changing NodeName=(\w+) to
      NodeName=([^\s]+) fixes the problem.
      
      Test 1.97 reports FAILURE when it discovers that SelectTypeParameters is
      not CR_PACK_NODES.  Having "exit 0" instead of "exit 1" in line 50 is
      perhaps preferable.
      
      Test 2.18 fails because the variable $partition never gets set, so no
      idle nodes are found in line 215.  Setting $partition in globals.local
      helps, but should not be needed, IMO.  There is a function
      "default_partition" in globals that could perhaps be used.  The same
      applies to test 2.19.
      
      Test 12.2 fails on our system because the jobs get killed due to memory
      limit.  Increasing the "slack" in job_mem_limit from 4 to 10 in line 269
      fixes the problem for us.
      
      Tests 21.30, 21.31 and 21.32 fail when run as a non-privileged user.
      Perhaps they should test for it and exit with a warning instead, like
      many other tests.
      
      Test 22.1 fails on our system because the time zone is different from
      where the test was written.  The problem is that
      
      set midnight 1201766400
      
      is only correct in one time zone (and unfortunately for us, not in
      ours :).  Perhaps one could use the GNU date command to get the correct
      seconds-since-epoch regardless of time zone.  Something like
      
      date +%s --date=2008-01-31
      
      should do it.  Unfortunately, I don't know enough Expect (tcl?) to
      suggest how to implement that.
      
      --
      Regards,
      Bjørn-Helge Mevik, dr. scient,
      Department for Research Computing, University of Oslo
    • Danny Auble authored · 137c53ef
      Make return code INFINITE instead of 1 to mean failed instead of canceled,
      which in this case is what has happened since the launch failed.
    • Danny Auble authored · f9dca889
      Remove repeated batch complete if the batch directory can't be made,
      since the slurmd will send the same message.
    • Danny Auble authored · 9015479d
  6. 30 Aug, 2014 1 commit
  7. 29 Aug, 2014 1 commit
  8. 28 Aug, 2014 6 commits
  9. 27 Aug, 2014 2 commits
  10. 26 Aug, 2014 5 commits
  11. 25 Aug, 2014 3 commits
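
Commit dc7a1fca describes job array specification files being hard-linked to the first element's files, with a copy as the fallback when linking fails. The pattern can be sketched in Python as below. This is only an illustration, not Slurm's controller code (which is written in C); the function name `link_or_copy` and the file names are hypothetical.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dst):
    """Make dst a hard link to src; if linking fails, copy the file instead.

    Returns "link" or "copy" so the caller can tell which path was taken.
    Any OSError from os.link (for example, the filesystem does not support
    hard links, or dst already exists) triggers the copy fallback.
    """
    try:
        os.link(src, dst)
        return "link"
    except OSError:
        shutil.copy2(src, dst)
        return "copy"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file names standing in for per-element spec files.
        first = os.path.join(tmp, "element_0.script")
        with open(first, "w") as fh:
            fh.write("#!/bin/sh\necho hello\n")
        for index in (1, 2):
            target = os.path.join(tmp, "element_%d.script" % index)
            print(index, link_or_copy(first, target))
```

Hard links save disk space and creation time when every array element shares the same script, while the copy keeps job submission working on filesystems where linking is not possible.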
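
The Test 1.84 fix in commit ab1c065a replaces `NodeName=(\w+)` with `NodeName=([^\s]+)`. The difference is that `\w` does not match a hyphen, so the capture stops early on names like "something1-2". The short Python sketch below shows both patterns on a sample line. The test itself is written in Expect, and the sample line and the helper `node_name` are assumptions made for illustration.

```python
import re

# The two patterns quoted in the commit message.
OLD_PATTERN = re.compile(r"NodeName=(\w+)")
NEW_PATTERN = re.compile(r"NodeName=([^\s]+)")


def node_name(pattern, text):
    """Return the captured node name, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical line in the style of node information output.
    sample = "NodeName=something1-2 Arch=x86_64"
    print(node_name(OLD_PATTERN, sample))  # stops at the hyphen
    print(node_name(NEW_PATTERN, sample))  # runs to the next whitespace
```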
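
The Test 22.1 report in commit ab1c065a notes that the hard-coded `set midnight 1201766400` is only correct in one time zone. That value is midnight on 2008-01-31 at UTC-8; at other offsets, local midnight falls on a different epoch second. The Python sketch below computes the value from the date instead, in the spirit of the suggested `date +%s --date=2008-01-31`. It is not the Expect change the test suite would need, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def local_midnight_epoch(year, month, day):
    """Epoch seconds for 00:00 on the given date in the machine's local zone."""
    # A naive datetime is interpreted as local time by timestamp().
    return int(datetime(year, month, day).timestamp())


def midnight_epoch_at_offset(year, month, day, utc_offset_hours):
    """Epoch seconds for 00:00 on the given date at a fixed UTC offset."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    return int(datetime(year, month, day, tzinfo=zone).timestamp())


if __name__ == "__main__":
    print(midnight_epoch_at_offset(2008, 1, 31, -8))  # 1201766400
    print(midnight_epoch_at_offset(2008, 1, 31, 1))   # 1201734000
    print(local_midnight_epoch(2008, 1, 31))          # depends on local zone
```

The fixed-offset variant is only there to show how far the value moves between zones; a test would use the local-time variant so the constant follows whatever zone the machine is in.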