  1. 17 Apr, 2013 3 commits
  2. 16 Apr, 2013 2 commits
  3. 12 Apr, 2013 3 commits
    • Danny Auble
      ca3c2fa1
    • Replaced ipmi.conf with generic acct_gather.conf file for all acct_gather plugins · c1793844
      Danny Auble authored
      Developers who want to use this should follow the model set
      forth in the acct_gather_energy_ipmi plugin.
    • gres/gpu - Fix for gres.conf file with multiple files on a single line · ee6a7066
      Morris Jette authored
      We're in the process of setting up a few GPU nodes in our cluster, and
      want to use Gres to control access to them.
      
      Currently, we have activated one node with 2 GPUs.  The gres.conf file
      on that node reads
      
      ----------------
      
      Name=gpu Count=2 File=/dev/nvidia[0-1]
      Name=localtmp Count=1800
      ----------------
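The `File=/dev/nvidia[0-1]` entry above names two device files on a single line. As a rough illustration of what that shorthand means, the Python sketch below expands a bracketed range into individual paths and numbers them by position. `expand_bracket_range` is a hypothetical helper written for this example, not Slurm's own hostlist code, and it assumes a single bracket expression containing plain integers (zero padding is not preserved).

```python
import re

def expand_bracket_range(spec):
    """Expand 'prefix[a-b,c]suffix' into a list of names (hypothetical helper)."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)", spec)
    if match is None:
        return [spec]  # no bracket expression: treat as a single file
    prefix, body, suffix = match.groups()
    names = []
    for part in body.split(","):
        lo, _, hi = part.partition("-")
        hi = hi or lo  # a lone number such as "3" is a one-element range
        for n in range(int(lo), int(hi) + 1):
            names.append(f"{prefix}{n}{suffix}")
    return names

devices = expand_bracket_range("/dev/nvidia[0-1]")
# Assumption for illustration: a device's index is its position in the list.
indices = {path: i for i, path in enumerate(devices)}
print(devices)
print(indices)
```

With two files on the line, one would expect exactly two small indices (0 and 1), which is the baseline the reported values should be compared against.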
      
      (the localtmp is just counting access to local tmp disk.)  Nodes without
      GPUs have gres.conf files like this:
      
      ----------------
      
      Name=gpu Count=0
      Name=localtmp Count=90
      ----------------
      
      slurm.conf contains the following:
      
      GresTypes=gpu,localtmp
      Nodename=DEFAULT Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 RealMemory=62976 Gres=localtmp:90 State=unknown
      [...]
      Nodename=c19-[1-16] NodeHostname=compute-19-[1-16] Weight=15848 CoresPerSocket=4 Gres=localtmp:1800,gpu:2 Feature=rack19,intel,ib
      
      Submitting a job with sbatch --gres=gpu:1 ... sets CUDA_VISIBLE_DEVICES for
      the job.  However, the values seem a bit strange:
      
      - If we submit one job with --gres=gpu:1, CUDA_VISIBLE_DEVICES gets the value 0.
      
      - If we submit two jobs with --gres=gpu:1 at the same time,
        CUDA_VISIBLE_DEVICES gets the value 0 for one job, and 1633906540 for
        the other.
      
      - If we submit one job with --gres=gpu:2, CUDA_VISIBLE_DEVICES gets the
        value 0,1633906540.
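The value 1633906540 in the report is worth a second look. Read as a 4-byte little-endian integer, it corresponds to the ASCII text `loca`, the first four characters of the `localtmp` gres name. This is consistent with string data being read as a device number, though the commit message does not state the cause, so treat it as an observation rather than a diagnosis. The short Python check below shows only the arithmetic.

```python
value = 1633906540

# Reinterpret the integer as four raw bytes, least significant byte first.
raw = value.to_bytes(4, "little")

print(hex(value))  # 0x61636f6c
print(raw)         # b'loca'
```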
  4. 11 Apr, 2013 3 commits
  5. 10 Apr, 2013 3 commits
  6. 09 Apr, 2013 5 commits
  7. 06 Apr, 2013 1 commit
  8. 02 Apr, 2013 9 commits
  9. 01 Apr, 2013 1 commit
  10. 29 Mar, 2013 2 commits
  11. 27 Mar, 2013 3 commits
  12. 26 Mar, 2013 2 commits
  13. 25 Mar, 2013 2 commits
  14. 22 Mar, 2013 1 commit
    • Select/cray - Modify build to enable direct use of libslurm library. · 7d4f145a
      Morris Jette authored
      These changes are required so that select/cray can load select/linear,
        which is a bit more complex than the other select plugin structures.
      Export plugin_context_create and plugin_context_destroy symbols from
        libslurm.so.
      Correct typo in exported hostlist_sort symbol name.
      Define some functions in select/cray to avoid undefined symbols if
        the plugin is loaded via libslurm rather than from a slurm command
        (which has all of the required symbols).
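One way to confirm that a shared library really exports a given symbol is to load it and look the name up. The Python sketch below does this with the standard `ctypes` module. `exports_symbol` is a hypothetical helper for this example, and the demonstration runs against the current process's own symbols (a Unix-like system is assumed) rather than a real libslurm.so.

```python
import ctypes

def exports_symbol(lib_path, symbol):
    """Return True if the library at lib_path exports `symbol` (hypothetical helper)."""
    try:
        lib = ctypes.CDLL(lib_path)  # lib_path=None loads the running process itself
    except OSError:
        return False  # library missing or not loadable
    # ctypes raises AttributeError for unknown symbols, so hasattr() is enough.
    return hasattr(lib, symbol)

# Demonstration on the running process; strlen comes from the C library.
print(exports_symbol(None, "strlen"))
print(exports_symbol(None, "no_such_symbol_xyz_123"))
```

Against an actual build, the call would look like `exports_symbol("/path/to/libslurm.so", "plugin_context_create")`, where the path is a placeholder for wherever the library is installed.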