1. 31 Dec, 2001 1 commit
    • a84daa1f · Jim Garlick authored
      Information on running parallel jobs on an Elan interconnect is now
      in elan.runtime.requirements.txt.  I think most of this functionality belongs
      in the partition manager and job manager.

      The switch manager should monitor the health of interconnect resources,
      so switch.manager.design.txt is pretty much empty, awaiting a generic
      description of how we will do this for any interconnect.  The Elan specific
      bits of this would go in elan.runtime.requirements.txt.

      Hope that makes sense.
  2. 21 Dec, 2001 10 commits
    • ea994281 · Moe Jette authored
      Construct initial drafts of assorted SLURM guides. - Jette
    • 8cd2ed95 · Moe Jette authored
      Add ability for job to explicitly set node order being contiguous. - Jette
    • c87471e9 · Moe Jette authored
      *** empty log message ***
    • 16a961e0 · Moe Jette authored
      New documents, demonstration logs.
      Revised design documents to explicitly decline support of shared nodes
      and permit job specification of contiguous nodes. - Jette
    • 64a67e72 · Moe Jette authored
      Controller now working !!! - Jette
    • 3172ea6f · Moe Jette authored
      OS name now contains "." between name and version number - Jette
    • c038663d · Moe Jette authored
      Controller.c builds now, debugging - Jette
    • 200e1d06 · Moe Jette authored
      Minor updates - Jette
    • f463283b · Jim Garlick authored
      Retracted comment about needing multiple capabilities for non-contiguous
      node sets.

      Added note about tying switch manager to partition manager to prefer
      allocation of contiguous node on an Elan network (interconnect dependent).
    • f53d437c · Jim Garlick authored
  3. 20 Dec, 2001 2 commits
    • 426f69eb · Moe Jette authored
      Draft 2 of job manager design. - Jette
    • 09c43048 · Moe Jette authored
      Various managers built as libraries. Initial version of Controller program. - Jette
  4. 19 Dec, 2001 6 commits
    • 5b82fb54 · Moe Jette authored
      zero out node and partition records to start (string dumps look cleaner).
      added function Dump_Part_Records. - Jette
    • 31ac4540 · Moe Jette authored
      Partition.h and Mach_Stat_Mgr.h contents merged into slurm.h.
      General clean-up. Remove string length limits. - Jette
    • 851dcbba · Moe Jette authored
      New LCM.arch.* files based upon conversion from LCM.arch.ps, which provides
      better resolution than the LCM.arch.gif file. Minor changes to job.manager.design.txt
      -Jette
    • 10cb91d0 · Moe Jette authored
      Fix some typos - Jette
    • 1247750a · Moe Jette authored
      Major early revisions to phases (now four) and contents. - Jette
    • 7ee29a80 · Chris Dunlap authored
      Gimped the updated GIF from Moe.
  5. 18 Dec, 2001 3 commits
    • 37a389e0 · Moe Jette authored
      Initial version of Partition Management Module - Jette
    • ba7b452f · Moe Jette authored
      Added new function Tally_Node_CPU(), reports how many CPUs are in a given
      list of nodes, to be used by Partition_mgr.c  - Jette
    • 86afd19b · Moe Jette authored
      Initial version of job manager document - Jette
  6. 14 Dec, 2001 5 commits
    • fb99f72c · Moe Jette authored
      Change references from "Pool" to "Partition".
      Change node names from those containing domain name to just node name as
      returned by hostname. - Jette
    • 00d4501d · Moe Jette authored
      Machine names now not fully qualified (lx01 instead of lx01.llnl.gov). - Jette
    • d28334d5 · Moe Jette authored
      Initial version of partition management infrastructure.
      Changed name of "Pool" to "Partition" in machine status infrastructure. - Jette
    • d4087674 · Moe Jette authored
      Added function to perform raw dump of node records for DPCS: Dump_Node_Records. - Jette
    • 37a0b1f3 · Moe Jette authored
      Get_Mach_Stat fully functional.
      TmpDisk field of node record made of type long. - Jette
  7. 13 Dec, 2001 1 commit
    • e11def5e · Moe Jette authored
      Get_Mach_Stat recording current machine status
      Presently reporting Name, OS, Speed, RealMemory, and VirtualMemory - Jette
  8. 12 Dec, 2001 5 commits
    • 2c321ad6 · Moe Jette authored
      Updates based upon progress to date on Phase one. - Jette
    • c60d8ade · Moe Jette authored
      General clean up. Mach_Stat_Mgr.h created from header info previously in
      Mach_Stat_Mgr.c, updated Mach_Stat_Mgr.c to enable record delete and full
      database save/restore for failure recovery. - Jette
    • 8421244e · Moe Jette authored
      Initial revision
    • 65914859 · Moe Jette authored
      Initial Machine Status Manager and Daemon design document - Jette
    • 01d4eac4 · Moe Jette authored
      Initial design document for overlord daemon - Jette
  9. 26 Nov, 2001 1 commit
  10. 03 Nov, 2001 1 commit
  11. 02 Nov, 2001 2 commits
  12. 01 Nov, 2001 1 commit
  13. 31 Oct, 2001 2 commits