1. 31 Dec, 2001 1 commit
    • a84daa1f
      Jim Garlick authored
      Information on running parallel jobs on an Elan interconnect is now
      in elan.runtime.requirements.txt.  I think most of this functionality belongs
      in the partition manager and job manager.
      The switch manager should monitor the health of interconnect resources,
      so switch.manager.design.txt is pretty much empty, awaiting a generic
      description of how we will do this for any interconnect.  The Elan specific
      bits of this would go in elan.runtime.requirements.txt.
      Hope that makes sense.
  2. 21 Dec, 2001 10 commits
    • ea994281
      Moe Jette authored
      Construct initial drafts of assorted SLURM guides. - Jette
    • 8cd2ed95
      Moe Jette authored
      Add ability for job to explicitly set node order being contiguous. - Jette
    • *** empty log message *** · c87471e9
      Moe Jette authored
    • 16a961e0
      Moe Jette authored
      New documents, demonstration logs.
      Revised design documents to explicitly decline support of shared nodes
      and permit job specification of contiguous nodes. - Jette
    • 64a67e72
      Moe Jette authored
      Controller now working !!! - Jette
    • 3172ea6f
      Moe Jette authored
      OS name now contains "." between name and version number - Jette
    • c038663d
      Moe Jette authored
      Controller.c builds now, debugging - Jette
    • 200e1d06
      Moe Jette authored
      Minor updates - Jette
    • f463283b
      Jim Garlick authored
      Retracted comment about needing multiple capabilities for non-contiguous
      node sets.
      Added note about tying switch manager to partition manager to prefer
      allocation of contiguous nodes on an Elan network (interconnect dependent).
  3. 20 Dec, 2001 2 commits
    • 426f69eb
      Moe Jette authored
      Draft 2 of job manager design. - Jette
    • 09c43048
      Moe Jette authored
      Various managers built as libraries. Initial version of Controller program. - Jette
  4. 19 Dec, 2001 6 commits
    • 5b82fb54
      Moe Jette authored
      zero out node and partition records to start (string dumps look cleaner).
      added function Dump_Part_Records. - Jette
    • 31ac4540
      Moe Jette authored
      Partition.h and Mach_Stat_Mgr.h contents merged into slurm.h.
      General clean-up. Remove string length limits. - Jette
    • 851dcbba
      Moe Jette authored
      New LCM.arch.* files based upon conversion from LCM.arch.ps, which provides
      better resolution than the LCM.arch.gif file. Minor changes to job.manager.design.txt
    • 10cb91d0
      Moe Jette authored
      Fix some typos - Jette
    • 1247750a
      Moe Jette authored
      Major early revisions to phases (now four) and contents. - Jette
    • Gimped the updated GIF from Moe. · 7ee29a80
      Chris Dunlap authored
  5. 18 Dec, 2001 3 commits
    • 37a389e0
      Moe Jette authored
      Initial version of Partition Management Module - Jette
    • ba7b452f
      Moe Jette authored
      Added new function Tally_Node_CPU(), which reports how many CPUs are in a given
      list of nodes, to be used by Partition_mgr.c - Jette
    • 86afd19b
      Moe Jette authored
      Initial version of job manager document - Jette
  6. 14 Dec, 2001 5 commits
    • fb99f72c
      Moe Jette authored
      Change references from "Pool" to "Partition".
      Change node names from those containing domain name to just node name as
      returned by hostname. - Jette
    • 00d4501d
      Moe Jette authored
      Machine names now not fully qualified (lx01 instead of lx01.llnl.gov). - Jette
    • d28334d5
      Moe Jette authored
      Initial version of partition management infrastructure.
      Changed name of "Pool" to "Partition" in machine status infrastructure. - Jette
    • d4087674
      Moe Jette authored
      Added function to perform raw dump of node records for DPCS: Dump_Node_Records. - Jette
    • 37a0b1f3
      Moe Jette authored
      Get_Mach_Stat fully functional.
      TmpDisk field of node record made of type long. - Jette
  7. 13 Dec, 2001 1 commit
    • e11def5e
      Moe Jette authored
      Get_Mach_Stat recording current machine status
      Presently reporting Name, OS, Speed, RealMemory, and VirtualMemory - Jette
  8. 12 Dec, 2001 5 commits
    • 2c321ad6
      Moe Jette authored
      Updates based upon progress to date on Phase one. - Jette
    • c60d8ade
      Moe Jette authored
      General clean up. Mach_Stat_Mgr.h created from header info previously in
      Mach_Stat_Mgr.c, updated Mach_Stat_Mgr.c to enable record delete and full
      database save/restore for failure recovery. - Jette
    • Initial revision · 8421244e
      Moe Jette authored
    • 65914859
      Moe Jette authored
      Initial Machine Status Manager and Daemon design document - Jette
    • 01d4eac4
      Moe Jette authored
      Initial design document for overlord daemon - Jette
  9. 26 Nov, 2001 1 commit
  10. 03 Nov, 2001 1 commit
  11. 02 Nov, 2001 2 commits
  12. 01 Nov, 2001 1 commit
  13. 31 Oct, 2001 2 commits