Eduardo Pinheiro and Ricardo Bianchini
COPPE Systems Engineering, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil 21945-970
Department of Computer Science
University of Rochester, Rochester, NY 14627
edpin,ricardo@cos.ufrj.br
Abstract
The recent improvements in workstation and interconnection network performance have popularized clusters of off-the-shelf workstations. However, the usefulness of these clusters is yet to be fully exploited, mostly due to the inadequate management of cluster resources implemented by current distributed operating systems. In order to eliminate this problem and approach the computational power of large clusters of workstations, in this paper we propose Nomad, an efficient operating system for clusters of uni and/or multiprocessors. Nomad includes several important characteristics for modern cluster-oriented operating systems: scalability, efficient resource management across the cluster, efficient scheduling of parallel and distributed applications, distributed I/O, fault detection and recovery, protection, and backward compatibility. Some of the mechanisms used by Nomad, such as process checkpointing and migration, can be found in previously proposed systems. However, our system stands out for its strategy for disseminating information across the cluster and its efficient management of all cluster resources. In addition, Nomad is highly scalable as it uses neither centralized control nor extra messages to implement its functionality, taking advantage of the I/O traffic associated with its distributed file system. Our preliminary evaluation of the load balancing aspect of Nomad shows that the pattern of file accesses in our distributed file system allows for efficient and scalable load balancing. Our main conclusion is that the complete implementation of Nomad will most likely be efficient and will be a nice platform for future research on operating systems for clusters of workstations.
1 Introduction
The recent improvements in workstation and interconnection network performance have popularized clusters of off-the-shelf workstations as a platform for both high-performance and interactive [...]
important characteristics: it simplifies the use, programming, and management of the cluster; it manages all cluster resources (CPUs, memories, and I/O devices); it is highly scalable and efficient; and it provides tolerance of and recovery from workstation failures. The mechanisms used to implement these characteristics include unique cluster-wide process identifications, process checkpointing and migration, co-scheduling of concurrent applications, and a distributed file system. Some of these mechanisms, such as process checkpointing and migration, can be found in other systems. However, Nomad is unique in the particular set of characteristics it includes, in its strategy for disseminating information across the cluster, and in that it manages all cluster resources, while using neither extra messages nor centralized servers to implement its functionality. Nomad avoids sending extra messages by relying on the communication intrinsic to its distributed file system (which is required for high disk I/O throughput and fault tolerance anyway). For instance, in order to implement process migration, load information is piggybacked on file access messages.
A complete evaluation of all of Nomad's policies and mechanisms is beyond the scope of this paper. Here we focus solely on the load balancing aspect of Nomad. A preliminary evaluation of this aspect in the context of a prototype implementation of Nomad shows that the pattern of file accesses produced by Nomad's distributed file system and real workloads can effectively be used as a mechanism for distributing load information across the cluster. In addition, our results show that Nomad can almost eliminate the periods of excessive demand for resources by intelligently migrating processes. Based on these results and on our experience with the other mechanisms implemented in our system, we believe that Nomad will most likely be an efficient and user-friendly operating system for clusters of uni and multiprocessors.
The remainder of the paper is organized as follows. The next section discusses each of the characteristics of Nomad in detail, describes the architecture of the system, and presents the current status of its implementation. Section 3 presents the results of our preliminary evaluation of load balancing as implemented in Nomad. Section 4 discusses the related work. Finally, Section 5 draws the main conclusions of our work to date.
2 Nomad
In this section we address, in turn, the functionality, architecture, and implementation status of Nomad.
2.1 Functionality
Single-system image. Nomad simplifies the use, programming, and management of the cluster by providing a single-system image of it. The user can utilize the system as if it were a single very powerful workstation. This integrated view is based on cluster-wide unique process identifications and on making all aspects of process management (signal delivery, for instance) independent of where processes are actually running.
Efficient and complete resource management. Nomad efficiently supports high-performance and interactive applications through its resource management and scheduling. The distribution of resource demands is based on the intelligent initial assignment of processes to processors and on dynamic process migration. When launching a new application, Nomad chooses a lightly loaded workstation to host the new process(es). Over time, if a workstation becomes overloaded (i.e. one of its resources is exhausted), Nomad chooses the application consuming the most of the exhausted resource to be migrated to another workstation. The migration itself is initiated by the overloaded workstation, which sends the chosen application to a destination workstation that is lightly loaded with respect to the resource. In order to reduce the number of times multiple overloaded workstations migrate processes to the same destination, each source picks a destination randomly out of the set of workstations that are lightly loaded with respect to the exhausted resource.
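The destination choice described above can be sketched as follows. This is only an illustration under stated assumptions: the load table layout, the 0.5 "lightly loaded" threshold, and the function name are hypothetical and are not taken from Nomad.

```python
import random
from typing import Dict, Optional

# Hypothetical threshold: utilization below this counts as "lightly loaded".
LIGHT_LOAD = 0.5


def pick_destination(load_table: Dict[str, Dict[str, float]],
                     resource: str,
                     source: str,
                     rng: Optional[random.Random] = None) -> Optional[str]:
    """Pick, at random, a workstation lightly loaded w.r.t. `resource`.

    `load_table` maps a workstation name to its last known utilization
    (0.0 to 1.0) per resource. A workstation with no known value for the
    resource is treated as fully loaded and is never chosen.
    """
    rng = rng if rng is not None else random.Random()
    candidates = sorted(
        node for node, loads in load_table.items()
        if node != source and loads.get(resource, 1.0) < LIGHT_LOAD
    )
    # Random choice spreads migrations from several overloaded sources
    # over the whole candidate set instead of one "best" destination.
    return rng.choice(candidates) if candidates else None
```

Choosing randomly rather than always taking the least loaded candidate is what keeps several overloaded sources from converging on the same destination at once.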
The whole image of the application is migrated to the destination workstation and future system calls are executed at the destination workstation. When making its assignment and migration decisions, Nomad takes all aspects of a workstation's load into account: demand for CPU, memory, disk I/O, and network I/O.
Efficient process scheduling. Process scheduling in Nomad targets high performance in many ways. Sequential applications running on multiprocessors are scheduled considering the affinity of each process for the processor on which it ran last. Concurrent applications are co-scheduled [11] or implicitly co-scheduled [1]. In co-scheduling all processes belonging to a parallel application (defined as a concurrent application running on a multiprocessor) are scheduled simultaneously. Implicit co-scheduling is an approximation of co-scheduling for processes of a distributed application (defined as a concurrent application with processes scattered across many workstations). In contrast with other implementations of co-scheduling (e.g. [3]), in implicit co-scheduling all scheduling decisions are made locally by each workstation, without the need for coordination messages or a centralized controller.
Scalability. The scalability of the system is guaranteed since it does not involve centralized servers or extra messages in its management of resources, scheduling of distributed applications, and fault tolerance and recovery. In addition, Nomad includes a distributed and redundant file system that provides high-performance I/O by striping files at the block level across the different disks in the cluster. In essence, our distributed file system can be seen as a software implementation of RAID [5], where each block is assigned to a randomly chosen disk, as in the RAMA file system [10]. This assignment of blocks leads to high disk I/O throughput, while avoiding communication bottlenecks [10]. The redundancy in the file system allows for fault tolerance. (Note that files that require neither high throughput nor fault tolerance can be stored locally, bypassing the distributed file system.)
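To make the block-level striping concrete, the sketch below places each block on a pseudo-randomly chosen disk and keeps a second copy on a different disk. The hash-based placement, the single-replica redundancy scheme, and the function name are assumptions made for illustration; the text does not specify how Nomad picks disks or encodes redundancy.

```python
import hashlib
from typing import List, Tuple


def place_block(file_id: str, block_no: int, disks: List[str]) -> Tuple[str, str]:
    """Return (primary, replica) disks for one block of a file.

    Hashing (file_id, block_no) gives a pseudo-random but reproducible
    placement, so any workstation can locate a block without asking a
    central directory. This is an assumed scheme, not Nomad's own.
    """
    if len(disks) < 2:
        raise ValueError("need at least two disks for a redundant copy")
    digest = hashlib.sha256(f"{file_id}:{block_no}".encode()).digest()
    h = int.from_bytes(digest[:8], "big")
    primary = h % len(disks)
    # Offset in 1..len(disks)-1, so the replica never shares the primary disk.
    offset = 1 + (h // len(disks)) % (len(disks) - 1)
    return disks[primary], disks[(primary + offset) % len(disks)]
```

Because consecutive blocks of one file land on different disks, a workstation reading a large file ends up exchanging messages with many other workstations, which is the property the load dissemination strategy relies on.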
It is important to observe that the distributed file system forces a workstation that needs to access a file to communicate with a potentially large number of other workstations in the cluster, as opposed to a single workstation as in NFS-style file systems. Based on this observation, we realized that Nomad could avoid extra messages in implementing resource management and fault tolerance, by appending the information that must be disseminated through the cluster to the file access messages. Essentially, Nomad avoids sending extra messages by extending each file system message with a few extra bytes.
Efficient dissemination of load information. An example of this piggybacking occurs when Nomad uses file access messages to disseminate the load information necessary to perform process migration. The load information of each workstation is sent on its file access messages. Thus, a file access request informs the replier workstation of the requester's load information, while the access reply informs the requester of the replier's load information. (As a fall-back strategy, a workstation running Nomad multicasts its load information to a few other workstations if it has gone too long, say 30 minutes, without communicating with any other node.)
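The request/reply exchange can be sketched as below. The message layout, the field names, and the Node class are hypothetical; the sketch only illustrates that both ends of a file access learn each other's load without any extra message.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

Load = Dict[str, float]


@dataclass
class Node:
    name: str
    load: Load
    # Last load heard from each peer: peer name -> (load, time heard).
    known: Dict[str, Tuple[Load, float]] = field(default_factory=dict)

    def _learn(self, sender: str, load: Load, now: float) -> None:
        self.known[sender] = (dict(load), now)

    def make_request(self, block_id: int) -> Dict[str, Any]:
        # The requester's load rides along on the ordinary read request.
        return {"type": "read", "block": block_id,
                "sender": self.name, "load": dict(self.load)}

    def handle_request(self, msg: Dict[str, Any], now: float) -> Dict[str, Any]:
        self._learn(msg["sender"], msg["load"], now)
        # The replier's load rides along on the ordinary reply.
        return {"type": "reply", "block": msg["block"], "data": b"",
                "sender": self.name, "load": dict(self.load)}

    def handle_reply(self, msg: Dict[str, Any], now: float) -> None:
        self._learn(msg["sender"], msg["load"], now)
```

For example, after `rep = rye.handle_request(gin.make_request(7), now=1.0)` followed by `gin.handle_reply(rep, now=1.0)`, each node holds the other's load in its `known` table. The stored timestamp is what a fall-back multicast timer would consult.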
Note however that the motivation for using the file access pattern to guide the dissemination of load information is not restricted to the desire to avoid extra messages; another reason is that using this pattern seems like a natural strategy to support load balancing. More specifically, the file access pattern has two relevant properties as a mechanism for distributing load information: (a) the communication between workstations occurs in bi-directional (request/reply) form, as necessary for migration; and (b) idle workstations (which can be numerous) do not generate messages. Under a striped file system such as Nomad's, the file access pattern has the additional property that a significant number of workstations will likely receive file access messages from each non-idle workstation. These three characteristics and the absence of extra messages make process migration in Nomad potentially more efficient than in other systems (e.g. Mosix, which is in fact very efficient in terms of migration).
Fault tolerance. Nomad is capable of detecting the failure of one workstation and excluding it from the cluster until the failure disappears or is repaired. Failure detection is associated with the file access communication involved in Nomad's distributed file system. If a replier fails to reply to a file access request after a time-out and retransmission period, the replier is considered faulty and the access is diverted to a redundant disk. Any future messages by the requester will now inform other workstations about the failure. A workstation that is informed about a failure must then destroy any local processes belonging to distributed applications affected by the failure. When a faulty workstation resumes normal operation, Nomad tries to recover by adding the workstation back to the cluster, reconstructing the disk according to the redundancy information, and restoring the processes that were running on the workstation prior to the failure. The processes belonging to distributed applications are the only ones that are not restored automatically.
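A minimal sketch of this detection path follows. The retry count, the `send` callback, and the shared `failed` set are assumptions for illustration; the text only states that a time-out and retransmission period precedes the diversion to a redundant disk.

```python
from typing import Callable, Optional, Set

# Hypothetical policy: one initial attempt plus two retransmissions.
MAX_ATTEMPTS = 3


def read_block(block_id: int,
               primary: str,
               replica: str,
               send: Callable[[str, int], Optional[bytes]],
               failed: Set[str]) -> bytes:
    """Read a block, diverting to the redundant copy if the primary is down.

    `send(node, block_id)` returns the block data, or None on time-out.
    `failed` is the set of workstations this node believes are faulty;
    it is the information later messages would carry to other nodes.
    """
    if primary not in failed:
        for _ in range(MAX_ATTEMPTS):
            data = send(primary, block_id)
            if data is not None:
                return data
        # No reply after all retransmissions: consider the replier faulty.
        failed.add(primary)
    data = send(replica, block_id)
    if data is None:
        raise IOError(f"block {block_id}: primary and replica unreachable")
    return data
```

Once a workstation is in `failed`, later reads skip it immediately instead of paying the time-out again.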
2.2 Architecture
The main goal of the architecture of Nomad is to make it as portable and fault tolerant as possible, but without compromising our desired functionality. Thus, we decided to divide the Nomad architecture into two components: a modified version of a Unix operating system and a layer of user-level software (middleware). As modifying the base operating system kernel significantly would reduce the portability of Nomad, we decided to keep kernel modifications to a minimum. Basically, all we do to the base kernel is extend it with code to implement process checkpointing and code implementing a few new system calls. The checkpointing code can checkpoint whole applications, regardless of whether they have open files, pipes, semaphores, shared memory segments, or access shared libraries.
Since the base operating system interface is a subset of the Nomad interface, the users and applications can still utilize the base kernel directly, bypassing Nomad altogether, which is useful for backward compatibility. In addition, each copy of the modified base kernel remains independent of copies running on other workstations, thus promoting fault tolerance and easier cluster reconfiguration.
The user-level software is composed of a daemon (called the Nomad daemon), standard I/O redirection daemons, and a set of tools to allow the users and applications to interact with the daemon. Each of the workstations in the cluster runs a copy of the (modified) base operating system and one Nomad daemon. The daemon runs in the background with super-user privileges. The daemon performs several important tasks: (1) it maintains the state of the applications running on top of Nomad; (2) it collects statistics about the use of local resources; (3) it implements the load balancing policies and mechanisms; (4) it implements the process scheduling policies; (5) it interacts with the user and remote daemons for process launching and migration, and distributed signal delivery; and (6) it launches the standard I/O redirection daemon for each application that is launched or migrated away from a user's workstation. Note that, even though the Nomad daemon does not interfere with processes launched directly on top of the base operating system, it does take their resource usage into consideration.
The standard I/O redirection daemons redirect the standard input, output, and error streams to the terminal where the user started the corresponding application. Both the source and destination of the application get a redirection daemon.
The last component of the middleware is the set of tools used by users and applications to interact with the Nomad daemon. The two main tools are the netspawn and netkill commands. Netspawn is used to launch applications on top of Nomad, while netkill is used to send signals to processes launched by Nomad. By default, netspawn interacts with the local daemon, which intelligently selects a workstation for the application to run on. The main argument to netspawn is the application's name, but the user can also specify that the application should not be migrated or, for a distributed application, specify the number and addresses of workstations to be used. The processes launched with netspawn have cluster-wide unique identifications that are independent of where they are running.
Netkill interacts with the local Nomad daemon, requesting that a signal be delivered to a certain unique process identification. The daemon is responsible for checking its internal tables and determining where the process is running.
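The table lookup behind netkill can be pictured with the following sketch. The table structure, the integer form of the cluster-wide identifier, and all names are hypothetical, since the text does not describe the daemon's internal tables.

```python
from typing import Dict, NamedTuple, Tuple


class Location(NamedTuple):
    node: str       # workstation currently hosting the process
    local_pid: int  # pid assigned by that workstation's base kernel


class DaemonTable:
    """Maps a cluster-wide process id to the process's current location."""

    def __init__(self) -> None:
        self._where: Dict[int, Location] = {}

    def register(self, gpid: int, node: str, local_pid: int) -> None:
        self._where[gpid] = Location(node, local_pid)

    def migrate(self, gpid: int, new_node: str, new_local_pid: int) -> None:
        # The cluster-wide id stays the same; only the location changes.
        if gpid not in self._where:
            raise KeyError(f"unknown cluster-wide pid {gpid}")
        self._where[gpid] = Location(new_node, new_local_pid)

    def route_signal(self, gpid: int, signum: int) -> Tuple[str, int, int]:
        """Return (node, local_pid, signum): where to deliver the signal."""
        loc = self._where[gpid]
        return loc.node, loc.local_pid, signum
```

A caller keeps using the same identifier before and after a migration; only the daemon's table entry changes, which is what makes signal delivery independent of where the process runs.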
2.3 Implementation Status
We now have a prototype implementation of Nomad up and running. The prototype does not include four features of the full-blown Nomad design: the network traffic demands are not monitored; applications currently have to be relinked to use the Nomad distributed file system; communication between the Nomad daemons is implemented with UDP, instead of a higher performance protocol such as VIA [7]; and the dissemination of load information [...]
[Table fragment. Machines: atto, gin, ripple, rye, vodka. Recovered column header: #Disks. Cell values in source order: 4, 1, 1, 1; 1; 2048/1998, 1, 64/58, 1, 64/58, 1, 64/58, 1.]
Figure 1: CPU requirements on workstation Kilo. [Plot: load average (0 to 2) versus time of day.]
Figure 2: Memory requirements on workstation Atto. [Plot: memory used (MB, 0 to 120) versus time of day.]
Figure 3: Disk I/O requirements on workstation Ripple. [Plot: disk accesses per minute (0 to 120) versus time of day.]
Figure 4: Number of workstations that learned about and could alleviate the excessive demand on workstation Gin. [Plot: number of nodes (0 to 10) versus time, 16:18 to 23:00; series: Knew, Could Help.]
[Table 2 body not recoverable as rows. Surviving tokens in source order: Machine; 10122248612; Total; 100; mins; 1008237101211; mins; 95.]
Table 2: Duration of the periods of overdemand.
The figure shows that more than 92% of the time at least one workstation not only knew about the problem with Gin, but also could have taken some of its load. Other workstations exhibit similar results to Gin. In the worst case, 87% of the time there was at least one workstation ready to help. These results clearly show that Nomad's strategy for disseminating load information should allow for good load balancing, since load information is spread widely enough for load to be balanced more evenly.
However, for migration to be useful we must verify that periods of excessive demand persist for enough time to offset the migration overhead. Table 2 lists the number of these periods according to their different durations: less than one minute, from one minute to five minutes, from five minutes to twenty minutes, and more than twenty minutes. These results show that the vast majority (85%) of all overdemand periods last for more than one minute. Overdemand durations of more than one minute are long enough to offset the overhead of migration, even when processes have very large checkpoints [12]. Furthermore, note that in terms of time, the overdemand periods of one minute or less account for a negligible percentage of the total overdemand time.
Based on these positive results, we simulated the behavior of Nomad. More specifically, we simulated the migration of a process every time a problem with some workstation is detected by a remote workstation that can offer help. To simulate the worst possible scenario, we assumed processes with the smallest possible images (288 KBytes), forcing the largest possible number of migrations when excessive memory demand is the problem. Every migration penalizes the source and destination workstations with 0.030 and 0.098 seconds, respectively, which are the times taken on these workstations to migrate a 288-KByte process in our system [12]. We let a period of overdemand stand for 5 seconds (the same threshold as used in Nomad) before migrating a process away from a workstation. To further worsen the scenario, we assume that migrated processes run (and take up resources) forever at the destination workstations.
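The trigger rule and the per-migration penalties can be illustrated with the short sketch below. The three constants come from the text; the per-second trace format and the simplification that the trace is fixed in advance (rather than reacting to each migration) are assumptions, so this is not the simulator used for the results.

```python
from typing import Iterable, Tuple

SRC_PENALTY_S = 0.030  # cost charged to the source per migration (from the text)
DST_PENALTY_S = 0.098  # cost charged to the destination per migration (from the text)
THRESHOLD_S = 5        # overdemand must persist this long before migrating (from the text)


def count_migrations(overdemand_per_second: Iterable[bool]) -> int:
    """Count migrations triggered on a fixed per-second overdemand trace."""
    migrations = 0
    run = 0
    for overloaded in overdemand_per_second:
        run = run + 1 if overloaded else 0
        if run == THRESHOLD_S:
            migrations += 1
            run = 0  # assumption: restart the clock after each migration
    return migrations


def migration_cost(migrations: int) -> Tuple[float, float]:
    """Total (source, destination) penalty in seconds for `migrations` moves."""
    return migrations * SRC_PENALTY_S, migrations * DST_PENALTY_S
```

With these constants a single migration costs 0.128 seconds in total, which is small next to an overdemand period lasting a minute or more.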
Table 3 shows the results of this experiment. From left to right, the columns of the table list the workstation name, the number of minutes when at least one resource was under excessive demand without Nomad, the number of minutes when at least one resource was under excessive demand with Nomad, the percentage reduction in resource overdemand, the number of processes migrated to the workstation, and the number of processes migrated away from the workstation. The "Total" row lists the sum of the overdemand periods without and with Nomad, the percentage of this time reduced by Nomad, and the total number of processes migrated in and out of workstations.
[Table 3 body not recoverable as rows. Surviving tokens in source order: Machine; Without Nomad; 9555.32; Brain; 941.56; Kilo; 780.52; Rum; 1532.04; Scotch; 1029.3715901.86; Overdemand; 99.98%; 0.10; 98.77%; 15.89; 99.65%; 2.86; 98.62%; 5.18; 98.52%99.52%; Migs In; 10; 61; 30; 29; 32; 14; 37; 45; 38252.]
Table 3: Results of the simulation of Nomad.
The results in the table clearly demonstrate that, even under a worst case scenario for Nomad, the system would have been able to significantly reduce the periods of resource overdemand. All workstations would have improved their resource utilization with Nomad, except for Brain, which did not exhibit any problems without Nomad. Nevertheless, even in the case of Brain, the period of resource overdemand caused by Nomad amounted to no more than 2 minutes in 7 days. Overall, the cluster would have experienced a 99% reduction in the time at least one resource was exhausted by using Nomad. As a result of this improvement, applications should perform better under our system.
Note however that a definitive evaluation of the actual performance improvements achievable by our system requires a complete implementation of it. Furthermore, we studied the behavior of Nomad assuming an academic cluster environment that may not be representative of other types of environments, such as heavy-duty scientific computing installations. A definitive evaluation of the system should take these other cluster environments into consideration as well.
4 Related Work
The migration strategy designed for Nomad is completely novel. However, several distributed operating systems (e.g. [6, 9, 8, 2]) share some of the goals, policies, or mechanisms of Nomad. Here we concentrate on the operating systems that are most closely related to Nomad: GLUnix and Mosix. GLUnix seeks to execute interactive sequential and distributed applications efficiently on a cluster of uniprocessor workstations, keeping the Unix I/O semantics unchanged, providing static and dynamic load balancing, and detecting one failure at a time. GLUnix may have scalability problems since it uses a centralized server to assign unique process identifiers, to co-schedule distributed applications, to keep the state of all workstations in the cluster, to make decisions about process migration, and to detect failures. GLUnix is under development at UC Berkeley and, in its current version, does not seem to achieve any of its goals completely.
Nomad differs from GLUnix in several ways. Nomad considers multiprocessor workstations, considers all aspects of the load of a workstation, incorporates a distributed file system, performs its tasks in a non-centralized fashion without the use of extra messages, and recovers from faults instead of just detecting them.
Mosix is also targeted at the efficient execution of sequential and distributed applications in clusters of uniprocessor workstations. Load balancing in Mosix is more sophisticated than in GLUnix, but only CPU and memory usage are considered. Migration decisions are based on load information messages periodically exchanged among a dynamic subset of workstations. Mosix can either migrate pages or whole processes. In case of memory problems, Mosix avoids swapping pages to disk by migrating processes to other workstations, but disregards the CPU utilization at the target workstations. Page migration is done directly to the target workstation's memory. However, when the CPU utilization is high on the source workstation, the whole process is migrated to the destination.
Nomad also differs from Mosix in several ways. In contrast with Mosix, Nomad considers multiprocessors, considers a broader set of cluster resources, incorporates a distributed file system, never migrates pages, considers the load on the target workstation before migrating a process to it, and co-schedules concurrent applications. In terms of their migration strategies, Nomad uses the file access patterns to guide the dissemination of load information, while not involving any extra messages to implement the actual load information exchanges. Although for many cluster configurations it is unlikely that the extra messages involved in Mosix should cause serious overheads, our work shows that these extra messages are unnecessary given a distributed file system.
5 Conclusion and Future Work
This paper presented a short introduction to Nomad, an efficient operating system for clusters of uni and/or multiprocessors. The main goal of Nomad is to efficiently support (high-performance or interactive) parallel, distributed, and sequential applications. Nomad includes several important characteristics for modern cluster-oriented operating systems, including scalability, efficient resource management across the cluster, efficient scheduling of parallel and distributed applications, distributed I/O, and fault detection and recovery. Nomad does not involve any extra messages for resource management, distributed scheduling, and fault tolerance, taking advantage of the I/O traffic associated with its distributed file system.
A preliminary evaluation of the load balancing aspect of Nomad showed that the pattern of file accesses produced by Nomad's distributed file system and real workloads can effectively be used as a mechanism for distributing load information across the cluster. In addition, our results show that Nomad can almost eliminate the periods of excessive demand for resources by intelligently migrating processes.
Based on these results, we expect the complete implementation of Nomad to be efficient and to become an interesting foundation for research on distributed operating systems for clusters of workstations. Our future work includes completing the implementation and evaluation of the system. Once this is done, the Nomad source code will be made available to the public for non-commercial use.
Acknowledgements
We would like to thank Enrique Carrera, Silvio Canola, and the members of the systems groups at UFRJ and Rochester for discussions that helped improve this paper. We would also like to thank Sergio Takeo Kofuji for providing us with a cluster of workstations to work on.
References
[1] Andrea C. Arpaci-Dusseau, David E. Culler, and Alan M. Mainwaring. Scheduling with Implicit Information in Distributed Systems. In Proceedings of the ACM Sigmetrics Conference on Measurement and Modeling of Computer Systems, Madison, Wisconsin, June 1998.
[2] Amnon Barak and Oren La'adan. The MOSIX Multicomputer Operating System for High Performance Cluster Computing. Journal of Future Generation Computer Systems, 13(4-5):361–372, Mar 1998.
[3] Eliseu M. Chaves and Valmir C. Barbosa. Timesharing in Hypercube Multiprocessors. In Proceedings of 4th IEEE Symposium on Parallel and Distributed Processing, pages 354–359, Arlington, TX, Dec 1992.
[4] David R. Cheriton. The V Distributed System. Communications of the ACM, 31(3):314–333, Mar 1988.
[5] Toni Cortes. Software RAID and Parallel File Systems. In Rajkumar Buyya, editor, High Performance Cluster Computing: Architectures and Systems. Prentice Hall, 1999.
[6] Fred Douglis and J. Ousterhout. Transparent Process Migration: Design Alternatives and the Sprite Implementation. Software: Practice and Experience, 21(8):757–785, Aug 1991.
[7] Dave Dunning, Greg Regnier, Gary McAlpine, Don Cameron, Bill Shubert, Frank Berry, Anne Marie Merritt, Ed Gronke, and Chris Dodd. The Virtual Interface Architecture. IEEE Micro, 18(2), 1998.
[8] D. Ghormley, D. Petrou, S. Rodrigues, A. Vahdat, and T. Anderson. GLUnix: a Global Layer Unix for a Network of Workstations. Software: Practice and Experience, Feb 1998.
[9] Yousef A. Khalidi, José M. Bernabéu, Vlada Matena, Ken Shirriff, and Moti Thadani. Solaris MC: A Multi Computer OS. In Proceedings of 1996 USENIX Conference, January 1996.
[10] E. L. Miller and R. H. Katz. RAMA: An Easy-to-Use, High Performance Parallel File System. Parallel Computing, 23(4-5):419–446, June 1997.
[11] J. K. Ousterhout. Scheduling Techniques for Concurrent Systems. In Proceedings of the 3rd International Conference on Distributed Computing Systems, pages 22–30, May 1982.
[12] Eduardo Pinheiro. Nomad: An Efficient Operating System for Clusters of Uni and Multiprocessors. Master's thesis, COPPE Systems Engineering, Federal University of Rio de Janeiro, August 1999. In Portuguese.
[13] Sun Systems. SunOS and Solaris Reference Manuals. Sun Systems, Inc.