




版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
PAGE
10
AICourseProjectDocument:ArtificialIntelligence-BasedConnect4PlayerUsingPython
Doneby:AbdalazizSawwanMay.05.2021
Supervisedby:Dr.PeiWang
Contents
Introduction 3
TheoreticalAnalysis 4
RelatedWork 5
TheGeneralEmployedAITechniques 5
TheMinimaxAlgorithm 5
TheAlpha-BetaPruningAlgorithm 8
TheUtility-FunctionHeuristicEvaluator 10
TheImplementation 13
TheBasicPanelandItsUser-FriendlyInterface 13
TheImplementationoftheAIAgentandSomeOtherPartsof
theCode 15
TheEfficiencyoftheAIModel 16
Conclusion,PersonalComments,andFinalThoughts 18
References 18
Introduction
ConnectFourisaboardgamethatisplayedbyexactlytwoplayers,playersinitareassignedtodifferentcolorsandthentaketurnsdroppingcoloreddiscsintothesuspendedgrid.Thegame’sgridhassevencolumnsandsixrows.Thepiecesfallstraightdown,occupyingthelowestavailablespace.Figure1illustratesthegamepanel.
Figure1:Anillustrationofthegamepanel(Sourcein[1].)
Themaingoalofthegameistobethefirstplayertohaveeitherhorizontal,vertical,ordiagonallineoffoursame-coloreddiscs.ItiswellknownthatConnectFourisasolvedgame,i.e.thereisaspecificknownstrategybywhichthefirstplayercanalwayswinbyplayingthecorrectplays.Hence,inthisproject,wetrytoplay“semi-perfectly”againsttheAIandobservetheresults.
Furthermore,unlikemostofthecardgames,ConnectFourdoesprovidefullinformation,wherewhenaplayeratatimeplays,allthetwoplayersgetalltheinformationregardingmovesthathavedeterministicallyalreadytakenplaceandregardingallmovesthatcantakeplaceforthenextstep,inagivengamestate.ThismakesimplementinganAIplayerofthegamemorefeasibleforthepurposeofthisproject.
Theprojectincludesfirst,implementingthegameenvironmentitselfinasuitableanduser-friendlyway,andsecond,animplantationofanAIplayerwhowecanconfigureitsparametersandhardness.Finally,ageneralevaluationoftheAIispresented.
TheoreticalAnalysis
Startingfromthestandardgame,wewillhaveapanelof6×7=42locations,eachlocationhasthreepossiblestatesthatcanhave;beingempty,havingthediscofthefirstplayer,orhavingthediscofthesecondplayer.Thatwoulddirectlygivearoughupper-boundof342≈1.1×1020possiblegamesituationsthatthesearchspaceofthegamemayhave.
However,wecancomputeamoreaccurateandtightupperboundofthepossiblesituationsbyevaluatingthesequenceofpossiblegamescorrespondingtoaspecificnumberofdiscsplayedsofar,andthen,justsumminguptheelementsofthissequenceofnumbers.
Thissequenceofpossiblesituationsafteranumberofturnswillstartfromzerosituationsafterzeroturns,thensevensituationsafteroneturn(thefirstplayerhassevenpossiblepositionstoplacetheirdiscon),then7×7=49positionsafterthesecondturn.
Afterthesecondturn,itbecomesalittlebittrickier,calculatingthenumberofpossiblesituationsafterthreeturnsisnottrivial;thenumberofpossible
situations at that point will be
7×7×1+7×6×
5+7×6×2=238
2
situations.Whichisderivedafterconsideringthatinsomescenarios,differentordersofplacingthediscsofthefirstplayermayproducethesamesituationofthegame.Here,weconsiderallthecases.
Thewholesequenceofpossiblesituationsafteraspecificnumberofturns,andstartingfromzeroisgivenbythefollowingsequence:1,7,49,238,1120,4263,16422,54859,184275,…[2].
Aftercalculatingthenumberofsituationsafterconsideringall42possibleturnsaswellastheaeroturn,wewillendupwithatotalnumberofsituationsthatisaround4.5×1012situations.
Forausualboardgame,anorderoftrillionsofpossiblestatesinthesearchspaceisrelativelyintoolarge(likethesearchspaceinChessorGoboardgames),anditisnottoolowthatasimplebrute-forcealgorithmthatexhaustsallthesituationsisfeasiblealonetodothat.ThisprovesthatourchoicetoconsiderthisgamespecificallywasagoodchoicebecauseimplementinganAI
agentthatdoesnottriviallyexhaustsallthepossiblescenarioswouldbeusefulinordertoplayagainstinthisgame.
RelatedWork
Tommyetal.[3]havedemonstratedthatasuitableoptimalAIalgorithmisstillunknownfortheConnectFourgame,theyappliedtwodifferentAIalgorithms;onethatmainlyutilizesthealpha-betapruningalgorithm,andoneutilizestheMTD(f)Algorithm(Memory-enhancedTestDriverAlgorithm).
Theyconsiderintheirresearchtheoptimality,whichisthewinningpercentageingeneral,thespeed,whichisrelatedtotheexecutiontime,andfinallythenumberofleafnodes.TheirresultsstatethattheirAImodelswininlessthan50%ofgameswithareasonableexecutiontime.
Sarhanetal.[4]havepresentedadesignthatconvertsthestandardConnectFourgameintoareal-timegamebyincorporatingtimerestrains.Theyhavedesignedtheirartificialintelligencemodelbasedoninfluencemapping.Intheirwork,awaterfall-basedAIsoftwarehasbeendevelopedforthegame.TheirmainresultwastosuccessfullydesigntheirsoftwareusingC++programminglanguage.
Finally,Galli[5],whoisaprominentonlineinstructivevideosmakeraboutdifferenttopicsincomputersciencehasdiscussedanimplementationofanAImodelfortheConnectFourproblemandtheimplementationofthegameitself.Inourprojecthere,weinsomeinstances,usesimilartechniquesthatareusedinhislectures.Thisisbecauseoftheirrelatively-highefficiency.
TheGeneralEmployedAITechniques
Inthissection,wedemonstrateabriefsummaryofthetheoryinofthemaintechniquesusedinourAImodelforthisproject.
TheMinimaxAlgorithm
WeimplementanAIthatmainlyemploystheminimaxalgorithmthatwehavelearnedintheclass.Thealgorithmissimplyimplementedbymakingtheplayerstrytooptimizesomeutilityfunctionthattheyhave.
Oneplayerwilltrytomaximizetheirutilityfunction,whichwewillcallthemaximizingplayer.Andtheotherplayerwilltrytominimizetheirutilityfunction,whichwewillcalltheminimizingplayer.
Wecanchoosetheutilityfunctionsforeachplayertobethecomplementoftheother,sothat,wewouldhaveazero-sumgamethatiseasierandmorefeasibletobeimplementedusingminimaxalgorithm.Figure2showsthealgorithmfromagame-theoreticperspective.
Figure2:Gametheoreticperspectiveoftheminimaxobjectivefunction.(Source[6].)
Fortheexactwaythealgorithmworks,itisbettertoillustrateusinganexample;considerthesimpleexampleinFigure3.
Figure3:Asimpleexamplefortheminimaxalgorithm(Source[7].)
Here,weconsiderasimplegamewithasimplecasewherethereareeightpossiblefinalsituationsofthegame,whereeachoneisassignedtoautilityfunctionthatthefirstplayer(thewhiteplayer)triedtomaximize,andthesecondplayer(theblackplayertriedtominimize.)
Thealgorithmevaluateseachleafnodeusingaheuristicevaluationfunctionthatwecalltheutilityfunction,obtainingthevaluesshown.Themoveswherethemaximizingplayerhasadvantageareassignedwithpositivenumbers,whilethemovesthatleadtoasituationwheretheminimizingplayerhasadvantageareassignedwithnegativenumbers.Asmuchthemagnitudesofthenumbersbecomelarger,asmuchtheadvantageprevailstowardsoneside.ApseudocodeofthealgorithmisprovidedinFigure4.
Figure4:Pseudocodefortheminimaxalgorithm(Source[7].)
InourConnectFourcase,thesearchtreegrowsexponentiallywithabaseofnearlyseven,thismakesitverycomputationallyexpensivetomakeouralgorithmexplorethetreeuntilitreachesthelastleafnodes.Hence,thenewideainsectionIV.IIisintroduced.
TheAlpha-BetaPruningAlgorithm
Asclearfromourpreviousanalysisoftheminimaxalgorithm,itwouldbeverycomputationallyexpensivetosearchthroughthewholedepthofthetreebecauseoftheexponentially-increasingnumberofleafs.Henceaveryfeasiblewaytoreducethesearchspaceisbytrimmingapartofthetreeofthepossiblesituationsofthegame.
Alpha–BetaPruningcanbeconsideredasageneralsearchalgorithmwhichhasanobjectiveofminimizingthenumberofexplorednodesthathaveutility-functionvaluescalculatedbythemainheuristicfunctionevaluatorthatisemployedintheminimaxalgorithm.
Alpha-BetaPruningalgorithmisanadversarialsearchalgorithmthatisusuallyemployedtoimplementanagentplayingtwo-playergame,whichmakesitaverygreatfitforthepurposeofimplementingandAIfortheConnectFourboardgame.
Theideabehindthealgorithmisthatitterminatedtheevaluationoftheutilityfunctionvaluesofthenodeswhenatleastonepossibilitygetsfoundthatprovesthattheplayininquiryisnotbetterthanaplaythathasalreadybeenpreviouslyexamined.Suchplaysdonotneedtohavetheirutilityfunctionvaluescalculatedfurther.
WhenappliedtheAlpha-BetaPruningalgorithmonastandardminimaxsearchtree,itreturnsaresultofthesamemoveastheminimaxwouldreturnwithoutapplyingthealgorithm,butanewtrimmedtreewillbeconsideredthathassomebranches,whichareimpossibletopossiblyaffectthefinaldecisionofthealgorithm,cutout.Hence,amoretractableproblemwillbeconsideredratherthanthemaincomputationally-expensiveoriginalone.Figure5depictsanexampleofanapplicationfortheAlpha-BetaPruningalgorithm.
Figure5:Asimpleexampleforthealpha-betapruningalgorithm(Source[7].)
InFigure5,anexamplewhereitisnotimportanttocalculatetheutilityfunctionofthreepossiblesituationsoutofeightbecause,afterwehaveevaluatedtheotherfivealready,webecamesurethatwhatevervaluesthose
leafsmayhave,thatwillnotaffectthefinalresultofthemainminimaxalgorithminanyway.
Wecaneasilyobservethat,whenlookingattheleftsubtree,whateverthevalueofthefourthleafisassigned,theweightnodeaboveitwillnoteverhaveavaluelessthanfive.Thismeanssurelythattheblacknodeaboveitwillinheritthevaluefromtheleftchild,whichhasavalueofthree,insteadofinheritingitfromthechildontheright,whichwillnothaveavaluelessthanthree,letalonefive.Thesameexactprincipleisappliedontherightsubtreethatgetsmorenodestrimmedatonceaftertheinformationofthefirstfiveleafshavealreadybeendiscovered.
Figure6showsthepseudocodeofthestandardalpha-betapruningalgorithmappliedtotheminimaxalgorithm.
Figure6:PseudocodefortheAlpha-BetaPruningalgorithm(Source[7].)
TheUtility-FunctionHeuristicEvaluator
Determiningtheutilityfunction,orthefunction’svalueevaluator,isthehardestpartofthesolutionforthegamebecauseofitsheuristicnature.This
majorhindrancecouldhavebeenavoidedifwecanefficientlyreachtheendnodes(i.e.,theleafs).Inthiscase,wewouldjustassigntheendnodestoeither
+∞or−∞dependingonwhowinsonthoseendleafs.
However,inourcasehere,wehavetohavesomekindofheuristicevaluationofwhoseadvantagesomestateofthegameis,andinwhatdegreeroughlyitisadvantageous(ordisadvantageous).Hence,constructingtheheuristicutility-functionhassomedegreeofarbitrariness.
IntheimplementationofourAI,Istartedfirstlyconsideringthenumberofdiscsofeachcolorwithinaspecificdomain(a4×4windowinthiscasehere)thatmayenduptoformawinningsituation,whetherbylyingdownhorizontally,vertically,ordiagonally.
Furthermore,averygreatadditiontothisheuristicfunctionthatIhavedonewasmadebydistinguishingthemiddlecolumnandgivingitmoreadvantageoverothercolumns.Thisadditionhasanotherbenefittoo,whichisbreakingtheinitialsymmetrythatwehaveinthestartofthegame.Thatisit,whenevertheAIstartswiththefirstturn,itchoosesthemiddlecolumnalways,anditgivesthemiddlecolumnmoreadvantageoverothercolumns.
Now,considerFigure7next:
Figure7:Theprimitiveheuristicutilityfunctiontoevaluatethescoreofaboard.
Thisismymainchosenwaytocalculatetheutilityfunction.Asclearfromthecode,whenthecountofthenumberoftheplayer’sdiscswithinacertaingivendomainofspacestobeconsidered(ourcertaindomainhereisthe4×4gridaroundtheaddeddisc.)
Ouraffectingpartsandtheirdegreesarechosenasfollowing:
Youget+99999pointsofutilitywheneveryourdiscinyourhandcompletesacountofexactlyfourdiscsofyourswithintheconsidereddomain,whichis4×4gridinthiscase(i.e.,afterperformingawinningsituation.)
Youget+5pointsofutilitywheneveryourdiscinyourhandcompletesacountofexactlythreediscsofyoursandoneemptyspacewithintheconsidereddomain,whichis4×4gridinthiscase.
Youget+2pointsofutilitywheneveryourdiscinyourhandcompletesacountofexactlytwodiscsofyoursandtwoemptyspaceswithintheconsidereddomain,whichis4×4gridinthiscase.
Youget−999pointsofutilitywheneveryourdiscinyourhandleavesbehinditacountofexactlythreediscsofyouropponent’sandoneemptyspaceswithintheconsidereddomain,whichis4×4gridinthiscase.
Thefactorinarbitrarinessinourchoicesisapparentinthechoiceoftheexactvaluesandintheexactscenarios.However,theexperimentalresultsshowninsectionVIverifythattheyareacceptabletosomeextent.
Furthermore,wedidnothavetoconsidermoreinvolvedcasesandassignvaluestothem,likeforexampleconsidering5×5gridsofdomainwithmorecasessothattheheuristicfunctionbecomesmoresophisticated.
Ourchoiceto+99999pointsofutilityisobvioustobehappeningincasethatthemoveisawinningmove,andthechoiceof−999pointsofutilityistooobvioustobehappeningincasethatthemoveisalosingmove(i.e.,theopponentcanwininasinglemoveafterit.)Inthesametime,itisimportanttomaketheAIpreferplayingawinningmoveoveravoidingalosingmove.Thisiswhywechose|99999|>|999|ofutilitypointsinthetwocases.
Now,consideringFigure8:
Figure8:Thesecondpieceofcoderegardingtheutilityfunction.
Here,thepracticalimplementationoftheconsidereddomainconsideringallthepossibleorientationsofthediscs;thehorizontal,vertical,andthetwodiagonals.Andaveryimportantpartistheonethatgivesmoresignificanceofreservingthemiddlecolumn(Iassumedthatthenumberofcolumnshereisodd.)
TheImplementation
Inthissection,wediscussbrieflythehighlevelofsomepartsofourimplementationofthecode,Ididwritethecodeheavilycommented,andImadethenamesofthevariablesrepresentativetowhattheymean.Someideasfromthecodewereinspiredorappliedfrom[5–9].
TheBasicPanelandItsUser-FriendlyInterface
Startingwithaverybriefdescriptionoftheuser-friendlyinterfacetobuildthepanelitself,westart,bysimplyanddirectlyapplyingthestraightforwardinstructionsinthedocumentationofthepygamelibrary[10].Figure9showsapartfromthatimplementation.
Figure9:Apieceofcodefortheinitializationofthepanelandthegame.
ThispartismainlyaboutdefiningthesimpleRGBarraysofeachcolortorepresentitlaterintheuser-friendlyinterface.
Furthermore,asDr.Wanghassuggested,Imademycodeflexibletoacceptmoregeneralcasesofchoosingwhatevernumberofrowsandcolumns.Figure10showstheuser-interfaceaftersettingtheNumber_of_rowsvariableto4,andtheNumber_of_columnsvariableto5.
Figure10:AgeneralpanelthatcanbeproducedbysettingtheNumber_of_rowsandNumber_of_columnsvariables.
Itisworthmentioninghere,thattheAIworksveryperfectlyindifferentsettingsofthenumberofcolumnsandrows,becausethenumberofcolumns
androwsthataresetintheinitializationdesignatedvariablesaredirectlylinkedintheimplementationoftheAIitself.Inadditiontothat,itisworthtobehighlightedthatthegameatthefirstplacewasjustacommandlinegamethatjustprintsthematrixoftheoccupationoftheslotsofthepanelaftereachstepratherthanshowingourfinalcoloredpanel.
TheImplementationoftheAIAgentandSomeOtherPartsoftheCode
AbriefclarificationoftheimplementationoftheAIplayer.Figure11showsthemainpartoftheAIalgorithmwhichconstitutesthemainbodyoftheAI.
Figure11:ThemainfunctionoftheAIagent.
Asclear,thefunctionisrecursive,andwastakenfromthepseudocodesimplementedinFigure4andFigure6.Theadjustableparametersarethedepth,
whichisthevariabled.Thevariablesalphaandbeta,asstatedinthestandardpseudocode,are+∞and−∞.
Asobviousinthefunction,eachinvokingofitreducesthedepthbyone(asobviousbecausethealgorithmgoesuplevelbylevelasdepictedinFigure3.)Furthermore,wheninvokingthefunctionitselfeachtime,italternatesbetweentwosubfunctionswithinit,oneforthemaximizingplayer,whichisflaggedbyatruevalueofthemaximizingPlayervariable,andonefortheminimizingplayer,whichisflaggedbyafalsevalueofthemaximizingPlayervariable.
AnothersmallpieceofcodethatIwouldliketostatehereisthewayIchosetoalternatebetweentheturnsofthetwoplayers.Figure12showsthepartofcodethatalternatedtheTtvariablebetween0and1aftereachtimethelineinthefigureisinvoked.
Figure12:HowtheTtvariablealternatesbetweenthetwovaluesof0and1.
Thelastsmallpartofthecodetobehighlightedhereisfromthepygamelibrary,whichholdsthepygamewindowfor2secondsafterthegameendstoavoidthesuddendiminishofthewindowrightafterthelastmove.Figure13showsthispartofthecode.
Figure13:Thepieceofcodethatisresponsibleonholdingthepygamewindowfor2000millisecondsafterthegameends.
TheEfficiencyoftheAIModel
ThissectiongivesaverybrieffeedbackoftheefficiencyofourAImodelconsideringmanycriteria:
TheAIwithanydepthassigneddefeatsarandomplayerwhochoosestheirnextcolumnrandomlyalways.
TheAIwithouttheAlpha-BetaPruningneedsunreasonabletimetoplaywhenassignedtodepthmorethan4.
TheAIwiththeAlpha-BetaPruningneedsunreasonabletimetoplaywhenassignedtodepthmorethan6.
TheAIwithdepthvalueof5wasneverdefeatedbyrationalplayers.
TheAIwithdepthvalueof3wasdefeatedinroughlyaroundonegameoutoftenplayedagainstrationalplayers.
AnAIwithhigherdepthalwaysdefeatstheAIwithlowerdepth.
TheAIwhostartsfirstalwayswinsagainstthesameAI-modeledagentwiththesamedepth.
TheAIwithdepthvalueof2wasdefeatedinroughlyaroundtwotimesoutoftengamesplayedagainstrationalplayers.
Figure14showstheendofasamplegameplayedagainstanAIwithadepthvalueoffour.Thescreenshotwascapturedwithinthetwosecondsfreezingafteritfinis
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2025年度私人珠宝抵押典当贷款协议
- 2025年度新能源材料研究院校企合作协议书
- 二零二五年度商铺租赁合同终止及商业设施维护协议
- 2025年度电力系统调试电力工程劳务承建合同
- 2025年度火锅加盟店加盟费及利润分配合同
- 二零二五年度变压器运输保险与安全协议
- 二零二五年度租赁房屋提前解除合同
- 二零二五年度科研机构员工劳务派遣合作协议
- 2025年度生物制品简易供货合同
- 二零二五年度项目经理聘用协议(航空航天领域项目管理)
- 现代家政导论-课件 5.1.2认识家政服务业分类
- 代理记账业务内部规范制度-代理记账业务规范
- 山东虚拟电厂商业模式介绍
- 2024-2025学年高中思想政治选择性必修2 法律与生活统编版(部编版)教学设计合集
- 第09讲二元一次方程组中的新定义题型(原卷版+解析)-2021-2022学年下学期七年级数学下册期末复习高频考点专题(人教版)
- 概算审核服务投标方案(技术方案)
- 《帝国的崩裂:细说五代十国史》随笔
- 全国职业院校技能大赛高职组(商务数据分析赛项)备赛试题库(含答案)
- 2025届陕西省普通高中学业水平选择性考试 政治试卷(含答案 )
- 八年级道德与法治下册 第三单元 人民当家作主教案 新人教版
- Unit+4+Sports+Getting+Started 高中英语上外版必修第二册
评论
0/150
提交评论