《信息科学类专业英语》 (Specialized English for Information Science) Courseware, Chapter 19

Lesson 19 Smart Rooms


Vocabulary · Important Sentences · Questions and Answers · Problems

In creating computer systems that can identify people and interpret their actions, researchers have come one step closer to building helpful home and work environments.

Imagine a house that always knows where your kids are and tells you if they are getting into trouble. Or an office that sees when you are having an important meeting and shields you from interruptions. Or a car that senses when you are tired and warns you to pull over. Scientists have long tried to design computer systems that could accomplish such feats. Despite their efforts, modern machines are still no match for baby-sitters or secretaries. But they could be.

The problem, in my opinion, is that our current computers are both deaf and blind: they experience the world only by way of a keyboard and a mouse. Even multimedia machines, those that handle audiovisual signals as well as text, simply transport strings of data. They do not understand the meaning behind the characters, sounds and pictures they convey. I believe computers must be able to see and hear what we do before they can prove truly helpful. What is more, they must be able to recognize who we are and, as much as another person or even a dog would, make sense of what we are thinking.

To that end, my group at the Media Laboratory at the Massachusetts Institute of Technology has recently developed a family of computer systems for recognizing faces, expressions and gestures. The technology has enabled us to build environments that behave somewhat like the house, office and car described above. These areas, which we call smart rooms, are furnished with cameras and microphones that relay their recordings to a nearby network of computers. The computers assess what people in the smart room are saying and doing. Thanks to this connection, visitors can use their actions, voices and expressions - instead of keyboards, sensors or goggles - to control computer programs, browse multimedia information or venture into realms of virtual reality.[1] The key idea is that because the smart room knows something about the people in it, it can react intelligently to them. Working together with Pattie Maes and me, graduate students Trevor Darrell and Bruce M. Blumberg constructed the first smart room in 1991 at M.I.T. The initiative quickly grew into a collaborative experiment and now involves five such rooms, all linked by telephone lines, around the world: three in Boston, one in Japan and one in the U.K. (Installations are also planned for Paris, New York City and Dallas.)

Each room contains several machines, none more powerful than a personal computer. These units tackle different problems. For instance, if a smart room must analyze images, sounds and gestures, we equip it with three computers, one for each type of interpretation. If greater capabilities are needed, we add more machines. Although the modules take on different tasks, they all rely on the same statistical method, known as maximum likelihood analysis: the computers compare incoming information with models they have stored in memory.[2] They calculate the chance that each stored model describes the observed input and ultimately pick the closest match. By making such comparisons, our smart-room machines can answer a range of questions about their users, including who they are and sometimes even what they want.

1 Where?
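The compare-and-pick-the-closest-match recipe that all the modules share can be sketched in a few lines. The one-dimensional Gaussian models and the observation value below are hypothetical stand-ins for illustration, not the Media Lab's actual models.

```python
import math

def gaussian_log_likelihood(x, mean, std):
    """Log-likelihood of observation x under a 1-D Gaussian model."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def classify(observation, models):
    """Pick the stored model most likely to describe the observation."""
    return max(models, key=lambda name: gaussian_log_likelihood(observation, *models[name]))

# Hypothetical stored models: name -> (mean speed in m/s, standard deviation)
models = {"hand": (5.0, 2.0), "torso": (0.5, 0.2)}
print(classify(0.6, models))  # a slow observation matches the torso model
```

Each module evaluates how well every stored model explains the input and keeps the best match, exactly the comparison the paragraph above describes.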

Before a smart room can begin to figure out what people are doing, it needs to locate them. So graduate students Christopher R. Wren, Ali Azarbayejani and Darrell and I developed a system called Person Finder (Pfinder for short) that can track one person as he or she moves around in the room. As do our other systems, Pfinder adopts the maximum likelihood approach. First, it models the person the camera records as a connected set of blobs - two for the hands, two for the feet and one each for the head, shirt and pants. It describes each blob in two ways: as a distribution of values for the blob's color and placement, and as a so-called support map, essentially a list indicating which image pixels belong to the blob (pixels are "picture elements," similar to the dots that make up a television image).[3] Next, Pfinder creates textured surfaces to model the background scene. Each point on one of these surfaces correlates to an average color value and a distribution around that mean. Whenever the camera in the smart room picks up a new picture in the video stream, Pfinder compares that image with the models it has made and with other references as well. To start, the system guesses what the blob model should look like in the new image. If, for example, a person's upper body was moving to the right at one meter per second a tenth of a second ago, then Pfinder will assume that the center of the upper-body blob has moved a tenth of a meter to the right. Such estimates are also checked against typical patterns of movement that we have derived from testing the system on thousands of people. For instance, we know that blobs corresponding to the torso must move slowly, whereas those relating to hands and feet generally move much faster.
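The constant-velocity guess in the example above (an upper body moving right at one meter per second, sampled a tenth of a second later) is simple linear prediction. The sketch below illustrates just that step, with made-up coordinates; it is not Pfinder itself.

```python
def predict_blob_center(center, velocity, dt):
    """Predict a blob's new center assuming constant velocity over dt seconds."""
    return tuple(c + v * dt for c, v in zip(center, velocity))

# Upper-body blob at x = 2.0 m moving right at 1 m/s, seen 0.1 s later:
x, y = predict_blob_center((2.0, 1.5), (1.0, 0.0), 0.1)
print(round(x, 3), round(y, 3))  # 2.1 1.5
```

The predicted center then seeds the comparison against the new camera frame, and predictions that violate the learned movement patterns (a torso jumping like a hand) can be rejected.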

Predictions finished, Pfinder then measures the chance that each pixel in the new image belongs to each blob. It does so by subtracting the pixel's color and brightness values from each blob's mean color and brightness values. It compares the result with each blob's distribution to determine how likely it is that the difference happened by chance. If, for example, the brightness difference between a pixel and a blob were 10 percent, and the blob's statistics said that such a difference happened only 1 percent of the time, the chance that the pixel belonged to the blob would be a mere one in 100.[4] Shadows present a minor problem in that they cause brightness differences that have nothing to do with the probability that some pixel belongs to some blob. So Pfinder searches out shadows, areas that are darker than expected, and evens out their color hue and saturation using the area's overall brightness.
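The pixel test in the example above can be sketched as a tail probability under a Gaussian brightness model: how often would a difference at least this large happen by chance? The blob's standard deviation below is a hypothetical value chosen so that a 10 percent difference comes out at roughly the 1 percent level mentioned in the text.

```python
import math

def membership_probability(pixel_value, blob_mean, blob_std):
    """Chance that a brightness difference at least this large occurs
    by chance under the blob's Gaussian model (two-sided tail)."""
    z = abs(pixel_value - blob_mean) / blob_std
    return math.erfc(z / math.sqrt(2))

# A pixel 10% brighter than a blob whose spread is about 3.9%:
print(round(membership_probability(0.60, 0.50, 0.0389), 3))  # about 0.01, one in 100
```

A pixel is then assigned to whichever blob (or background surface) gives it the highest membership probability.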

Pfinder must also overcome slight changes in the lighting or arrangement of objects in the room, either of which might make it place certain pixels in the wrong models. To avoid this difficulty, the system continuously updates the pixels that are visible behind the user, averaging the old color information with the new. In this way, it keeps track of changes that occur, for instance, when the user moves a book and thus alters the scene in two places: where the book was and where it now is. After completing these various calculations and compensations, Pfinder at last assigns each pixel in the new image to the model that most likely contains it. Finally, it updates the statistics describing the blob model and the background scene, as well as those anticipating which way the blobs will move.

2 Who and How?

Aside from knowing where people are, a smart room must also know who they are and what they are saying. Many workers have invented algorithms that allow computers to understand speech. Virtually all those systems work well only when the user wears a microphone or sits near one. A room that interpreted your actions only when you stood in a particular spot would not seem so smart. So graduate students Sumit Basu and Michael Casey and I looked for another solution - one that would let a computer decode a user's speech as he or she moved freely about some room, even if the room were quite noisy. Our end product takes advantage of the fact that Pfinder follows the user's position at all times. Borrowing this information, the speech-recognition system electronically "steers" an array of fixed microphones so that they reinforce only those sounds coming from the direction of the user's mouth.[5] It is an easy job. Because sound travels at a fixed speed, it arrives at different locations at slightly different times. So each sound location yields a different pattern of time delays. Thus, if the system takes the outputs from a fixed array of microphones and adds them with the time delays that characterize a certain location, it can reinforce the sound from that location. Then it need only compare the sound with those of known words until a match is found.
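This "steering" is delay-and-sum beamforming: shift each microphone's signal by the delay that characterizes the speaker's location, then add, so sound from that location reinforces itself while noise from elsewhere does not. The sketch below uses integer sample delays and synthetic signals to show the principle; it is not the Media Lab's system.

```python
def delay_and_sum(signals, delays):
    """Align each microphone signal by its known delay (in samples) and sum.
    Sound from the target location adds coherently."""
    length = min(len(s) - d for s, d in zip(signals, delays))
    return [sum(s[d + i] for s, d in zip(signals, delays)) for i in range(length)]

# A pulse reaches mic 0 at sample 2 and mic 1 two samples later, at sample 4:
mic0 = [0, 0, 1, 0, 0, 0]
mic1 = [0, 0, 0, 0, 1, 0]
print(delay_and_sum([mic0, mic1], [2, 4]))  # [2, 0]: the pulse is doubled
```

With the delays matched to the user's mouth, the pulse sums to twice its amplitude; a source at any other location would produce misaligned copies that do not reinforce.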

A smart room must also know who is speaking in it or to it. To act with seeming intelligence, it is absolutely vital that a system know its users' identity. Who gives a command is often as important as the command itself. The fastest way to identify someone may well be to recognize his or her face. So we developed a system for our rooms to do just that. To employ the maximum likelihood approach, this system first needed to build models of all the faces it "knew." Working with M.I.T. graduate students Matthew A. Turk and Baback Moghaddam, we found that it was important to focus on those features that most efficiently described an entire set of faces. We used a mathematical technique called eigenvector analysis to describe those sets, dubbing the results "eigenfaces". To model a face, the system determined how similar that face was to each eigenface.

The strategy has worked well. When the camera detects a person, the identifying system extracts his or her face - located by Pfinder - from the surrounding scene and normalizes its contrast. The system then models the face in terms of what similarities it bears to the eigenfaces. Next, it compares the model with those of known people. If any of the similarity scores are close, the system assumes that it has identified the user. Using this method, our smart rooms have accurately recognized individual faces 99 percent of the time amid groups of several hundred.
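The comparison step can be sketched as projecting a face onto a small set of eigenfaces and matching the resulting coefficient vectors against those of known people. The four-pixel "faces" and two eigenfaces below are toy stand-ins; real eigenfaces come out of eigenvector analysis on a large face set.

```python
def project(face, eigenfaces):
    """Describe a face by its similarity (dot product) to each eigenface."""
    return [sum(f * e for f, e in zip(face, ef)) for ef in eigenfaces]

def identify(face, known, eigenfaces):
    """Return the known person whose eigenface coefficients are closest."""
    target = project(face, eigenfaces)
    def distance(name):
        coeffs = project(known[name], eigenfaces)
        return sum((a - b) ** 2 for a, b in zip(target, coeffs))
    return min(known, key=distance)

# Toy 4-pixel faces and two orthonormal "eigenfaces" (hypothetical values):
eigenfaces = [[0.5, 0.5, 0.5, 0.5], [0.5, 0.5, -0.5, -0.5]]
known = {"alice": [1, 1, 0, 0], "bob": [0, 0, 1, 1]}
print(identify([0.9, 1.0, 0.1, 0.0], known, eigenfaces))  # alice
```

Compressing each face into a handful of eigenface coefficients is what makes comparing against several hundred stored people fast enough for real time.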

Facial expression is almost as important as identity. A teaching program, for example, should know if its students look bored. So once our smart room has found and identified someone's face, it analyzes the expression. Yet another computer compares the facial motion the camera records with maps depicting the facial motions involved in making various expressions. Each expression, in fact, involves a unique collection of muscle movements. When you smile, you curl the corners of your mouth and lift certain parts of your forehead; when you fake a smile, though, you move only your mouth. In experiments conducted by scientist Irfan A. Essa and me, our system has correctly judged expressions - among a small group of subjects - 98 percent of the time.

3 What?

Recognizing a person's face, expression and speech is just the first step. For houses, offices or cars to help us, they must be able to put these basic perceptions in context. The same motions, after all, can be interpreted quite differently depending on what the person making them intends. When you drive a car, for example, you sometimes take your foot from the gas pedal because you want to slow down. But you do the same when you get ready to make a turn. The difference is that in preparing for a turn, you adjust the steering wheel as you move your foot. So a computer system would need to consider how your movements had changed over time, in combination with other movements, to know what you were doing at any one moment.

In designing such a system, we borrowed ideas from the scientists working on speech recognition. They model individual words as sequences of sounds, or, as they call them, internal states. Each word has a characteristic distribution of internal states, which are sometimes phonemes (the smallest distinguishable units of speech) and sometimes just parts of phonemes. A computer system tries to identify words by comparing the sequence of sounds they contain with word models and then selecting the most likely matches.

We generalized this approach in the hope of determining people's intentions from their movements. We devised a computer system that can tell, for example, whether a person with one arm extended is pointing or merely stretching. The system recognizes the action involved in pointing by referring to a model having three internal states: raise the hand, hold it steady and return it quickly. The system sees stretching as one continuous movement. So by observing these internal states - characterized by the acceleration of the hand and the direction of its movement - our system works out what someone is doing.
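The pointing-versus-stretching test can be sketched as matching an observed sequence of hand states against each gesture's internal-state sequence. The state labels below are hypothetical, and the real system used continuous statistical state models rather than this exact-label matching; the sketch only shows the sequencing idea.

```python
def matches(observed, model):
    """True if the observed states, with consecutive repeats collapsed,
    follow the model's internal-state sequence exactly."""
    collapsed = [s for i, s in enumerate(observed) if i == 0 or s != observed[i - 1]]
    return collapsed == model

GESTURES = {
    "pointing": ["raise", "hold", "return"],  # three internal states
    "stretching": ["raise"],                  # one continuous movement
}

def recognize(observed):
    for name, model in GESTURES.items():
        if matches(observed, model):
            return name
    return "unknown"

print(recognize(["raise", "raise", "hold", "hold", "return"]))  # pointing
print(recognize(["raise", "raise", "raise"]))                   # stretching
```

Because the two gestures pass through different state sequences, an arm that pauses and snaps back is read as pointing, while one smooth extension is read as stretching.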

To date, we have built several different systems for interpreting human actions in this way. The simplest allow people to use their body to control virtual environments. One such application is the Artificial Life Interactive Video Environment (ALIVE), a joint project of Maes's group and my own. ALIVE utilizes the smart room's description of the user's shape to place a video model of the user into a virtual-reality scene, where computer-generated life-forms reside. These virtual critters analyze information about a user's gestures, sounds and positions to decide how to interact with him or her. Silas the virtual dog, for example, plays fetch.

When a smart-room user mimics the motions involved in picking up and throwing Silas's virtual ball, the dog sees the video image in the ALIVE environment do the same and gets ready to chase after its toy. Silas also sits and rolls over on command. The smart room's output can be put to work in an even more direct manner. The user's body position can be mapped into a control space of sorts so that his or her sounds and gestures change the operating mode of a computer program. Game players, for example, have used this interface, instead of a joystick or trackball, to navigate three-dimensional virtual environments. If opponents appear on the left, the player need only turn to the left to face them; to fire a weapon, the player need only say, "Bang."

4 Why?

Virtual-reality games aside, many more practical applications of smart-room technology exist. Consider American Sign Language (ASL), a set of sophisticated hand gestures used by deaf and mute people. Because the gestures are quite complex, they offer a good test of our room's abilities. Hence, graduate student Thad Starner and I set out to build a system for interpreting ASL. We first built models for each sign, observing many examples of the hand motions involved, as described by Pfinder. We found that if we compared these models with Pfinder's models of an actual user while he or she was signing, we could translate a 40-word subset of ASL in real time with an accuracy rate of 99.2 percent. If we can increase the size of the vocabulary that our system understands - and it seems very likely that we will be able to do so - it may be possible to create interfaces for deaf people as reliable as the speech-recognition systems that are now being introduced for people who can hear.[6] Automobile drivers, too, stand to benefit from smart-room technology. In many parts of the U.S., the average worker spends 10 hours a week in a car. More than 40,000 motorists die in traffic accidents each year, the majority of which can be attributed to driver error. So together with Andy Liu, a scientist at Nissan Cambridge Basic Research, we have been building a smart-room version of a car interior. The ultimate goal is to develop a vehicle that can monitor what the driver is doing and provide useful feedback, such as road directions, operating instructions and even travel warnings. To compile a set of driving models - including what actions people took when they were passing, following, turning, stopping, accelerating or changing lanes - we observed the hand and leg motions of many drivers as they steered their way through a simulated course. We used the resulting models to classify a test driver's action as quickly as possible. Surprisingly, the system could determine what the driver was doing almost as soon as the action had started. It classified actions with an accuracy of 86 percent within 0.5 second of the start of an action. Given two seconds, the accuracy rose to 97 percent.

We have shown that, at least in simple situations, it is possible to track people's movements, identify them and recognize their expressions in real time using only modest computational resources.[7] By combining such capabilities, we have built smart rooms in which, free from wires or keyboards, individuals can control computer displays, play with virtual creatures and even communicate by way of sign language. Such perceptual intelligence is already beginning to spread to a wider variety of situations. For instance, we are now building prototypes of eyeglasses that recognize your acquaintances and whisper their names in your ear. We are also working on television screens that know when people are watching them. And we plan to develop credit cards that can recognize their owners and so know when they have been stolen.

Other research groups at the Media Lab are working to grant our smart rooms the ability to sense attention and emotion and thereby gain a deeper understanding of human actions and motivations. Rosalind W. Picard hopes to devise a system that can tell when drivers or students are not paying attention. Aaron Bobick is writing software to interpret the human motions used in sports - imagine a television camera that could discriminate between two football plays, say, a quarterback sneak and an end run, and follow the action. As smart-room technology develops even further, computers will come to seem more like attentive assistants than insensible tools. In fact, it is not too far-fetched to imagine a world in which the distinction between inanimate and animate objects actually begins to blur.

Vocabulary

1. kid n. child; young goat; a tease, a joke v. to tease, to joke, to fool, to deceive.

2. baby-sitter n. a person who looks after children temporarily.

3. audiovisual n. (usually plural) audiovisual aids, audiovisual materials adj. audiovisual, involving both hearing and sight.

4. goggle n. a wide-eyed stare; (plural) goggles, protective eyeglasses adj. staring, bulging (of the eyes) vi. to roll the eyes, to stare vt. to cause to stare.

5. tackle n. tools, pulley, gear, equipment; a tackle (bringing someone down) vt. to secure; to deal with or handle (a difficult matter); to seize vi. to seize, to grapple.

6. statistical adj. of statistics, statistical.

7. blob n. a drop, a droplet, a spot vt. to splash, to blot.

8. textured adj. having a coarse or perceptible texture, textured.

9. correlate vt. to bring into mutual relation vi. to be correlated with.

10. torso n. the trunk of the human body; an unfinished or incomplete work, something mutilated.

11. hue n. hue, tint, color, shade; a cry, an outcry (as in "hue and cry").

12. saturation n. saturation (the state); soaking, permeation; degree of saturation.

13. compensation n. compensation, reparation.

14. anticipate vt. to expect, to look forward to; to use prematurely; to forestall, to get ahead of v. to foresee, to predict.

15. pedal n. a pedal v. to pedal.

16. phoneme n. (linguistics) phoneme.

17. dimensional adj. of dimensions, spatial.

18. prototype n. prototype.

19. insensible adj. insensible, unfeeling, hard-hearted, callous.

20. blur v. to smear; to sully (a reputation, etc.); to make (a boundary, the vision, etc.) dim or indistinct n. a smear, a blot.

Important Sentences

[1] The computers assess what people in the smart room are saying and doing. Thanks to this connection, visitors can use their actions, voices and expressions - instead of keyboards, sensors or goggles - to control computer programs, browse multimedia information or venture into realms of virtual reality.

Note the parenthetical set off by dashes, "instead of keyboards, sensors or goggles"; the frame of the second sentence is "visitors can use A to do something."

[2] Although the modules take on different tasks, they all rely on the same statistical method, known as maximum likelihood analysis: the computers compare incoming information with models they have stored in memory.

"Maximum likelihood analysis" is a fixed statistical term, rendered in Chinese as 最大似然法.

[3] It describes each blob in two ways: as a distribution of values for the blob's color and placement, and as a so-called support map, essentially a list indicating which image pixels belong to the blob (pixels are "picture elements", similar to the dots that make up a television image).

The phrase "essentially a list indicating which image pixels belong to the blob" is an appositive of "support map."

[4] If, for example, the brightness difference between a pixel and a blob were 10 percent, and the blob's statistics said that such a difference happened only 1 percent of the time, the chance that the pixel belonged to the blob would be a mere one in 100.

Note the use of the "If, for example, the ..." sentence pattern.

[5] Borrowing this information, the speech-recognition system electronically "steers" an array of fixed microphones so that they reinforce only those sounds coming from the direction of the user's mouth.

Note the sense of the quoted verb "steer," whose literal meaning is "to guide" or "to drive."

[6] If we can increase the size of the vocabulary that our system understands - and it seems very likely that we will be able to do so - it may be possible to create interfaces for deaf people as reliable as the speech-recognition systems that are now being introduced for people who can hear.

The material between the dashes is a parenthetical and can be set aside when analyzing the sentence structure.

[7] We have shown that, at least in simple situations, it is possible to track people's movements, identify them and recognize their expressions in real time using only modest computational resources.
