State of AI Report 2024 (October 2024) - Air Street Capital
Nathan Benaich is the General Partner of Air Street Capital, a venture capital firm investing in AI-first companies. He runs the Research and Applied AI Summit (RAAIS), the RAAIS Foundation (funding open-source AI projects), AI communities in the US and Europe, and Spinout.fyi (improving university spinout creation). He studied biology at Williams College and earned a PhD from Cambridge in cancer research as a Gates Scholar.

Alex Chalmers regularly writes research, analysis, and commentary on AI. He was previously an associate director at Milltown Partners, where he advised big technology companies, start-ups, and investors on policy and positioning. He graduated from the University of Oxford in 2017 with a degree in History.

Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines. We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.

The State of AI Report is now in its seventh year. Consider this report a compilation of the most interesting things we've seen, with a goal of triggering an informed conversation about the state of AI and its implications for the future. We consider the following key dimensions in our report:

- Research: Technology breakthroughs and their capabilities.
- Industry: Areas of commercial application for AI and its business impact.
- Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
- Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
- Predictions: What we believe will happen in the next 12 months and a 2023 performance review to keep us honest.

Definitions:

- Artificial intelligence (AI): a broad discipline with the goal of creating intelligent machines, as opposed to the natural intelligence that is demonstrated by humans and animals.
- Artificial general intelligence (AGI): a term used to describe future machines that could match and then exceed the full range of human cognitive ability across all economically valuable tasks.
- AI Agent: an AI-powered system that can take actions in an environment. For example, an LLM that has access to a suite of tools and has to decide which one to use in order to accomplish a task that it has been prompted to do.
- AI Safety: a field that studies and attempts to mitigate the risks (minor to catastrophic) which future AI could pose to humanity.
- Computer vision (CV): the ability of a program to analyse and understand images and video.
- Deep learning (DL): an approach to AI inspired by how neurons in the brain recognise complex patterns in data. The "deep" refers to the many layers of neurons in today's models that help to learn rich representations of data to achieve better performance.
- Diffusion: an algorithm that iteratively denoises an artificially corrupted signal in order to generate new, high-quality outputs. In recent years it has been at the forefront of image generation and protein design.
- Generative AI: a family of AI systems that are capable of generating new content (e.g. text, images, audio, or 3D assets) based on "prompts".
- Graphics Processing Unit (GPU): a semiconductor processing unit that enables a large number of calculations to be computed in parallel. Historically, this was required for rendering computer graphics. Since 2012, GPUs have been adapted for training DL models, which also require a large number of parallel calculations.
- (Large) Language model (LM, LLM): a model trained on vast amounts of (often) textual data to predict the next word in a self-supervised manner. The term "LLM" is used to designate multi-billion parameter LMs, but this is a moving definition.
- Machine learning (ML): a subset of AI that often uses statistical techniques to give machines the ability to "learn" from data without being explicitly given the instructions for how to do so. This process is known as "training" a "model" using a learning "algorithm" that progressively improves model performance on a specific task.
- Model: a ML algorithm trained on data and used to make predictions.
- Natural language processing (NLP): the ability of a program to understand and generate human language.
- Prompt: a user input, often written in natural language, that is used to instruct an LLM to generate something or take action.
- Reinforcement learning (RL): an area of ML in which software agents learn goal-oriented behavior by trial and error in an environment that provides rewards or penalties in response to their actions (called a "policy") towards achieving that goal.
- Self-supervised learning (SSL): a form of unsupervised learning where manually labeled data is not needed. Raw data is instead modified in an automated way to create artificial labels to learn from. An example of SSL is learning to complete text by masking random words in a sentence and trying to predict the missing ones (a minimal code sketch follows this list).
- Transformer: a model architecture at the core of most state-of-the-art (SOTA) ML research. It is composed of multiple "attention" layers which learn which parts of the input data are the most important for a given task. Transformers started in NLP (specifically machine translation) and were subsequently expanded into computer vision, audio, and other modalities.
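To make the SSL example above concrete, here is a minimal sketch of turning raw text into (input, label) training pairs with no human annotation. The `make_masked_examples` helper is purely illustrative, and the model that would learn from these pairs is omitted.

```python
# Minimal sketch of masked-word self-supervision: the raw sentence supplies
# its own labels, so no manual annotation is needed.
import random

def make_masked_examples(sentence: str, mask_rate: float = 0.3, seed: int = 0):
    rng = random.Random(seed)
    tokens = sentence.split()
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            targets[i] = tok  # the artificial label is just the hidden token
        else:
            masked.append(tok)
    return " ".join(masked), targets

inputs, labels = make_masked_examples(
    "self supervised learning creates training labels from the raw data itself"
)
print(inputs)  # the sentence with some words replaced by [MASK]
print(labels)  # e.g. {2: 'learning', 8: 'raw'} -- positions of held-out words
```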
[Icon legend: in the rest of the report's slides, icons in the top right corner indicate a model's input and output modalities, covering text, software tool use (text, code generation & execution), image, 3D, robot state, and biological modalities, plus combinations such as text→software tool use, image→3D, and text→3D.]

Executive summary:

- Frontier lab performance converges, but OpenAI maintains its edge following the launch of o1, as planning and reasoning emerge as a major frontier.
- Foundation models demonstrate their ability to break out of language as multimodal research drives into mathematics, biology, genomics, the physical sciences, and neuroscience.
- US sanctions fail to stop Chinese (V)LLMs rising up community leaderboards.
- NVIDIA remains the most powerful company in the world, enjoying a stint in the $3T club, while regulators probe the concentrations of power within GenAI.
- More established GenAI companies bring in billions of dollars in revenue, while start-ups begin to gain traction in sectors like video and audio generation. Although companies begin to make the journey from model to product, long-term questions around pricing and sustainability remain unresolved.
- Driven by a bull run in public markets, AI companies reach $9T in value, while investment levels grow healthily in private companies.
- While global governance efforts stall, national and regional AI regulation has continued to advance, with controversial legislation passing in the US and EU.
- The reality of compute requirements forces Big Tech companies to reckon with real-world physical constraints on scaling and their own emissions targets. Meanwhile, governments' own attempts to build capacity continue to lag.
- Anticipated AI effects on elections, employment and a range of other sensitive areas are yet to be realized at any scale.
- A vibe-shift from safety to acceleration takes place as companies that previously warned us about the pending extinction of humanity need to ramp up enterprise sales and usage of their consumer apps.
- Governments around the world emulate the UK in building up state capacity around AI safety, launching institutes and studying critical national infrastructure for potential vulnerabilities.
- Every proposed jailbreaking "fix" has failed, but researchers are increasingly concerned with more sophisticated, long-term attacks.

Last year's predictions, reviewed:

- A Hollywood-grade production makes use of generative AI for visual effects. → Largely badly, but GenAI visual effects have been seen in Netflix and HBO productions.
- A generative AI media company is investigated for its misuse during the 2024 US election circuit. → Not yet, but there's still time.
- Self-improving AI agents crush SOTA in a complex environment (e.g. AAA game, tool use, science). → Not yet, despite promising work on open-endedness, including strong game performance.
- Tech IPO markets thaw and we see at least one major listing for an AI-focused company (e.g. DBRX). → While the Magnificent Seven have enjoyed strong gains, private companies are hanging on until markets settle. However, AI chip company Cerebras has filed to IPO.
- The GenAI scaling craze sees a group spend >$1B to train a single large-scale model. → Not quite yet - let's give it another year.
- The US's FTC or UK's CMA investigate the Microsoft/OpenAI deal on competition grounds. → Both regulators are investigating this partnership.
- We see limited progress on global AI governance beyond high-level voluntary commitments. → The commitments from the Bletchley and Seoul summits remain voluntary and high-level.
- Financial institutions launch GPU debt funds to replace VC equity dollars for compute funding. → Some VC funds are rumored to be offering GPUs for equity, but we're yet to see anyone go down the debt route.
- An AI-generated song breaks into the Billboard Hot 100 Top 10 or the Spotify Top Hits 2024. → It turns out this had already happened last year with "Heart on My Sleeve", but we've also seen an AI-generated song reach #27 in Germany and spend several days in the Top 50.
- As inference workloads and costs grow significantly, a large AI company (e.g. OpenAI) acquires or builds an inference-focused AI chip company. → Sam Altman is reportedly raising huge sums of money to do this, while each of Google, Amazon, Meta and Microsoft continue to build and improve their own AI silicon.

Research highlights:

● On both formal benchmarks and vibes-based analysis, the best-funded frontier labs are able to rack up scores within low single digits of each other on individual capabilities. Models are now consistently highly capable coders and are strong at factual recall and math, but less good at open-ended question-answering and multi-modal problem solving. Many of the variations are sufficiently small that they are now likely to be the product of differences in implementation. For example, GPT-4o outperforms Claude 3.5 Sonnet on MMLU, but apparently underperforms it on MMLU-Pro, a benchmark designed to be more challenging. Considering the relatively subtle technical differences between architectures and likely heavy overlaps in pre-training data, model builders are now increasingly having to compete on new capabilities and product.

● By shifting compute from pre- and post-training to inference, o1 reasons through complex prompts step-by-step in a chain-of-thought (CoT) style, employing RL to sharpen the CoT and the strategies it uses. This unlocks the possibility of solving multi-layered math, science, and coding problems where LLMs have historically struggled due to the inherent limitations of next-token prediction. OpenAI reports significant improvements on reasoning-heavy benchmarks versus 4o, with the starkest on AIME 2024 (competition math): a whopping score of 83.83 versus 13.4. However, this capability comes at a steep price: 1M input tokens of o1-preview cost $15, while 1M output tokens will set you back $60. This makes it 3-4x more expensive than GPT-4o. OpenAI is clear in its API documentation that it is not a like-for-like 4o replacement and that it is not the best model for tasks that require consistently quick responses, image inputs, or function calling.
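As a back-of-envelope check on those numbers: the o1-preview rates are as quoted above, while the GPT-4o rates used for comparison ($5 per 1M input tokens, $15 per 1M output tokens) are our assumption, chosen to be consistent with the quoted 3-4x multiple; `request_cost` is an illustrative helper.

```python
# Token-price arithmetic for the figures quoted above. Note that o1 also
# bills its hidden reasoning tokens as output tokens, which inflates output
# counts in practice beyond what the visible answer suggests.
PRICES_PER_1M = {            # USD per 1M tokens: (input, output)
    "o1-preview": (15.00, 60.00),   # from the text
    "gpt-4o":     (5.00, 15.00),    # assumed for comparison
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    p_in, p_out = PRICES_PER_1M[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

# a reasoning-heavy request: short prompt, long (partly hidden) output
cost_o1 = request_cost("o1-preview", 2_000, 8_000)
cost_4o = request_cost("gpt-4o", 2_000, 8_000)
print(f"o1-preview ${cost_o1:.3f} vs gpt-4o ${cost_4o:.3f} "
      f"({cost_o1 / cost_4o:.1f}x more expensive)")
# -> o1-preview $0.510 vs gpt-4o $0.130 (3.9x more expensive)
```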
● Meta stuck to the same decoder-only transformer architecture that it has used since Llama 1, with minor adaptations, namely more transformer layers and attention heads. Meta used an incredible 15T tokens to train the family. While this blew through the "Chinchilla-optimal" amount of training compute, they found that both the 8B and 70B models improved log-linearly up to 15T. Llama 3.1 405B was trained over 16,000 H100 GPUs, the first Llama model trained at this scale. Meta followed up with Llama 3.2 in September, which incorporated 11B and 90B VLMs (Llama's multimodal debut). The former was competitive with Claude 3 Haiku, the latter with GPT-4o-mini. The company also released 1B and 3B text-only models, designed to operate on-device. Llama-based models have now racked up over 440M downloads on Hugging Face.

● A team from the University of Edinburgh flagged up the number of mistakes in MMLU, including wrong ground truths, unclear questions, and multiple correct answers. While low across most individual topics, there were big spikes in certain fields, such as virology, where 57% of the analyzed instances contained errors. On a manually corrected MMLU subset, models broadly gained in performance, although they worsened on professional law and formal logic. This suggests inaccurate MMLU instances are being learned during pre-training. In more safety-critical territory, OpenAI has warned that SWE-bench, which evaluates models' ability to solve real-world software issues, was underestimating the autonomous software engineering capabilities of models, as it contained tasks that were hard or impossible to solve. The researchers partnered with the creators of the benchmark to create SWE-bench Verified, a validated subset of the tasks.

● The Chatbot Arena, which allows users to interact with two randomly selected chatbots side-by-side, provides a rough crowdsourced evaluation. However, eyebrows have been raised by [...] receiving the same scores, with the latter also outperforming Claude Sonnet 3.5. This has led to concerns that the ranking is essentially becoming a way of assessing which writing style users happen to prefer most. Additionally, as smaller models tend to perform less well on tasks involving more tokens, the 8k context limit arguably gives them an unfair advantage. However, the early version of the vision leaderboard is now beginning to gain traction and aligns better with other evals.

● A Google DeepMind/NYU team generated millions of synthetic theorems and proofs using symbolic engines, using them to train a language model from scratch. AlphaGeometry alternates between the language model proposing new constructions and symbolic engines performing deductions until a solution is found. Impressively, it solved 25 out of 30 problems on a benchmark of Olympiad-level geometry, nearing human International Mathematical Olympiad gold medalist performance. The next best AI performance scored only 10. It also demonstrated generalisation capabilities - for example, finding that a specific detail in a 2004 IMO problem was unnecessary for the proof.

● A Meta/MIT team looking at open-weight pre-trained LLMs concluded that it's possible to do away with up to half a model's layers and suffer only negligible performance drops on question-answering benchmarks. They identified optimal layers for removal based on similarity (sketched below) and then "healed" the model through small amounts of efficient fine-tuning. NVIDIA researchers took a more radical approach by pruning layers, neurons, attention heads, and embeddings, and then using knowledge distillation for efficient retraining. The resulting models achieved comparable or superior performance to models like Mistral 7B and Llama-3 8B while using up to 40x fewer training tokens.
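A minimal sketch of the similarity heuristic behind the Meta/MIT-style approach above, assuming that a block of layers whose output hidden states barely differ from their inputs is nearly redundant and can be dropped. The angular-distance metric, helper names, and random stand-in activations are illustrative; the "healing" fine-tune that follows pruning is omitted.

```python
# Minimal sketch: score each contiguous block of n layers by how similar the
# hidden states entering the block are to those leaving it; the most similar
# (lowest angular distance) block is the best candidate for removal.
import numpy as np

def angular_distance(a: np.ndarray, b: np.ndarray) -> float:
    # mean angular distance between paired hidden-state vectors, in [0, 1]
    cos = np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    )
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi))

def best_block_to_prune(hidden_states: list, n: int) -> int:
    # hidden_states[l] holds activations entering layer l (num_layers + 1 entries)
    scores = [
        angular_distance(hidden_states[l], hidden_states[l + n])
        for l in range(len(hidden_states) - n)
    ]
    return int(np.argmin(scores))  # index of the most redundant block

# toy stand-in for activations collected from a real model on real prompts
rng = np.random.default_rng(0)
num_layers, tokens, dim = 32, 64, 128
states = [rng.normal(size=(tokens, dim)) for _ in range(num_layers + 1)]
start = best_block_to_prune(states, n=8)
print(f"prune layers {start}..{start + 7}, then heal with light fine-tuning")
```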
● Distillation is becoming mainstream. Google has embraced the approach, distilling Gemini 1.5 Flash from Gemini 1.5 Pro, while Gemma 2 9B was distilled from Gemma 2 27B, and Gemma 2B from a larger unreleased model. There is also community speculation that Claude 3 Haiku, a highly capable smaller model, is a distilled version of the larger Opus, but Anthropic has never confirmed this. These distillation efforts are going multimodal too: Black Forest Labs have released FLUX.1 dev, an open-weight text-to-image model distilled from their Pro model. To support these efforts, the community has started to produce open-source distillation tools, like arcee.ai's DistillKit, which supports both logit-based and hidden-states-based distillation. Llama 3.1 405B is also being used for distillation, after Meta updated its terms so output logits can be used to improve any models, not just Llama ones.
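Since several of the models above were built this way, a minimal sketch of the logit-based distillation objective may be useful; it is one of the two modes DistillKit supports, and the kind of signal Meta's updated Llama 3.1 terms now allow to be reused. Random logits stand in for real teacher and student models, and the temperature value is illustrative.

```python
# Minimal sketch of logit-based knowledge distillation: soften both output
# distributions with a temperature, then train the student to match the
# teacher via KL divergence.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

batch, vocab = 4, 32000
teacher = torch.randn(batch, vocab)            # frozen large model's logits
student = torch.randn(batch, vocab, requires_grad=True)
loss = distillation_loss(student, teacher)
loss.backward()                                # gradients flow to the student only
print(f"distillation loss: {loss.item():.3f}")
```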
● Microsoft's phi-3.5-mini is a 3.8B LM that competes with larger models in the 7B-8B class, like Llama 3.1 8B. It performs well on reasoning and question-answering, but its size restricts its factual knowledge. To enable on-device inference, the model was quantized to 4 bits, reducing its memory footprint to approximately 1.8GB.

● Apple introduced MobileCLIP, a family of efficient image-text models optimized for fast inference on smartphones. Using novel multimodal reinforced training, they improve the accuracy of compact models by transferring knowledge from an image captioning model and an ensemble of strong CLIP encoders.

● Hugging Face also got in on the action with SmolLM, a family of small language models available in 135M, 360M, and 1.7B formats. By using a highly curated synthetic dataset created via an enhanced version of Cosmopedia (see slide 31), the team achieved SOTA performance for the size.

● Microsoft's BitNet uses a "BitLinear" layer to replace standard linear layers, employing 1-bit weights and quantized activations. It shows competitive performance compared to full-precision models and demonstrates a scaling law similar to full-precision transformers, with significant memory and energy savings. Microsoft followed up with BitNet b1.58, which uses ternary weights to match full-precision LLM performance at 3B size while retaining the efficiency gains (a simplified sketch of such a layer follows below).

● Meanwhile, ByteDance's TiTok (Transformer-based 1-Dimensional Tokenizer) quantizes images into compact 1D sequences of discrete tokens for image reconstruction and generation tasks. This allows images to be represented with as few as 32 tokens, instead of hundreds or thousands.
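A simplified, forward-only sketch of a BitNet b1.58-style ternary linear layer, assuming per-tensor absmean scaling for the weights and per-row absmax 8-bit scaling for the activations. The paper's actual layer (norm placement, straight-through estimators for training) is more involved, and `TernaryLinear` is our illustrative name.

```python
# Sketch of a b1.58-style linear layer: weights live in {-1, 0, +1} with one
# scale, activations are quantized to signed 8-bit before the matmul.
import torch
import torch.nn as nn

class TernaryLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # ternarize weights: scale by mean |w|, round, clip to {-1, 0, 1}
        scale = self.weight.abs().mean().clamp(min=1e-5)
        w_q = (self.weight / scale).round().clamp(-1, 1)
        # absmax-quantize activations to signed 8-bit
        a_scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-5) / 127.0
        x_q = (x / a_scale).round().clamp(-128, 127)
        # integer-friendly matmul, then undo both scales
        return (x_q @ w_q.t()) * a_scale * scale

layer = TernaryLinear(256, 128)
out = layer(torch.randn(2, 256))
w_q = (layer.weight / layer.weight.abs().mean()).round().clamp(-1, 1).detach()
print(out.shape, "distinct weight values:", w_q.unique().tolist())
```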
● Inspired by model interpretability research, ReFT (Representation Fine-tuning) doesn't alter the model's weights. Instead, it manipulates the model's internal representations at inference time to steer its behavior. While it comes with a slight inference penalty, ReFT requires 15-65x fewer parameters compared to weight-based fine-tuning methods. It also enables more selective interventions on specific layers and token positions, enabling fine-grained control over the adaptation process. The researchers show its potential in few-shot adaptation, where a chat model is given a new persona with just five examples. Combined with the small storage footprint for learned interventions, it could be used for real-time personalization on devices with sufficient compute power.

● Selective state-space models like Mamba, designed last year to handle long sequences more efficiently, can to some extent compete with transformers, but lag on tasks that require copying or in-context learning. That said, Falcon's Mamba 7B shows impressive benchmark performance versus similar-sized transformer models. Hybrid models appear to be a more promising direction: combined with self-attention and MLP layers, AI21's Mamba-Transformer hybrid model outperforms the 8B Transformer across knowledge and reasoning benchmarks, while being up to 8x faster at generating tokens in inference. In a nostalgia trip, there are early signs of a comeback for recurrent neural networks, which had fallen out of fashion due to training and scaling difficulties. Griffin, trained by Google DeepMind, mixes linear recurrences and local attention, holding its own against Llama-2 while being trained on 6x fewer tokens.

● MOHAWK is a new method for distilling knowledge from a large, pre-trained transformer model (teacher) to a smaller, subquadratic model (student) like a state-space model (SSM). It (i) aligns the sequence transformation matrices of the student and teacher models, (ii) aligns the hidden states of each layer, and then (iii) transfers the remaining weights of the teacher model to the student model to fine-tune it. The authors create Phi-Mamba, a new student model combining Mamba-2 and an MLP block, and a variant called Hybrid-Phi-Mamba that retains some attention layers from the teacher model. MOHAWK can train Phi-Mamba and Hybrid-Phi-Mamba to achieve performance close to the teacher model. Phi-Mamba is distilled with only 3B tokens, less than 1% of the data used to train the previously best-performing Mamba models, and 2% of the data used to train the Phi-1.5 model itself.

● As well as being the main source of training data for the Phi family, synthetic data was used by Anthropic when training Claude 3 to help represent scenarios that might have been missing from the training data. Hugging Face used Mixtral-8x7B Instruct to generate over 30M files and 25B tokens of synthetic textbooks, blog posts, and stories to recreate the Phi-1.5 training dataset, which they dubbed Cosmopedia. To make this process easier, NVIDIA released the Nemotron-4-340B family, a suite of models designed specifically for synthetic data generation, available via a permissive license. Meta's Llama can also be used for synthetic data generation. It also appears possible to create synthetic high-quality instruction data by extracting it directly from an aligned LLM, with techniques like Magpie. Models fine-tuned this way sometimes perform comparably to Llama-3-8B-Instruct.

● A Nature paper from Oxford and Cambridge researchers found model collapse occurs across various AI architectures, including fine-tuned language models, challenging the idea that pre-training or periodic exposure to small amounts of original data can prevent degradation (measured by perplexity score). This creates a "first mover advantage", as sustained access to diverse, human-generated data will become increasingly critical for maintaining model quality. However, these results are primarily focused on a scenario where real data is replaced with synthetic data over generations. In practice, real and synthetic data usually accumulate. Other research suggests that, provided the proportion of synthetic data doesn't get too high, collapse can usually be avoided.

● FineWeb, the dataset, was created through a multi-step process including base filtering, independent MinHash deduplication per dump (a toy sketch of MinHash follows below), selected filters derived from the C4 dataset, and the team's custom filters. Text extraction using the trafilatura library produced higher quality data than the default Common Crawl WET files, even though the resulting dataset was meaningfully smaller. They found deduplication drove performance improvements up to a point, before hitting diminishing returns and then actually worsening performance. The team also used llama-3-70b-instruct to annotate 500k samples from FineWeb, scoring each for its educational quality on a scale from 0 to 5. FineWeb-edu, which filtered out samples scoring below 3, outperformed FineWeb and all other open datasets, despite being significantly smaller.
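A toy sketch of MinHash near-deduplication of the kind FineWeb applies per dump: each document is reduced to k minimum hash values over its word shingles, and the fraction of matching values estimates Jaccard similarity. The shingle size, k, and helper names are illustrative; a production pipeline adds banding/LSH to avoid comparing every pair of documents.

```python
# Toy MinHash: near-duplicate documents share most of their shingles, so
# their minimum hash values agree often; unrelated documents rarely agree.
import hashlib

def shingles(text: str, n: int = 5):
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))}

def minhash_signature(text: str, k: int = 64):
    sig = []
    for seed in range(k):
        # salt the hash with the seed to simulate k independent hash functions
        sig.append(min(
            int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingles(text)
        ))
    return sig

def estimated_jaccard(a: str, b: str, k: int = 64) -> float:
    sa, sb = minhash_signature(a, k), minhash_signature(b, k)
    return sum(x == y for x, y in zip(sa, sb)) / k

doc = "the quick brown fox jumps over the lazy dog near the river bank today"
near_dup = doc.replace("today", "tonight")
other = "completely different text about ai models and training data pipelines"
print(f"near-duplicate: {estimated_jaccard(doc, near_dup):.2f}")  # high, ~0.8
print(f"unrelated:      {estimated_jaccard(doc, other):.2f}")     # ~0.0
```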
● Following the playbook that has proven effective in regular LLMs, massive performance improvements in embedding models have come from scale (GritLM has ~47B parameters vs the 110M common among prior embedding models). Similarly, the use of broad web-scale corpora and improved filtering methods has led to large improvements in the smaller models. Meanwhile, ColPali is a vision-language embedding model that exploits the visual structure of documents, not just their text embeddings, to improve retrieval. Retrieval models are one of the few subdomains where open models commonly outperform proprietary models from the biggest labs: on the MTEB Retrieval Leaderboard, open models occupy the top spots.

● Individually embedded chunks can lose the surrounding document context they need in order to be retrieved reliably. Anthropic addressed this using "contextual embeddings", where a prompt instructs the model to generate text explaining the context of each chunk in the document. They found that this approach leads to a 35% reduction in the top-20 retrieval failure rate (5.7% → 3.7%), and it can be scaled using Anthropic's prompt caching. As Fernando Diaz of CMU observed in a recent thread, this is a great example of techniques pioneered in one area of AI research (e.g. early speech retrieval and document expansion work) being applied to another - another version of "what is old is new again". Relatedly, research from Chroma shows that the choice of chunking strategy can affect retrieval performance by up to 9% in recall.

● Researchers are now pioneering novel approaches to RAG evaluation, like Ragnarök, which introduces a novel web-based arena for human evaluation through pairwise system comparisons. This addresses the challenge of assessing RAG quality beyond traditional automated metrics. Meanwhile, Researchy Questions provides a large-scale collection of complex, multi-faceted questions that require in-depth research and analysis to answer, drawn from real user queries.

● Google DeepMind has proposed Distributed Low-Communication (DiLoCo), an optimization algorithm that allows training to occur on multiple loosely connected "islands" of devices. Each island performs a large number of local update steps before communicating with the others, reducing the need for frequent data exchange. The team demonstrated fully synchronous optimization across 8 of these islands while reducing communication 500x. GDM also proposed a refined version of DiLoCo optimized for asynchronous settings. Researchers at Prime Intellect released an open-source implementation and replication of DiLoCo, scaling it up 3x to demonstrate its effectiveness on 1B parameter models.
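A toy sketch of the DiLoCo communication pattern on a shared least-squares problem: each "island" takes many local gradient steps between rare synchronizations, and the averaged weight delta is treated as an outer "pseudo-gradient" fed to a momentum optimizer. The actual recipe (inner AdamW, outer Nesterov momentum, transformer weights) and all hyperparameters here are simplified stand-ins.

```python
# Toy DiLoCo-style loop: many cheap local steps per island, one round of
# communication per outer step instead of one per gradient step.
import numpy as np

rng = np.random.default_rng(0)
dim, islands, outer_steps, local_steps = 8, 4, 25, 100
w_true = rng.normal(size=dim)

# each island holds its own data shard of the same underlying problem
shards = []
for _ in range(islands):
    X = rng.normal(size=(100, dim))
    shards.append((X, X @ w_true + 0.01 * rng.normal(size=100)))

theta = np.zeros(dim)      # globally shared weights
momentum = np.zeros(dim)   # outer optimizer state

for _ in range(outer_steps):
    local = []
    for X, y in shards:                     # parallel in the real setting
        w = theta.copy()
        for _ in range(local_steps):        # local updates, no communication
            w -= 0.05 * X.T @ (X @ w - y) / len(y)
        local.append(w)
    # one communication round: average the weight deltas ("pseudo-gradient")
    pseudo_grad = theta - np.mean(local, axis=0)
    momentum = 0.5 * momentum + pseudo_grad
    theta -= momentum                       # outer SGD step with momentum

print(f"distance to optimum: {np.linalg.norm(theta - w_true):.4f}")
```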
