State of AI Report 2024 (October 2024) - Air Street Capital
Nathan Benaich is the General Partner of Air Street Capital, a venture capital firm investing in AI-first companies. He runs the Research and Applied AI Summit (RAAIS), the RAAIS Foundation (funding open-source AI projects), AI communities in the US and Europe, and Spinout.fyi (improving university spinout creation). He studied biology at Williams College and earned a PhD from Cambridge in cancer research as a Gates Scholar.

Alex Chalmers regularly writes research, analysis, and commentary on AI. He was previously an associate director at Milltown Partners, where he advised big technology companies, start-ups, and investors on policy and positioning. He graduated from the University of Oxford in 2017 with a degree in History.

Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines. We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.

The State of AI Report is now in its seventh year. Consider this report a compilation of the most interesting things we've seen, with a goal of triggering an informed conversation about the state of AI and its implications for the future. We consider the following key dimensions in our report:

- Research: Technology breakthroughs and their capabilities.
- Industry: Areas of commercial application for AI and its business impact.
- Politics: Regulation of AI, its economic implications and the evolving geopolitics of AI.
- Safety: Identifying and mitigating catastrophic risks that highly-capable future AI systems could pose to us.
- Predictions: What we believe will happen in the next 12 months and a 2023 performance review to keep us honest.

Definitions:

- Artificial intelligence (AI): a broad discipline with the goal of creating intelligent machines, as opposed to the natural intelligence that is demonstrated by humans and animals.
- Artificial general intelligence (AGI): a term used to describe future machines that could match and then exceed the full range of human cognitive ability across all economically valuable tasks.
- AI Agent: an AI-powered system that can take actions in an environment. For example, an LLM that has access to a suite of tools and has to decide which one to use in order to accomplish a task that it has been prompted to do.
- AI Safety: a field that studies and attempts to mitigate the risks (minor to catastrophic) which future AI could pose to humanity.
- Computer vision (CV): the ability of a program to analyse and understand images and video.
- Deep learning (DL): an approach to AI inspired by how neurons in the brain recognise complex patterns in data. The "deep" refers to the many layers of neurons in today's models that help to learn rich representations of data to achieve better performance.
- Diffusion: an algorithm that iteratively denoises an artificially corrupted signal in order to generate new, high-quality outputs. In recent years it has been at the forefront of image generation and protein design.
- Generative AI: a family of AI systems that are capable of generating new content (e.g. text, images, audio, or 3D assets) based on "prompts".
- Graphics Processing Unit (GPU): a semiconductor processing unit that enables a large number of calculations to be computed in parallel. Historically, this was required for rendering computer graphics. Since 2012, GPUs have been adapted for training DL models, which also require a large number of parallel calculations.
- (Large) Language model (LM, LLM): a model trained on vast amounts of (often) textual data to predict the next word in a self-supervised manner. The term "LLM" is used to designate multi-billion parameter LMs, but this is a moving definition.
- Machine learning (ML): a subset of AI that often uses statistical techniques to give machines the ability to "learn" from data without being explicitly given the instructions for how to do so. This process is known as "training" a "model" using a learning "algorithm" that progressively improves model performance on a specific task.
- Model: a ML algorithm trained on data and used to make predictions.
- Natural language processing (NLP): the ability of a program to understand and generate human language.
- Prompt: a user input, often written in natural language, that is used to instruct an LLM to generate something or take action.
- Reinforcement learning (RL): an area of ML in which software agents learn goal-oriented behavior by trial and error in an environment that provides rewards or penalties in response to their actions (called a "policy") towards achieving that goal.
- Self-supervised learning (SSL): a form of unsupervised learning where manually labeled data is not needed. Raw data is instead modified in an automated way to create artificial labels to learn from. An example of SSL is learning to complete text by masking random words in a sentence and trying to predict the missing ones (a minimal code sketch follows this list).
- Transformer: a model architecture at the core of most state-of-the-art (SOTA) ML research. It is composed of multiple "attention" layers which learn which parts of the input data are the most important for a given task. Transformers started in NLP (specifically machine translation) and were subsequently expanded into computer vision, audio, and other modalities.
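To make the SSL example above concrete, here is a minimal sketch of turning raw text into (input, label) training pairs with no human annotation. The `make_masked_examples` helper is purely illustrative, and the model that would learn from these pairs is omitted.

```python
# Minimal sketch of masked-word self-supervision: the raw sentence supplies
# its own labels, so no manual annotation is needed.
import random

def make_masked_examples(sentence: str, mask_rate: float = 0.3, seed: int = 0):
    rng = random.Random(seed)
    tokens = sentence.split()
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append("[MASK]")
            targets[i] = tok  # the artificial label is just the hidden token
        else:
            masked.append(tok)
    return " ".join(masked), targets

inputs, labels = make_masked_examples(
    "self supervised learning creates training labels from the raw data itself"
)
print(inputs)  # the sentence with some words replaced by [MASK]
print(labels)  # e.g. {2: 'learning', 8: 'raw'} -- positions of held-out words
```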
[Icon legend: in the rest of the report's slides, icons in the top right corner indicate a model's input and output modalities, covering text, software tool use (text, code generation & execution), image, 3D, robot state, and biological modalities, plus combinations such as text→software tool use, image→3D, and text→3D.]

Executive summary:

- Frontier lab performance converges, but OpenAI maintains its edge following the launch of o1, as planning and reasoning emerge as a major frontier.
- Foundation models demonstrate their ability to break out of language as multimodal research drives into mathematics, biology, genomics, the physical sciences, and neuroscience.
- US sanctions fail to stop Chinese (V)LLMs rising up community leaderboards.
- NVIDIA remains the most powerful company in the world, enjoying a stint in the $3T club, while regulators probe the concentrations of power within GenAI.
- More established GenAI companies bring in billions of dollars in revenue, while start-ups begin to gain traction in sectors like video and audio generation. Although companies begin to make the journey from model to product, long-term questions around pricing and sustainability remain unresolved.
- Driven by a bull run in public markets, AI companies reach $9T in value, while investment levels grow healthily in private companies.
- While global governance efforts stall, national and regional AI regulation has continued to advance, with controversial legislation passing in the US and EU.
- The reality of compute requirements forces Big Tech companies to reckon with real-world physical constraints on scaling and their own emissions targets. Meanwhile, governments' own attempts to build capacity continue to lag.
- Anticipated AI effects on elections, employment and a range of other sensitive areas are yet to be realized at any scale.
- A vibe-shift from safety to acceleration takes place as companies that previously warned us about the pending extinction of humanity need to ramp up enterprise sales and usage of their consumer apps.
- Governments around the world emulate the UK in building up state capacity around AI safety, launching institutes and studying critical national infrastructure for potential vulnerabilities.
- Every proposed jailbreaking "fix" has failed, but researchers are increasingly concerned with more sophisticated, long-term attacks.

Last year's predictions, reviewed:

- A Hollywood-grade production makes use of generative AI for visual effects. → Largely badly, but GenAI visual effects have been seen in Netflix and HBO productions.
- A generative AI media company is investigated for its misuse during the 2024 US election circuit. → Not yet, but there's still time.
- Self-improving AI agents crush SOTA in a complex environment (e.g. AAA game, tool use, science). → Not yet, despite promising work on open-endedness, including strong game performance.
- Tech IPO markets thaw and we see at least one major listing for an AI-focused company (e.g. DBRX). → While the Magnificent Seven have enjoyed strong gains, private companies are hanging on until markets settle. However, AI chip company Cerebras has filed to IPO.
- The GenAI scaling craze sees a group spend >$1B to train a single large-scale model. → Not quite yet - let's give it another year.
- The US's FTC or UK's CMA investigate the Microsoft/OpenAI deal on competition grounds. → Both regulators are investigating this partnership.
- We see limited progress on global AI governance beyond high-level voluntary commitments. → The commitments from the Bletchley and Seoul summits remain voluntary and high-level.
- Financial institutions launch GPU debt funds to replace VC equity dollars for compute funding. → Some VC funds are rumored to be offering GPUs for equity, but we're yet to see anyone go down the debt route.
- An AI-generated song breaks into the Billboard Hot 100 Top 10 or the Spotify Top Hits 2024. → It turns out this had already happened last year with "Heart on My Sleeve", but we've also seen an AI-generated song reach #27 in Germany and spend several days in the Top 50.
- As inference workloads and costs grow significantly, a large AI company (e.g. OpenAI) acquires or builds an inference-focused AI chip company. → Sam Altman is reportedly raising huge sums of money to do this, while each of Google, Amazon, Meta and Microsoft continue to build and improve their own AI silicon.

Research highlights:

● On both formal benchmarks and vibes-based analysis, the best-funded frontier labs are able to rack up scores within low single digits of each other on individual capabilities. Models are now consistently highly capable coders and are strong at factual recall and math, but less good at open-ended question-answering and multi-modal problem solving. Many of the variations are sufficiently small that they are now likely to be the product of differences in implementation. For example, GPT-4o outperforms Claude 3.5 Sonnet on MMLU, but apparently underperforms it on MMLU-Pro, a benchmark designed to be more challenging. Considering the relatively subtle technical differences between architectures and likely heavy overlaps in pre-training data, model builders are now increasingly having to compete on new capabilities and product.

● By shifting compute from pre- and post-training to inference, o1 reasons through complex prompts step-by-step in a chain-of-thought (CoT) style, employing RL to sharpen the CoT and the strategies it uses. This unlocks the possibility of solving multi-layered math, science, and coding problems where LLMs have historically struggled due to the inherent limitations of next-token prediction. OpenAI reports significant improvements on reasoning-heavy benchmarks versus 4o, with the starkest on AIME 2024 (competition math): a whopping score of 83.83 versus 13.4. However, this capability comes at a steep price: 1M input tokens of o1-preview cost $15, while 1M output tokens will set you back $60. This makes it 3-4x more expensive than GPT-4o. OpenAI is clear in its API documentation that it is not a like-for-like 4o replacement and that it is not the best model for tasks that require consistently quick responses, image inputs, or function calling.
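As a back-of-envelope check on those numbers: the o1-preview rates are as quoted above, while the GPT-4o rates used for comparison ($5 per 1M input tokens, $15 per 1M output tokens) are our assumption, chosen to be consistent with the quoted 3-4x multiple; `request_cost` is an illustrative helper.

```python
# Token-price arithmetic for the figures quoted above. Note that o1 also
# bills its hidden reasoning tokens as output tokens, which inflates output
# counts in practice beyond what the visible answer suggests.
PRICES_PER_1M = {            # USD per 1M tokens: (input, output)
    "o1-preview": (15.00, 60.00),   # from the text
    "gpt-4o":     (5.00, 15.00),    # assumed for comparison
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    p_in, p_out = PRICES_PER_1M[model]
    return (input_tokens * p_in + output_tokens * p_out) / 1_000_000

# a reasoning-heavy request: short prompt, long (partly hidden) output
cost_o1 = request_cost("o1-preview", 2_000, 8_000)
cost_4o = request_cost("gpt-4o", 2_000, 8_000)
print(f"o1-preview ${cost_o1:.3f} vs gpt-4o ${cost_4o:.3f} "
      f"({cost_o1 / cost_4o:.1f}x more expensive)")
# -> o1-preview $0.510 vs gpt-4o $0.130 (3.9x more expensive)
```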
● Meta stuck to the same decoder-only transformer architecture that it has used since Llama 1, with minor adaptations, namely more transformer layers and attention heads. Meta used an incredible 15T tokens to train the family. While this blew through the "Chinchilla-optimal" amount of training compute, they found that both the 8B and 70B models improved log-linearly up to 15T. Llama 3.1 405B was trained over 16,000 H100 GPUs, the first Llama model trained at this scale. Meta followed up with Llama 3.2 in September, which incorporated 11B and 90B VLMs (Llama's multimodal debut). The former was competitive with Claude 3 Haiku, the latter with GPT-4o-mini. The company also released 1B and 3B text-only models, designed to operate on-device. Llama-based models have now racked up over 440M downloads on Hugging Face.

● A team from the University of Edinburgh flagged up the number of mistakes in MMLU, including wrong ground truths, unclear questions, and multiple correct answers. While low across most individual topics, there were big spikes in certain fields, such as virology, where 57% of the analyzed instances contained errors. On a manually corrected MMLU subset, models broadly gained in performance, although they worsened on professional law and formal logic. This suggests inaccurate MMLU instances are being learned during pre-training. In more safety-critical territory, OpenAI has warned that SWE-bench, which evaluates models' ability to solve real-world software issues, was underestimating the autonomous software engineering capabilities of models, as it contained tasks that were hard or impossible to solve. The researchers partnered with the creators of the benchmark to create SWE-bench Verified, a validated subset of the tasks.

● The Chatbot Arena, which allows users to interact with two randomly selected chatbots side-by-side, provides a rough crowdsourced evaluation. However, eyebrows have been raised by [...] receiving the same scores, with the latter also outperforming Claude Sonnet 3.5. This has led to concerns that the ranking is essentially becoming a way of assessing which writing style users happen to prefer most. Additionally, as smaller models tend to perform less well on tasks involving more tokens, the 8k context limit arguably gives them an unfair advantage. However, the early version of the vision leaderboard is now beginning to gain traction and aligns better with other evals.

● A Google DeepMind/NYU team generated millions of synthetic theorems and proofs using symbolic engines, using them to train a language model from scratch. AlphaGeometry alternates between the language model proposing new constructions and symbolic engines performing deductions until a solution is found. Impressively, it solved 25 out of 30 problems on a benchmark of Olympiad-level geometry, nearing human International Mathematical Olympiad gold medalist performance. The next best AI performance scored only 10. It also demonstrated generalisation capabilities - for example, finding that a specific detail in a 2004 IMO problem was unnecessary for the proof.

● A Meta/MIT team looking at open-weight pre-trained LLMs concluded that it's possible to do away with up to half a model's layers and suffer only negligible performance drops on question-answering benchmarks. They identified optimal layers for removal based on similarity (sketched below) and then "healed" the model through small amounts of efficient fine-tuning. NVIDIA researchers took a more radical approach by pruning layers, neurons, attention heads, and embeddings, and then using knowledge distillation for efficient retraining. The resulting models achieved comparable or superior performance to models like Mistral 7B and Llama-3 8B while using up to 40x fewer training tokens.
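A minimal sketch of the similarity heuristic behind the Meta/MIT-style approach above, assuming that a block of layers whose output hidden states barely differ from their inputs is nearly redundant and can be dropped. The angular-distance metric, helper names, and random stand-in activations are illustrative; the "healing" fine-tune that follows pruning is omitted.

```python
# Minimal sketch: score each contiguous block of n layers by how similar the
# hidden states entering the block are to those leaving it; the most similar
# (lowest angular distance) block is the best candidate for removal.
import numpy as np

def angular_distance(a: np.ndarray, b: np.ndarray) -> float:
    # mean angular distance between paired hidden-state vectors, in [0, 1]
    cos = np.sum(a * b, axis=-1) / (
        np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    )
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi))

def best_block_to_prune(hidden_states: list, n: int) -> int:
    # hidden_states[l] holds activations entering layer l (num_layers + 1 entries)
    scores = [
        angular_distance(hidden_states[l], hidden_states[l + n])
        for l in range(len(hidden_states) - n)
    ]
    return int(np.argmin(scores))  # index of the most redundant block

# toy stand-in for activations collected from a real model on real prompts
rng = np.random.default_rng(0)
num_layers, tokens, dim = 32, 64, 128
states = [rng.normal(size=(tokens, dim)) for _ in range(num_layers + 1)]
start = best_block_to_prune(states, n=8)
print(f"prune layers {start}..{start + 7}, then heal with light fine-tuning")
```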
● Distillation is becoming mainstream. Google has embraced the approach, distilling Gemini 1.5 Flash from Gemini 1.5 Pro, while Gemma 2 9B was distilled from Gemma 2 27B, and Gemma 2B from a larger unreleased model. There is also community speculation that Claude 3 Haiku, a highly capable smaller model, is a distilled version of the larger Opus, but Anthropic has never confirmed this. These distillation efforts are going multimodal too: Black Forest Labs have released FLUX.1 dev, an open-weight text-to-image model distilled from their Pro model. To support these efforts, the community has started to produce open-source distillation tools, like arcee.ai's DistillKit, which supports both logit-based and hidden-states-based distillation. Llama 3.1 405B is also being used for distillation, after Meta updated its terms so output logits can be used to improve any models, not just Llama ones.
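Since several of the models above were built this way, a minimal sketch of the logit-based distillation objective may be useful; it is one of the two modes DistillKit supports, and the kind of signal Meta's updated Llama 3.1 terms now allow to be reused. Random logits stand in for real teacher and student models, and the temperature value is illustrative.

```python
# Minimal sketch of logit-based knowledge distillation: soften both output
# distributions with a temperature, then train the student to match the
# teacher via KL divergence.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature: float = 2.0):
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    s = F.log_softmax(student_logits / temperature, dim=-1)
    t = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * temperature ** 2

batch, vocab = 4, 32000
teacher = torch.randn(batch, vocab)            # frozen large model's logits
student = torch.randn(batch, vocab, requires_grad=True)
loss = distillation_loss(student, teacher)
loss.backward()                                # gradients flow to the student only
print(f"distillation loss: {loss.item():.3f}")
```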
● Microsoft's phi-3.5-mini is a 3.8B LM that competes with larger models in the 7B-8B class, like Llama 3.1 8B. It performs well on reasoning and question-answering, but its size restricts its factual knowledge. To enable on-device inference, the model was quantized to 4 bits, reducing its memory footprint to approximately 1.8GB.

● Apple introduced MobileCLIP, a family of efficient image-text models optimized for fast inference on smartphones. Using novel multimodal reinforced training, they improve the accuracy of compact models by transferring knowledge from an image captioning model and an ensemble of strong CLIP encoders.

● Hugging Face also got in on the action with SmolLM, a family of small language models available in 135M, 360M, and 1.7B formats. By using a highly curated synthetic dataset created via an enhanced version of Cosmopedia (see slide 31), the team achieved SOTA performance for the size.

● Microsoft's BitNet uses a "BitLinear" layer to replace standard linear layers, employing 1-bit weights and quantized activations. It shows competitive performance compared to full-precision models and demonstrates a scaling law similar to full-precision transformers, with significant memory and energy savings. Microsoft followed up with BitNet b1.58, which uses ternary weights to match full-precision LLM performance at 3B size while retaining the efficiency gains (a simplified sketch of such a layer follows below).

● Meanwhile, ByteDance's TiTok (Transformer-based 1-Dimensional Tokenizer) quantizes images into compact 1D sequences of discrete tokens for image reconstruction and generation tasks. This allows images to be represented with as few as 32 tokens, instead of hundreds or thousands.
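A simplified, forward-only sketch of a BitNet b1.58-style ternary linear layer, assuming per-tensor absmean scaling for the weights and per-row absmax 8-bit scaling for the activations. The paper's actual layer (norm placement, straight-through estimators for training) is more involved, and `TernaryLinear` is our illustrative name.

```python
# Sketch of a b1.58-style linear layer: weights live in {-1, 0, +1} with one
# scale, activations are quantized to signed 8-bit before the matmul.
import torch
import torch.nn as nn

class TernaryLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # ternarize weights: scale by mean |w|, round, clip to {-1, 0, 1}
        scale = self.weight.abs().mean().clamp(min=1e-5)
        w_q = (self.weight / scale).round().clamp(-1, 1)
        # absmax-quantize activations to signed 8-bit
        a_scale = x.abs().amax(dim=-1, keepdim=True).clamp(min=1e-5) / 127.0
        x_q = (x / a_scale).round().clamp(-128, 127)
        # integer-friendly matmul, then undo both scales
        return (x_q @ w_q.t()) * a_scale * scale

layer = TernaryLinear(256, 128)
out = layer(torch.randn(2, 256))
w_q = (layer.weight / layer.weight.abs().mean()).round().clamp(-1, 1).detach()
print(out.shape, "distinct weight values:", w_q.unique().tolist())
```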
● Inspired by model interpretability research, ReFT (Representation Fine-tuning) doesn't alter the model's weights. Instead, it manipulates the model's internal representations at inference time to steer its behavior. While it comes with a slight inference penalty, ReFT requires 15-65x fewer parameters compared to weight-based fine-tuning methods. It also enables more selective interventions on specific layers and token positions, enabling fine-grained control over the adaptation process. The researchers show its potential in few-shot adaptation, where a chat model is given a new persona with just five examples. Combined with the small storage footprint for learned interventions, it could be used for real-time personalization on devices with sufficient compute power.

● Selective state-space models like Mamba, designed last year to handle long sequences more efficiently, can to some extent compete with transformers, but lag on tasks that require copying or in-context learning. That said, Falcon's Mamba 7B shows impressive benchmark performance versus similar-sized transformer models. Hybrid models appear to be a more promising direction: combined with self-attention and MLP layers, AI21's Mamba-Transformer hybrid model outperforms the 8B Transformer across knowledge and reasoning benchmarks, while being up to 8x faster at generating tokens in inference. In a nostalgia trip, there are early signs of a comeback for recurrent neural networks, which had fallen out of fashion due to training and scaling difficulties. Griffin, trained by Google DeepMind, mixes linear recurrences and local attention, holding its own against Llama-2 while being trained on 6x fewer tokens.

● MOHAWK is a new method for distilling knowledge from a large, pre-trained transformer model (teacher) to a smaller, subquadratic model (student) like a state-space model (SSM). It (i) aligns the sequence transformation matrices of the student and teacher models, (ii) aligns the hidden states of each layer, and then (iii) transfers the remaining weights of the teacher model to the student model to fine-tune it. The authors create Phi-Mamba, a new student model combining Mamba-2 and an MLP block, and a variant called Hybrid-Phi-Mamba that retains some attention layers from the teacher model. MOHAWK can train Phi-Mamba and Hybrid-Phi-Mamba to achieve performance close to the teacher model. Phi-Mamba is distilled with only 3B tokens, less than 1% of the data used to train the previously best-performing Mamba models, and 2% of the data used to train the Phi-1.5 model itself.

● As well as being the main source of training data for the Phi family, synthetic data was used by Anthropic when training Claude 3 to help represent scenarios that might have been missing from the training data. Hugging Face used Mixtral-8x7B Instruct to generate over 30M files and 25B tokens of synthetic textbooks, blog posts, and stories to recreate the Phi-1.5 training dataset, which they dubbed Cosmopedia. To make this process easier, NVIDIA released the Nemotron-4-340B family, a suite of models designed specifically for synthetic data generation, available via a permissive license. Meta's Llama can also be used for synthetic data generation. It also appears possible to create synthetic high-quality instruction data by extracting it directly from an aligned LLM, with techniques like Magpie. Models fine-tuned this way sometimes perform comparably to Llama-3-8B-Instruct.

● A Nature paper from Oxford and Cambridge researchers found model collapse occurs across various AI architectures, including fine-tuned language models, challenging the idea that pre-training or periodic exposure to small amounts of original data can prevent degradation (measured by perplexity score). This creates a "first mover advantage", as sustained access to diverse, human-generated data will become increasingly critical for maintaining model quality. However, these results are primarily focused on a scenario where real data is replaced with synthetic data over generations. In practice, real and synthetic data usually accumulate. Other research suggests that, provided the proportion of synthetic data doesn't get too high, collapse can usually be avoided.

● FineWeb, the dataset, was created through a multi-step process including base filtering, independent MinHash deduplication per dump (a toy sketch of MinHash follows below), selected filters derived from the C4 dataset, and the team's custom filters. Text extraction using the trafilatura library produced higher quality data than the default Common Crawl WET files, even though the resulting dataset was meaningfully smaller. They found deduplication drove performance improvements up to a point, before hitting diminishing returns and then actually worsening performance. The team also used llama-3-70b-instruct to annotate 500k samples from FineWeb, scoring each for its educational quality on a scale from 0 to 5. FineWeb-edu, which filtered out samples scoring below 3, outperformed FineWeb and all other open datasets, despite being significantly smaller.
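A toy sketch of MinHash near-deduplication of the kind FineWeb applies per dump: each document is reduced to k minimum hash values over its word shingles, and the fraction of matching values estimates Jaccard similarity. The shingle size, k, and helper names are illustrative; a production pipeline adds banding/LSH to avoid comparing every pair of documents.

```python
# Toy MinHash: near-duplicate documents share most of their shingles, so
# their minimum hash values agree often; unrelated documents rarely agree.
import hashlib

def shingles(text: str, n: int = 5):
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(max(1, len(words) - n + 1))}

def minhash_signature(text: str, k: int = 64):
    sig = []
    for seed in range(k):
        # salt the hash with the seed to simulate k independent hash functions
        sig.append(min(
            int(hashlib.md5(f"{seed}:{s}".encode()).hexdigest(), 16)
            for s in shingles(text)
        ))
    return sig

def estimated_jaccard(a: str, b: str, k: int = 64) -> float:
    sa, sb = minhash_signature(a, k), minhash_signature(b, k)
    return sum(x == y for x, y in zip(sa, sb)) / k

doc = "the quick brown fox jumps over the lazy dog near the river bank today"
near_dup = doc.replace("today", "tonight")
other = "completely different text about ai models and training data pipelines"
print(f"near-duplicate: {estimated_jaccard(doc, near_dup):.2f}")  # high, ~0.8
print(f"unrelated:      {estimated_jaccard(doc, other):.2f}")     # ~0.0
```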
● Following the playbook that has proven effective in regular LLMs, massive performance improvements in embedding models have come from scale (GritLM has ~47B parameters vs the 110M common among prior embedding models). Similarly, the use of broad web-scale corpora and improved filtering methods has led to large improvements in the smaller models. Meanwhile, ColPali is a vision-language embedding model that exploits the visual structure of documents, not just their text embeddings, to improve retrieval. Retrieval models are one of the few subdomains where open models commonly outperform proprietary models from the biggest labs: on the MTEB Retrieval Leaderboard, open models occupy the top spots.

● Individually embedded chunks can lose the surrounding document context they need in order to be retrieved reliably. Anthropic addressed this using "contextual embeddings", where a prompt instructs the model to generate text explaining the context of each chunk in the document. They found that this approach leads to a 35% reduction in the top-20 retrieval failure rate (5.7% → 3.7%), and it can be scaled using Anthropic's prompt caching. As Fernando Diaz of CMU observed in a recent thread, this is a great example of techniques pioneered in one area of AI research (e.g. early speech retrieval and document expansion work) being applied to another - another version of "what is old is new again". Relatedly, research from Chroma shows that the choice of chunking strategy can affect retrieval performance by up to 9% in recall.

● Researchers are now pioneering novel approaches to RAG evaluation, like Ragnarök, which introduces a novel web-based arena for human evaluation through pairwise system comparisons. This addresses the challenge of assessing RAG quality beyond traditional automated metrics. Meanwhile, Researchy Questions provides a large-scale collection of complex, multi-faceted questions that require in-depth research and analysis to answer, drawn from real user queries.

● Google DeepMind has proposed Distributed Low-Communication (DiLoCo), an optimization algorithm that allows training to occur on multiple loosely connected "islands" of devices. Each island performs a large number of local update steps before communicating with the others, reducing the need for frequent data exchange. The team demonstrated fully synchronous optimization across 8 of these islands while reducing communication 500x. GDM also proposed a refined version of DiLoCo optimized for asynchronous settings. Researchers at Prime Intellect released an open-source implementation and replication of DiLoCo, scaling it up 3x to demonstrate its effectiveness on 1B parameter models.
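A toy sketch of the DiLoCo communication pattern on a shared least-squares problem: each "island" takes many local gradient steps between rare synchronizations, and the averaged weight delta is treated as an outer "pseudo-gradient" fed to a momentum optimizer. The actual recipe (inner AdamW, outer Nesterov momentum, transformer weights) and all hyperparameters here are simplified stand-ins.

```python
# Toy DiLoCo-style loop: many cheap local steps per island, one round of
# communication per outer step instead of one per gradient step.
import numpy as np

rng = np.random.default_rng(0)
dim, islands, outer_steps, local_steps = 8, 4, 25, 100
w_true = rng.normal(size=dim)

# each island holds its own data shard of the same underlying problem
shards = []
for _ in range(islands):
    X = rng.normal(size=(100, dim))
    shards.append((X, X @ w_true + 0.01 * rng.normal(size=100)))

theta = np.zeros(dim)      # globally shared weights
momentum = np.zeros(dim)   # outer optimizer state

for _ in range(outer_steps):
    local = []
    for X, y in shards:                     # parallel in the real setting
        w = theta.copy()
        for _ in range(local_steps):        # local updates, no communication
            w -= 0.05 * X.T @ (X @ w - y) / len(y)
        local.append(w)
    # one communication round: average the weight deltas ("pseudo-gradient")
    pseudo_grad = theta - np.mean(local, axis=0)
    momentum = 0.5 * momentum + pseudo_grad
    theta -= momentum                       # outer SGD step with momentum

print(f"distance to optimum: {np.linalg.norm(theta - w_true):.4f}")
```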
