arXiv:2404.01335v1 [cs.LG] 30 Mar 2024
Generative AI for Architectural Design: A Literature Review
Chengyuan Li¹, Tianyu Zhang², Xusheng Du², Ye Zhang¹*, Haoran Xie²
¹Tianjin University, ²Japan Advanced Institute of Science and Technology
Abstract
Generative Artificial Intelligence (AI) has pioneered new methodological paradigms in architectural design, significantly expanding the innovative potential and efficiency of the design process. This paper explores the extensive applications of generative AI technologies in architectural design, a trend that has benefited from the rapid development of deep generative models. Generative Adversarial Networks (GANs) and Variational Autoencoder (VAE) have been extensively applied before, significantly advancing design innovation and efficiency. With continual technological advancements, state-of-the-art Diffusion Models and 3D Generative Models are progressively integrated into architectural design, offering designers a more diversified set of creative tools and methodologies. This article further provides a comprehensive review of the basic principles of generative AI and large-scale models and highlights the applications in the generation of 2D images, videos, and 3D models. In addition, by reviewing the latest literature from 2020, this paper scrutinizes the impact of generative AI technologies at different stages of architectural design, from generating initial architectural 3D forms to producing final architectural imagery. The marked trend of research growth indicates an increasing inclination within the architectural design community towards embracing generative AI, thereby catalyzing a shared enthusiasm for research. These research cases and methodologies have not only proven to enhance efficiency and innovation significantly but have also posed challenges to the conventional boundaries of architectural creativity. Finally, we point out new directions for design innovation and articulate fresh trajectories for applying generative AI in the architectural domain. This article provides the first comprehensive literature review about generative AI for architectural design, and we believe this work can facilitate more research work on this significant topic in architecture.
Keywords: Generative AI, Architectural Design, Diffusion Models, 3D Generative Models, Large-scale models.
*Corresponding author, zhang.ye@
Figure 1. Examples of architecture design using generative AI techniques: (a) church design [1]; (b) matrix of cuboid shapes [2]; (c) Frank Gehry's Walt Disney concert hall [3]; (d) Bangkok urban design [4]; (e) foresting architecture [4]; (f) urban interiors [4]; and (g) text-to-architectural design [5].
1. Introduction
Generative artificial intelligence (AI) techniques are growing increasingly powerful and are revolutionizing architectural design. Here, generative AI refers to the artificial intelligence technologies dedicated to content generation, such as text, images, music, and videos. Generative AI benefits from the rapid development of deep generative models, including Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Diffusion Models (DMs). GANs and VAEs are traditional generative models and have been widely explored in architectural design, as illustrated in Figure 1. In this paper, we focus on the recent progress of generative AI, especially the revolutionary diffusion models. DMs have achieved state-of-the-art performance in various content generation tasks such as text-to-image and text-to-3D-model generation.
Architectural design may encompass multiple themes and scopes, with each project having distinct design requirements and individual styles, leading to diversity and complexity in design approaches. In this work, we adopt six main steps in the architectural design process for the literature review: 1) architectural preliminary 3D forms design, 2) architectural layout design, 3) architectural structural system design, 4) detailed and optimization design of architectural 3D forms, 5) architectural facade design, and 6) architectural imagery expression. After exploring the research papers from 2020 to 2023, we observed a significant increase in the number of research papers in architectural design using generative AI. The number of research papers using generative AI technology in different architectural design steps reveals the development trends within each subfield, as illustrated in Figure 2(a). Most research is concentrated in the area of architectural plan design. Research in preliminary 3D form design of architecture and architectural image expression has rapidly increased in the past two years. More research is needed on architectural structural system design, architectural 3D form refinement and optimization design, and architectural facade design.
This sustained growth trend distinctly demonstrates that generative AI applications in architectural design are expanding at an unprecedented rate. It also reflects the high level of attention and increasing investment that the architectural design and computer science communities devote to generative AI technologies. The most used generative AI techniques are illustrated in Figure 2(b). In computer science, many studies focus on GAN and VAE, while research on DDPM, LDM, and GPT is in the initial stages. The situation is the same in architecture.
1.1. Motivation
Leveraging the recent generative AI models in architectural design could significantly improve design efficiency and provide architects with new design processes and ideas, expanding the possibilities of architectural design and revolutionizing the entire design process. However, the use of advanced generative models in architectural design has not been explored extensively. Two main factors may hinder their use: professional barriers and the issue of training data.
In terms of professional barriers, deep learning and architectural design are highly specialized fields requiring extensive professional knowledge and experience. The aim of this study is to narrow the professional barriers between architecture and computer science, and to assist architectural designers in bridging generative AI technologies with applications, promoting interdisciplinary research, and delineating future research directions. This review systematically analyzes and summarizes case studies and research outcomes of generative AI applications in architectural design, and showcases the possibilities and potential of the intersection between computer science and architecture. This interdisciplinary perspective encourages collaboration among experts from different fields to address complex issues in architectural design, thus advancing scientific research and technological innovation.
In terms of the issue of training data, deep learning models require high-quality training data to analyze and verify their generalization ability. However, data in the field of architecture is usually unstructured. The search for and organization of architectural training data pose a significant challenge, making work difficult right from the initial stages of model training. In addition, high-performance Graphics Processing Units (GPUs) are required to train deep learning models on millions of data samples, especially models dealing with complex images and datasets. The scarcity of high-performance GPUs and the difficulty of mastering GPU programming skills may prevent architects from exploring the recent diffusion models and large foundation models.
1.2. Structure and Methodology
This article first introduces the development and application directions of generative AI models, then elaborates on the methods of applying generative AI in the architectural design process, and finally forecasts the potential application development of generative AI in the architectural field.
In Section 2, the article offers an in-depth introduction to the principles and evolution of various generative AI models, with a focus on Diffusion Models (DMs), 3D Generative Models, and Foundation Models. In Section 2.1, the article elaborates on the principles and development of Variational Autoencoders (VAEs) and Generative Adversarial Networks (GANs). In Section 2.2, the discourse on Diffusion Models elaborates on the working mechanisms and the developmental trajectories of DDPM and LDM. In Section 2.3, the segment on 3D Generative Models zeroes in on 3D shape representation, encompassing Voxels, Point Clouds, Meshes, Implicit functions, and Occupancy Fields. Within Occupancy Fields, the paper details Signed Distance Functions (SDF), Unsigned Distance Functions (UDF), and Neural Radiance Fields (NeRF), explaining their respective operational principles. In Section 2.4, the Foundation Models section comprehensively describes the progress and achievements of Large Language Models (LLM) and Large Vision Models. In Section 2.5, the paper discusses the applications and developments of these models in image generation, video generation, and 3D model generation.
In Section 3, this paper delves into the application development of generative AI models in architectural design. Given the complexity of the architectural design process, this article divides it into six steps, as presented in the introduction. For each step, the article summarizes and discusses the current application methods of generative AI models. By analyzing these research papers, the study demonstrates how generative AI can facilitate innovation in architectural design, improve design efficiency, and optimize architectural solutions. Throughout this summarization process, literature retrieval was conducted using databases such as Cumincad and Web of Science, supplemented by searches on Litmaps. To keep the search targeted and accurate, specific search queries were set for each design process.

Figure 2. Overview of generative AI applications in architectural design: statistics on research paper numbers and generative models.
In Section 4, this article explores the potential applications of generative AI technology in generating architectural design images, architectural design videos, architectural design 3D models, and human-centric architectural design. Section 4.1 anticipates applications of architectural design image generation to floor plans, facade images, and architectural images. Section 4.2 addresses architectural design video generation and foresees applications such as generating videos from a single architectural image, generating videos from architectural images, and style transfer for specific video content. Regarding architectural design 3D model generation, Section 4.3 envisions possibilities in generating 3D models from images and text prompts, transferring styles to 3D models, and generating and editing detailed styles for 3D models. Section 4.4 elaborates on the potential of generative AI in enhancing the human-centric architectural design process.
2. Generative AI Models
Generative AI models are currently experiencing rapid development, with new methods continually emerging. The evolution of deep learning-based approaches, particularly Variational Autoencoders (VAE), Generative Adversarial Networks (GAN), and Diffusion Models (DM), has significantly advanced and enhanced image generation techniques. VAEs played a pioneering role in deep learning-based generative models. They employ an encoder-decoder architecture integrated with probabilistic graphical models to learn latent representations for image generation [6]. GANs represent a milestone in the realm of image generation: with a generator and a discriminator, GANs engage in an adversarial training process that prompts the generator to produce images progressively resembling the distribution of real data [7, 8]. Moreover, diffusion models stand out as the most revolutionary technologies that have emerged in recent years, with remarkable image generation quality [9, 10].

Figure 3. The framework of GAN, VAE, and diffusion models (DM), where z is a compressed low-dimensional representation of the input.
2.1. Generative Adversarial Networks
A Generative Adversarial Network (GAN) [11] comprises a generator $G$ and a discriminator $D$, as illustrated in Figure 3. $G$ is responsible for generating samples from noise $z$, while $D$ judges the authenticity of the generated samples $G(z)$ against the ground-truth image $x$. Ideally:

$$D(x) = 1, \quad D(G(z)) = 0 \qquad (1)$$

This adversarial nature enables the model to maintain a dynamic equilibrium between generation and discrimination, propelling the learning and optimization of the entire system. Despite its advantages, GAN still faces challenges, such as mode collapse during training.
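The ideal discriminator outputs in Eq. (1) can be related to a training loss with a small numeric sketch. The function names and the example discriminator outputs below are illustrative assumptions, not taken from the reviewed papers; the losses follow the standard binary cross-entropy form of the GAN objective, with the commonly used non-saturating variant for the generator.

```python
import math

EPS = 1e-12  # guards against log(0)

def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss of D: real images are labelled 1, generated images 0.

    d_real: outputs D(x) on real images, each in [0, 1]
    d_fake: outputs D(G(z)) on generated images, each in [0, 1]
    """
    real_term = sum(-math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = sum(-math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating loss of G: small when D is fooled, i.e. D(G(z)) is near 1."""
    return sum(-math.log(p + EPS) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs.
ideal = discriminator_loss([1.0, 1.0], [0.0, 0.0])     # the ideal case of Eq. (1)
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])  # D cannot tell real from fake

print(f"D loss, ideal discriminator:    {ideal:.4f}")
print(f"D loss, confused discriminator: {confused:.4f}")
print(f"G loss when D(G(z)) = 0.1: {generator_loss([0.1]):.4f}")
print(f"G loss when D(G(z)) = 0.9: {generator_loss([0.9]):.4f}")
```

The discriminator loss vanishes in the ideal case of Eq. (1), while the generator loss falls as its samples become harder to reject, which is the opposing pressure behind the dynamic equilibrium described above.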
Conditional GAN. Conditional image generation controls the generation process by introducing conditional information, so that the generated images match given conditions such as text, labels, and hand-drawn sketches. These additional input conditions enable the generator to produce images with specific properties. To address the limited controllability of GAN models, Conditional GAN (CGAN) [12] was introduced, which uses additional auxiliary information as a condition for both $G$ and $D$. The $G$ of CGAN receives conditional information in addition to random noise. By providing conditional information to $G$, CGAN can control the generated results more precisely. Additionally, variants such as pix2pix [13] and StyleGAN [7] have been developed.
2.2. Diffusion Models
In image generation, diffusion models outperform GANs and VAEs [14, 15]. Most diffusion models currently used are based on Denoising Diffusion Probabilistic Models (DDPM) [15], which simplify the diffusion model through variational inference. As shown in Figure 3, diffusion models contain both a forward diffusion process and a reverse denoising (inference) process. The forward process follows the concept of a Markov chain and turns the input image into Gaussian noise. Given a data sample $x_0$, Gaussian noise is progressively added to the data sample over $T$ steps in the forward process, producing the noisy samples $x_t$, where the timestep $t \in \{1, \dots, T\}$. As $t$ increases, the distinguishable features of $x_0$ gradually diminish. Eventually, when $T \to \infty$, $x_T$ is equivalent to a Gaussian distribution with isotropic covariance. In addition, the inference process can be understood as a sequence of denoising autoencoders with shared weights $\epsilon_\theta(x_t, t)$ ($\epsilon_\theta$ is typically implemented as a U-Net [16]), which are trained to predict denoised versions of their corresponding inputs $x_t$.
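The forward process can be sketched in a few lines. The example below assumes the closed-form DDPM sampling rule $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon$ with $\bar{\alpha}_t = \prod_{s \le t} (1 - \beta_s)$. The linear schedule from $10^{-4}$ to $0.02$ over $T = 1000$ steps is a commonly used default and should be read as an assumption, as should the function names and the three-value toy "image".

```python
import math
import random

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T, increasing linearly (assumed schedule)."""
    return [beta_start + (beta_end - beta_start) * i / (T - 1) for i in range(T)]

def alpha_bars(betas):
    """Cumulative products abar_t = prod_{s<=t} (1 - beta_s)."""
    out, prod = [], 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out

def q_sample(x0, t, abar, rng):
    """Draw x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I); t is 1-indexed."""
    a = abar[t - 1]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for v in x0]

T = 1000
abar = alpha_bars(linear_beta_schedule(T))
rng = random.Random(0)          # fixed seed for reproducibility
x0 = [0.8, -0.3, 0.5]           # toy "image" with three pixel values

x_early = q_sample(x0, 1, abar, rng)   # almost identical to x0
x_late = q_sample(x0, T, abar, rng)    # almost pure Gaussian noise

print("signal weight at t=1:", math.sqrt(abar[0]))
print("signal weight at t=T:", math.sqrt(abar[-1]))
```

The signal weight $\sqrt{\bar{\alpha}_t}$ decays from nearly 1 to nearly 0, which matches the statement that the features of $x_0$ gradually diminish as $t$ increases.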
Latent Diffusion Model. Different from DDPM, the Latent Diffusion Model (LDM) [9] does not operate directly on images but in a latent space, an approach called perceptual compression. LDM reduces the dimensionality of the data by projecting it into a low-dimensional, efficient latent space, in which high-frequency, imperceptible details are abstracted away. The framework of LDM is illustrated in Figure 4. After the image $x$ is compressed by the encoder $E$ to the latent representation $z$, the diffusion process is performed in the latent representation space. LDM has a diffusion process similar to that of DDPM. Finally, LDM infers the data sample $z$ from the noise $z_T$, and the decoder $D$ restores $z$ to the original pixel space to obtain the resulting image $\tilde{x}$.

Figure 4. The framework of the latent diffusion model, which is proposed by Rombach et al. [9].
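Specifically, given an image $x \in \mathbb{R}^{H \times W \times 3}$ with height $H$ and width $W$ in RGB space, LDM first utilizes an encoder $E$ to encode the image $x$ into a latent representation space:

$$z = E(x) \qquad (2)$$

where $z \in \mathbb{R}^{h \times w \times c}$ with height $h$ and width $w$, and the constant $c$ represents the number of channels. Then the decoder $D$ recovers the image from the latent representation space:

$$\tilde{x} = D(z) = D(E(x)) \qquad (3)$$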
To accelerate the generation speed, the Latent Consistency Model (LCM) [17] was proposed to optimize the step of denoising inference.
2.3. 3D Generative Models
In the field of three-dimensional shape modeling, implicit functions are commonly represented in three ways: Occupancy Field, Signed Distance Function (SDF) or Unsigned Distance Function (UDF), and the recently emerging Neural Radiance Fields (NeRF).
3D Shape Representation. Representation in 3D visual problems can generally be divided into four categories: voxel-based, point cloud-based, mesh-based, and implicit representation-based.
Figure 5. Representation examples of 3D shapes from [24]: (a) Voxel, (b) Point, (c) Mesh, (d) Implicit.

Voxel. As shown in Figure 5a, the voxel format describes a 3D object as a matrix of volume occupancy, where the size of the matrix is fixed. Researchers [18] adopted voxel representation in the generation of 3D shapes. The voxel format requires high resolution to describe fine-grained details, so the computational cost explodes as the shape resolution increases. The reconstruction results of voxel-based research are limited in resolution and neither provide topological guarantees nor represent sharp features.
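The cost growth of voxel grids can be illustrated with a minimal sketch. The sphere, the function name `voxelize_sphere`, and the chosen resolutions are assumptions made for illustration; the point is only that the number of cells grows cubically with resolution while the shape itself does not change.

```python
import math

def voxelize_sphere(resolution, radius=0.4):
    """Return an R x R x R occupancy grid (nested lists of 0/1).

    The sphere is centred in the unit cube; a voxel is marked occupied
    when its centre lies inside the sphere.
    """
    grid = []
    for i in range(resolution):
        plane = []
        for j in range(resolution):
            row = []
            for k in range(resolution):
                # Voxel centre, shifted so the cube is centred at the origin.
                x = (i + 0.5) / resolution - 0.5
                y = (j + 0.5) / resolution - 0.5
                z = (k + 0.5) / resolution - 0.5
                row.append(1 if x * x + y * y + z * z <= radius * radius else 0)
            plane.append(row)
        grid.append(plane)
    return grid

stats = {}
for r in (8, 16, 32):
    grid = voxelize_sphere(r)
    occupied = sum(v for plane in grid for row in plane for v in row)
    stats[r] = (r ** 3, occupied / r ** 3)
    print(f"resolution {r:>2}: {r ** 3:>6} cells, occupied fraction {occupied / r ** 3:.3f}")

exact_fraction = 4.0 / 3.0 * math.pi * 0.4 ** 3  # sphere volume inside the unit cube
print(f"exact volume fraction: {exact_fraction:.3f}")
```

Doubling the resolution multiplies the number of stored cells by eight, which is why fine-grained detail quickly becomes expensive in this format.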
Point Cloud. As shown in Figure 5b, point clouds are a lightweight 3D representation composed of (x, y, z) coordinate values. Point clouds are a natural way to represent shapes. PointNet [19] extracts global shape features using max-pooling operations, and it is widely used as an encoder for point-based generative networks [20]. However, point clouds do not represent topology and are unsuitable for generating watertight surfaces.
Mesh. As shown in Figure 5c, meshes are widely used and are constructed from vertices and faces. The method in [21] deformed a predefined template using graph convolution, which restricts the result to a fixed topology. Recently, meshes have been used to represent shapes in deep learning techniques [22]. Although meshes are more suitable for describing the topological structure of objects, they usually require advanced preprocessing steps.
Implicit. As shown in Figure 5d, implicit representation refers to describing a surface as the zero crossing of a volume function $\psi: \mathbb{R}^3 \to \mathbb{R}$, whose value can be adjusted. A 3D shape is represented as a level set of a deep network that maps 3D coordinates to a signed distance function [23] or an occupancy field [24]. Implicit representation can create a lightweight, continuous shape representation with no resolution limits.
Occupancy Field. Occupancy Field is one of the implicit function methods based on deep learning [24]. Occupancy Field assigns binary values to each point in three-dimensional space, determining whether the point is occupied by an object. This approach utilizes neural networks to learn the representation of occupancy fields, facilitating highly detailed three-dimensional reconstruction. The advantage of Occupancy Field lies in its dynamic modeling of object occupancy in scenes, making it suitable for handling complex three-dimensional environments.
Figure 6. DeepSDF [23] representation applied to the Stanford Bunny: (a) depiction of the underlying implicit surface SDF = 0 trained on sampled points inside (SDF < 0) and outside (SDF > 0) the surface, (b) 2D cross-section of the signed distance field, (c) rendered 3D surface recovered from SDF = 0. Note that (b) and (c) are recovered via DeepSDF.

SDF. Building upon Occupancy Field, the Signed Distance Function (SDF) has become a crucial direction in implicit function representation within deep learning. SDF assigns a signed distance value to each point, indicating the shortest distance from the point to the object's surface. Positive values signify points outside the object, while negative values indicate points inside the object. As shown in Figure 6, DeepSDF [23] provides an end-to-end approach for continuous SDF learning, enabling precise modeling of irregular shapes and local geometry.
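The sign convention can be checked on a shape whose SDF is known in closed form. The sketch below uses an analytic unit sphere as a stand-in for a learned network such as DeepSDF; the function names `sphere_sdf` and `occupancy` are assumptions introduced here for illustration.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from point p to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(p, center) - radius

def occupancy(p):
    """Binary occupancy derived from the SDF: 1 if p is inside or on the surface, else 0."""
    return 1 if sphere_sdf(p) <= 0.0 else 0

for point in [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]:
    print(point, "sdf =", sphere_sdf(point), "occupied =", occupancy(point))
```

Thresholding the SDF at zero yields an occupancy field, so the SDF carries strictly more information: it also reports how far each point lies from the surface.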
UDF. UDF and SDF are two distinct yet interrelated implicit function representation approaches. UDF assigns an unsigned distance value to each point, representing the distance to the nearest surface without considering surface direction. UDF is particularly useful for capturing more intuitive surface distance information without involving directional aspects. Zhao et al. [26] contribute significantly by jointly exploring the learning of both signed and unsigned distance functions. This approach aims to enrich the expressiveness of implicit functions, simultaneously capturing intricate details through both signed and unsigned distance information.
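A small example shows what dropping the sign means in practice. The sketch below computes the unsigned distance to an open surface, a unit square patch in the plane z = 0, for which no inside or outside exists. The patch and the function name `square_patch_udf` are assumptions chosen for illustration, not taken from [26].

```python
import math

def square_patch_udf(p):
    """Unsigned distance from p to the open patch {(x, y, 0) : 0 <= x, y <= 1}."""
    x, y, z = p
    dx = max(-x, 0.0, x - 1.0)  # horizontal distance to the patch along x (0 if above it)
    dy = max(-y, 0.0, y - 1.0)  # horizontal distance to the patch along y (0 if above it)
    return math.sqrt(dx * dx + dy * dy + z * z)

above = square_patch_udf((0.5, 0.5, 0.3))
below = square_patch_udf((0.5, 0.5, -0.3))
beside = square_patch_udf((2.0, 0.5, 0.0))
on_surface = square_patch_udf((0.5, 0.5, 0.0))

print("above:", above, "below:", below, "beside:", beside, "on surface:", on_surface)
```

Points above and below the patch receive the same value, so the field carries no directional information, yet the zero level set still recovers the surface.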
Figure 7. An overview of NeRF scene representation and differentiable rendering procedure [25]. Images are synthesized by sampling 5D coordinates (location and viewing direction) along camera rays (a), feeding those locations into an MLP to produce a color and volume density (b), and using volume rendering techniques to composite these values into an image (c). The residual between synthesized and ground-truth observed images is then minimized (d).

NeRF. Neural Radiance Fields (NeRF) [25] have revolutionized the field of computer vision and graphics by introducing a novel approach to scene representation, as shown in Figure 7. At the heart of NeRF lies the concept of representing a scene as a continuous function capturing radiance information at every point. The fundamental equation driving NeRF is the rendering equation, mathematically formulating the observed radiance along a viewing ray. The NeRF formulation is expressed as:

$$C(p) = \int T(p_t) \cdot \sigma(p_t) \cdot L(p_t, -d) \, dp_t$$

where $C(p)$ represents the observed color at point $p$, $p_t$ represents points along the viewing ray, $T(p_t)$ is the transmittance function, $\sigma(p_t)$ represents volume density, and $L(p_t, -d)$ represents emitted radiance. NeRF introduces an implicit representation, enabling the encoding of detailed and continuous volumetric information. This allows for high-fidelity reconstruction and rendering of scenes with fine-scale structures, surpassing the limitations of explicit representations. Recently, 3D Gaussian Splatting [27] was introduced, which projects 3D information onto a 2D domain using Gaussian kernels and achieves better performance than NeRF.
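In practice the rendering integral is approximated by a sum over samples along each ray. The sketch below assumes the usual quadrature $C \approx \sum_i T_i \,(1 - e^{-\sigma_i \delta_i})\, c_i$ with $T_i = \exp(-\sum_{j<i} \sigma_j \delta_j)$, where $\delta_i$ is the spacing between samples and $c_i$ plays the role of the emitted radiance $L$. The densities and colors are hypothetical values that would normally come from the MLP.

```python
import math

def render_ray(sigmas, colors, deltas):
    """Composite per-sample densities and RGB colors along one ray.

    sigmas: volume density at each sample (front to back)
    colors: (r, g, b) emitted radiance at each sample
    deltas: distance covered by each sample along the ray
    Returns the pixel color and the per-sample compositing weights.
    """
    transmittance = 1.0  # fraction of light that reaches the current sample
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for ch in range(3):
            pixel[ch] += weight * color[ch]
        transmittance *= 1.0 - alpha            # light remaining behind this segment
    return pixel, weights

# Hypothetical ray: empty space, then a dense green surface hiding a blue one.
sigmas = [0.0, 50.0, 50.0]
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
deltas = [0.5, 0.5, 0.5]

pixel, weights = render_ray(sigmas, colors, deltas)
print("pixel color:", pixel)
print("weights:", weights)
```

The empty first sample contributes nothing, the dense green sample absorbs almost all of the weight, and the blue sample behind it is occluded because the transmittance has already dropped to nearly zero.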
2.4. Foundation Models
In computer science, foundation models, also called large-scale models, are deep learning models with numerous parameters and intricate structures, used particularly in natural language processing and computer vision tasks. These models demand substantial computational resources for training but exhibit exceptional performance across diverse tasks. The evolution from basic neural networks to sophisticated diffusion models, as depicted in Figure 8, illustrates the continuous quest for more robust and adaptable AI systems.
2.4.1 Large Language Models (LLM)
Transformer. The Transformer model, which has achieved remarkable success in natural language processing (NLP), consists of several components: encoder, decoder, positional encoding, and the final linear and softmax layers. Both the encoder and decoder are composed of multiple identical layers. Each layer contains attention layers and feed-forward network layers. Additionally, positional encoding is used to inject positional information into the text embeddings, indicating the position of words within the sequence. Notably, the Transformer has paved the way for two prominent Transformer models: Bidirectional Encoder Representations from Transformers (BERT) [28] and Generative Pre-trained Transformer (GPT) [29]. The main difference is that BERT is based on a bidirectional pre-training language model and fine-tuning, while GPT is based on an autoregressive pre-training language model and prompting.
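How positional information can be injected into embeddings is shown in the sketch below. It assumes the fixed sinusoidal scheme of the original Transformer, $PE(pos, 2i) = \sin(pos / 10000^{2i/d})$ and $PE(pos, 2i+1) = \cos(pos / 10000^{2i/d})$, which is only one possible choice since learned position embeddings are also common. The function name and the toy embeddings are assumptions for illustration.

```python
import math

def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of sinusoidal position codes."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, d_model, 2):  # i runs over the even dimensions 0, 2, 4, ...
            angle = pos / (10000 ** (i / d_model))
            row.append(math.sin(angle))  # even dimension
            row.append(math.cos(angle))  # following odd dimension
        table.append(row)
    return table

seq_len, d_model = 4, 8
pe = sinusoidal_positional_encoding(seq_len, d_model)

# Toy token embeddings (all zeros) so the added position signal is easy to see.
embeddings = [[0.0] * d_model for _ in range(seq_len)]
encoded = [[e + p for e, p in zip(emb_row, pe_row)]
           for emb_row, pe_row in zip(embeddings, pe)]

for pos, row in enumerate(encoded):
    print(pos, [round(v, 3) for v in row])
```

Each position receives a distinct vector that is added to the token embedding, so otherwise order-agnostic attention layers can distinguish where a word occurs in the sequence.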
GPT. GPT aims to pre-train models using large-scale unsupervised learning to facilitate understanding and generation of natural language. The training process involves two primary stages: initially, a language model is trained in an unsupervised manner on extensive corpora without task-specific labels or annotations. Subsequently, supervised fine-tuning occurs during the second stage, catering to specific application domains and tasks.
BERT. BERT has emerged as a breakthrough approach, achieving state-of-the-art performance across diverse language tasks. BERT's training methodology comprises two key stages: pre-training and fine-tuning. Pre-training involves the utilization of extensive text corpora to train the language model. The primary objective of pre-training is to endow the BERT model with robust language understanding capabilities, enabling it to effectively tackle various natural language processing tasks. Subsequently, fine-tuning utilizes the pre-trained BERT model in conjunction with smaller labeled datasets to refine the model parameters. This process facilitates the customization of the model to specific tasks, thereby enhancing its suitability and performance for targeted applications.
In recent years, LLMs have witnessed explosive growth. Basic language models refer to models that are only pre-trained on large-scale text corpora, without any fine-tuning. Examples of such models include LaMDA [30] and OpenAI's GPT-3 [31].
[Figure 8 (partially recovered): timeline from before 2014 to 2023 with rows for Other Methods, Diffusion Methods, Large-Visual Models, NLP Methods, and Large-Language Models. Legible entries: Gemini, CLIP, ERNIE Bot, BART, BERT, T5, GPT-1, GPT-2, GPT-3, GPT-4, CogView, Classifier-Free, Prompt-to-]