How to Assess the Likelihood of Malicious Use of Advanced AI Systems


Executive Summary

Policymakers are debating the risks that new advanced artificial intelligence (AI) technologies can pose if intentionally misused: from generating content for disinformation campaigns to instructing a novice how to build a biological agent. Because the technology is improving rapidly and the potential dangers remain unclear, assessing risk is an ongoing challenge.

Malicious-use risks are often considered to be a function of the likelihood and severity of the behavior in question. We focus on the likelihood that an AI technology is misused for a particular application and leave severity assessments to additional research.

There are many strategies to reduce uncertainty about whether a particular AI system (call it X) will likely be misused for a specific malicious application (call it Y). We describe how researchers can assess the likelihood of malicious use of advanced AI systems at three stages:

1. Plausibility (P)
2. Performance (P)
3. Observed use (Ou)

Plausibility tests consider whether system X can do behavior Y at all. Performance tests ask how well X can perform Y. Information about observed use tracks whether X is used to do Y in the real world.

Familiarity with these three stages of assessment—including the methods used at each stage, along with their limitations—can help policymakers critically evaluate claims about AI misuse threats, contextualize headlines describing research findings, and understand the work of the newly created network of AI safety institutes.

This Issue Brief summarizes the key points in: Josh A. Goldstein and Girish Sastry, “The PPOu Framework: A Structured Approach for Assessing the Likelihood of Malicious Use of Advanced AI Systems,” Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society 7, no. 1 (2024): 503–518.


Introduction

Concerns about bad actors intentionally misusing advanced artificial intelligence (AI) systems are prevalent and controversial. These concerns are prevalent as they receive widespread media attention and are reflected in polls of the American public as well as in pronouncements and proposals by elected officials.1

Yet they are controversial because experts—both inside and outside of AI—express high levels of disagreement about the extent to which bad actors will misuse AI systems, how useful these systems will be compared to non-AI alternatives, and how much capabilities will change in the coming years.2

The disagreement about misuse risks from advanced AI systems is not merely academic. Claims about risk are often cited to support policy positions with significant societal implications, including whether to make models more or less accessible, whether and how to regulate frontier AI systems, and whether to halt development of more capable AI systems.3

If views of misuse risks are to inform policy, it is critical for policymakers to understand how to evaluate malicious-use research.

In a new paper, “The PPOu Framework: A Structured Approach for Assessing the Likelihood of Malicious Use of Advanced AI Systems,” published in the Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society, we provide a framework for thinking through the likelihood that an advanced AI system (call it X) will be misused for a particular malicious application (call it Y).4

The framework lays out three different stages for assessing the likelihood of malicious use:

1. Plausibility (P): Can system X perform malicious behavior Y even once?
2. Performance (P): How well can system X perform malicious behavior Y?
3. Observed use (Ou): Is system X used for malicious behavior Y in the real world?

Once a potential misuse risk has been identified, researchers can investigate the risk at each stage outlined in Figure 1. The figure also summarizes the key methodologies and challenges at each stage.


Figure 1. The PPOu Framework Summary

Research at each of these three stages addresses different forms of uncertainty. For example, while demonstrations at the plausibility stage may be able to show that system X can be used for behavior Y (or a behavior similar to Y) once, they will leave uncertainty about how useful X would be for potential bad actors. Risk assessments at the performance stage can help model the marginal utility for bad actors, but actual use of X by bad actors may differ from research expectations. Observed use can track actual applications to right-size expectations, but it will not determine how future systems could be misused or whether X will be used for variants of Y in the future.

We hope that by laying out these stages, we will provide policymakers with a better understanding of the types of uncertainty about malicious use and where research can—and cannot—plug in. In Figure 2, we provide examples of the types of questions researchers could ask at each stage from three risk areas: political manipulation, biological attacks, and cyber offense.


The PPOu Framework

Figure 2. Stages of the PPOu Framework and Example Questions

Plausibility
Example from political manipulation: Could a multimodal agent generate and distribute a short video of partisans interfering in the electoral process?*
Example from biological attacks: Could a chatbot guide a (resourced) undergrad through the process of creating a Category B potential bioterrorism agent?5
Example from cyber offense: Could a large-language-model-based software agent identify and produce (but not necessarily deliver) a working exploit for a widely used piece of software?

Performance
Example from political manipulation: How realistic, reliable, and cost-effective are multimodal models at generating and distributing videos of partisans attempting to interfere in the election process?
Example from biological attacks: How much uplift does the chatbot provide over existing biological design tools?
Example from cyber offense: How much does it cost to operate the agent to produce the exploit compared to a similarly skilled human?

Observed use
Example from political manipulation: Do people use multimodal models to generate and distribute videos of partisans attempting to interfere in the electoral process, in practice?
Example from biological attacks: Do request-response logs indicate that a user is applying a chatbot to guide them through creating a potential bioterrorism agent?
Example from cyber offense: Is there chatter on criminal forums that people are experimenting with such an agent?

* On agents, see: Helen Toner, John Bansemer, Kyle Crichton et al., “Through the Chat Window and Into the Real World: Preparing for AI Agents,” Center for Security and Emerging Technology, October 2024, /publication/through-the-chat-window-and-into-the-real-world-preparing-for-ai-agents/.


Stage 1: Plausibility: Can system X perform malicious behavior Y even once?

The simplest way to test whether there is a risk of system X being used for malicious behavior Y is to see if X can do Y, just once. Red-teamers and stress testers adopt an adversary’s mindset and probe an AI system for “identification of harmful capabilities, outputs, or infrastructure threats.”6

If a model does not produce harmful behavior on the first try, the next step is to iterate. Researchers use different techniques, including improving prompts (the input fed to the model, such as instructions or examples) or fine-tuning (a small amount of additional training) a model for the specific behavior.
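
As a minimal sketch of this iteration loop (not from the paper), the code below cycles through prompt variants until one elicits the target behavior. The query_model and exhibits_behavior callables are hypothetical placeholders for a model API call and a harm judge.

```python
def red_team_iterate(prompt_variants, query_model, exhibits_behavior, max_attempts=50):
    """Try prompt variants until one elicits behavior Y, or give up.

    query_model(prompt) -> str          # hypothetical model API call
    exhibits_behavior(output) -> bool   # hypothetical judge for behavior Y
    """
    attempts = 0
    for prompt in prompt_variants:
        if attempts >= max_attempts:
            break
        attempts += 1
        output = query_model(prompt)
        if exhibits_behavior(output):
            # One success is enough to establish plausibility (Stage 1).
            return {"plausible": True, "prompt": prompt, "attempts": attempts}
    # Failure does NOT prove X cannot do Y; elicitation may simply be insufficient.
    return {"plausible": False, "attempts": attempts}
```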

If X still fails to exhibit Y, despite employing various techniques and tricks, the question naturally becomes: How long should one continue trying? While researchers can demonstrate that system X can be used for malicious use Y by showing an example, proving the negative (that system X cannot do Y) is more challenging.

Because AI models are often general-purpose and our ability to predict their capabilities is still advancing, we may not know whether system X cannot perform Y, or whether the prompting strategies used have been insufficient. This is known as the “capability elicitation problem.” For example, one paper found that simply prepending “Take a deep breath” before a requested task improved performance.7 Analysts may thus conclude that a system could plausibly do Y if it gets close, within a certain margin of error (known as a “safety margin”), to account for possible gains from better elicitation techniques.8 The determination about what qualifies as close enough is a matter of judgment.
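
One way to operationalize such a margin, sketched below with invented numbers, is to flag behavior Y as plausible whenever the best elicited score falls within the chosen safety margin of the success threshold; the paper itself leaves “close enough” to analyst judgment.

```python
def plausible_with_margin(best_score: float, threshold: float, safety_margin: float) -> bool:
    """Treat behavior Y as plausible if the best elicited score is within
    `safety_margin` of the success threshold, allowing for gains from
    better elicitation techniques."""
    return best_score >= threshold - safety_margin

# Invented numbers: best red-team attempt scored 0.55 against a 0.70 success bar.
print(plausible_with_margin(best_score=0.55, threshold=0.70, safety_margin=0.20))  # True
```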

To scale up red-teaming efforts, AI developers can use both humans and machines. Leading AI labs are hiring external contractors to red-team their models, allowing them to augment the expertise (and labor hours) they possess in-house.9 Researchers are also developing ways to use AI models to red-team other models, which is a promising direction for future research.10

Stage 2: Performance: How well can system X perform malicious behavior Y?

Showing that a system can do the malicious behavior once (Stage 1) does not mean it is necessarily a useful tool for doing so. Plausibility still leaves a lot of uncertainty, including about the quality of the output, the reliability of the system, and how useful it is compared to alternatives that may have existed before. As a result, research at Stage 2 focuses on addressing these and related questions, aiming to reduce uncertainty about the utility of the system in question, often through static benchmarks, experiments, or modeling marginal utility.

First, similar to taking a written test, researchers will test AI models against predetermined sets of questions. For example, can a model recognize images? Solve PhD-level math problems? Answer questions about the law? If researchers have a standardized set of questions, they can continually test new models against that benchmark as models are developed.11 This can provide comparability between models at relatively low cost. Recently, researchers have also been building benchmarks for potentially harmful applications.12 In practice, however, static benchmarks can be difficult to create. Sometimes they are “polluted” because the model was already provided with the answer as part of its training set.13 Other times, constructing them can be labor-intensive, necessitate specialized knowledge, or require access to classified material.
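
At its core, a static benchmark run reduces to scoring every model against the same fixed question set, as in the hypothetical sketch below; real harm-focused benchmarks such as those cited above are far larger and more carefully validated.

```python
def run_benchmark(models, benchmark, ask):
    """Score each model against the same fixed question set.

    models: dict of name -> model handle
    benchmark: list of (question, expected_answer) pairs
    ask(model, question) -> str   # hypothetical inference call
    """
    scores = {}
    for name, model in models.items():
        correct = sum(ask(model, q).strip() == a for q, a in benchmark)
        scores[name] = correct / len(benchmark)
    # Scores are comparable across models and reusable as new models appear.
    return scores
```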

A second approach is to conduct experiments, deliberately introducing an AI system or piece of AI-generated content in an environment to determine its effect on outcomes of interest. This could range from using AI in penetration-testing exercises to using AI to convince people against conspiracy beliefs.14 In lab and survey experiments, researchers can study the effects of different treatments, recruit respondents using existing pools, and straightforwardly ensure informed consent. However, for assessing malicious-use risks, researchers will still face limitations because of the duty to minimize harm to respondents. This is especially acute for field experiments that test the effects of an AI system in the real world.

Finally, researchers may test for uplift—that is, how useful a tool is compared to a set of alternatives.15 If a system can reliably produce instructions for designing a bioweapon, but a Google search could do the same, then the uplift is limited. These marginal utility tests require establishing a relevant set of alternatives (which will vary based on the threat model) and outcomes of interest. For example, if the goal is to assess whether language models will be useful for propaganda, uplift tests would require establishing baselines (human-written propaganda) and outcomes of interest, such as the cost of running a campaign or the number of people persuaded to take an action.16
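
In its simplest form, an uplift estimate compares an outcome of interest between a group using the AI tool and a group using the relevant alternative, as in the sketch below; the outcome numbers are invented for illustration, and a real study would need a defensible baseline and careful experimental design.

```python
def estimate_uplift(ai_assisted_outcomes, baseline_outcomes):
    """Uplift = mean outcome with the AI tool minus mean outcome with the
    relevant non-AI alternative (the baseline defines the threat model)."""
    ai_mean = sum(ai_assisted_outcomes) / len(ai_assisted_outcomes)
    base_mean = sum(baseline_outcomes) / len(baseline_outcomes)
    return ai_mean - base_mean

# Invented example: fraction of study participants persuaded in each condition.
print(estimate_uplift([0.42, 0.38, 0.45], [0.35, 0.33, 0.36]))  # ~0.07 uplift
```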


Performance testing could improve in several ways. First, researchers could test for the equivalent of “scaling laws” in the malicious-use domain.17 In other words, how much riskier in a malicious-use domain do models become with certain capability improvements (or scale increases)? From an institutional perspective, the field can continue to develop arrangements that minimize incentive issues. AI labs may have the best capability elicitation techniques, but they also have incentives to sandbag or not thoroughly test their own systems.18 In the future, government entities such as AI safety institutes could conduct a subset of malicious-use risk assessments to ensure testing occurs.19
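
If such malicious-use “scaling laws” exist, one naive way to look for them is to fit misuse-relevant scores against model scale, as in the toy regression below; the data points are invented, and whether any clean trend holds is precisely the open question.

```python
import math

def fit_power_law(scales, scores):
    """Least-squares fit of scores = a * scales^b via log-log linear regression."""
    xs = [math.log(s) for s in scales]
    ys = [math.log(r) for r in scores]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    b = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / \
        sum((x - x_mean) ** 2 for x in xs)
    a = math.exp(y_mean - b * x_mean)
    return a, b

# Invented points: (training compute, misuse-benchmark score).
a, b = fit_power_law([1e21, 1e22, 1e23], [0.10, 0.18, 0.30])
print(f"score ~ {a:.3g} * compute^{b:.2f}")
```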

Stage 3: Observed Use: Is system X used for malicious behavior Y in the real world?

In Stages 1 and 2, researchers can test whether system X can be used for malicious behavior Y and investigate how useful X may be. While forecasts prior to deployment estimate how likely system X is to be misused, research expectations and actual misuse may diverge. Policymakers must recognize that pre-deployment research may misestimate risks due to cognitive biases (e.g., analysts projecting their own assumptions onto adversaries) or unforeseen capabilities (e.g., emergent abilities of AI systems discovered after deployment). Historical examples—such as the misjudged threats of cyberattacks in the 1990s—highlight how anticipated risks can differ from actual misuse.20

The observed use stage shifts away from focusing on projected scenarios to discovering how bad actors misuse AI systems in the real world. This is often challenging, as actors misusing tools may deliberately obscure their activities due to reputational risks, legal concerns, or fears that exposure could undermine their effectiveness.

One method for uncovering misuse is monitoring by AI providers themselves. For example, companies that make their new systems available through an application programming interface could monitor requests and responses. OpenAI has recently released several reports describing influence operations misusing its tools.21

Monitoring by AI providers can be an effective strategy, because it can expose misuse early in an operation—for example, after covert propagandists begin creating content, but before they build large followings. However, this strategy also faces trade-offs related to user privacy.
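
Conceptually, provider-side monitoring is a screening pass over request-response logs with some misuse classifier, as in the sketch below; flag_misuse stands in for whatever (unspecified) classifier a provider deploys, and the privacy trade-off concerns how much of each log such a pipeline may inspect.

```python
def monitor_logs(log_entries, flag_misuse):
    """Screen API request-response logs for suspected misuse.

    log_entries: iterable of dicts with "user", "request", "response" keys
    flag_misuse(request, response) -> bool  # hypothetical misuse classifier
    """
    flagged = []
    for entry in log_entries:
        if flag_misuse(entry["request"], entry["response"]):
            # Early detection: e.g., catching covert propaganda generation
            # before the operation builds a large audience.
            flagged.append(entry)
    return flagged
```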


A second route for discovery comes from outside the AI companies. Open-source researchers and journalists can uncover the use of X for malicious behavior Y. Potential routes to discovery include finding evidence of bad actors using AI in their workflows, interviewing them to ask how they use advanced AI systems, monitoring discussions on criminal forums, and more. The news outlet 404 Media has uncovered a range of applications of AI online—including spam and scams—demonstrating the role of journalists who closely track online developments.22

Last, researchers can develop incident databases to move beyond single case studies and better understand patterns of abuse. The AI Incident Database and the Political Deepfakes Incidents Database are two ongoing efforts. Those building AI incident databases can draw valuable lessons from fields with established incident reporting systems, such as airline safety, including considerations of the trade-offs between voluntary and mandatory disclosure.23
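
An incident-database entry can be as simple as the record sketched below; the fields are illustrative assumptions loosely inspired by public efforts such as the AI Incident Database, and the voluntary-versus-mandatory question concerns how such records get filed rather than their shape.

```python
from dataclasses import dataclass

@dataclass
class MisuseIncident:
    """Illustrative schema for one observed-misuse report (fields are assumptions)."""
    incident_id: int
    system: str          # AI system X implicated
    behavior: str        # malicious behavior Y observed
    date_reported: str   # ISO date string
    source: str          # e.g., provider report, journalism, forum monitoring
    voluntary: bool      # voluntary vs. mandatory disclosure

incident = MisuseIncident(1, "hypothetical-image-model", "political deepfake video",
                          "2024-10-01", "journalist report", voluntary=True)
```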

Because the application of X for malicious use Y may be intentionally hidden, the observed use stage faces several limitations. First, observational data about misuse may not be representative of the broader universe of cases, leading to faulty conclusions about areas that require heightened attention. Policymakers should be careful not to over-index on the misuse that is most easily countable. Furthermore, even if observed use today is representative, it may not project into the future. New capability elicitation techniques or improvements in model capabilities can lead to substantially different misuses of subsequent generations of systems. Future efforts could empower external researchers to work with AI companies to better understand misuse, develop increasingly capable classifiers for malicious use, and scale monitoring efforts.


Conclusion

Each stage of the framework—plausibility, performance, and observed use—attempts to reduce uncertainty about the likelihood of misuse of an advanced AI system. Can the model perform the harmful behavior, just once? How well does it do so, and how useful is it to bad actors? What is the existing evidence of bad actors misusing the tool, or similar ones, for this application in the real world?

For policymakers, these questions can be useful when encountering claims about misuse risks. For example, imagine coming across a headline that reads: “AI Chatbots Can Give Instructions for Creating Bioweapons.” Using the PPOu Framework, a discerning policymaker may ask whether that is a plausibility assessment (e.g., red-teaming found instructions once) or a result of performance testing (e.g., researchers found the system could generate valid instructions reliably). The policymaker then might search for further information: How good is the chatbot, and compared to what baseline? Just because a system can be used for a particular malicious purpose does not mean it will be in the real world. Policymakers can use the PPOu Framework as a guide, while recognizing that some degree of uncertainty about the likelihood of malicious use will always remain.

The diverse set of methodologies also highlights that a wide range of experts can contribute. As AI models become more advanced and gain wider applications, building up a risk assessment ecosystem will grow more important. The U.S. government should seek the advice of, and provide funding support for, researchers with different substantive expertise (such as misuse domains of interest) as well as different methodological training (for example, machine learning or human-subject experiments). If the network of AI safety institutes progresses, it, too, will be stronger for soliciting cooperation from a large tent.


Authors

Josh A. Goldstein is a research fellow at Georgetown University’s Center for Security and Emerging Technology (CSET), where he works on the CyberAI Project. Goldstein is currently on rotation as a policy advisor at the Cybersecurity and Infrastructure Security Agency (CISA) under an Intergovernmental Personnel Act agreement with CSET. He completed this work before starting at CISA. The views expressed are the author’s own personal views and do not necessarily reflect the views of CISA or the Department of Homeland Security.

Girish Sastry is an independent policy researcher.

Reference

This policy memo summarizes an original research paper: Josh A. Goldstein and Girish Sastry, “The PPOu Framework: A Structured Approach for Assessing the Likelihood of Malicious Use of Advanced AI Systems,” Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society 7, no. 1 (2024): 503–518.

Acknowledgments

For research assistance support, we thank Abhiram Reddy. For feedback on the underlying paper, we thank Lama Ahmad, Markus Anderljung, John Bansemer, Eden Beck, Rosie Campbell, Derek Chong, Jessica Ji, Igor Mikolic-Torreira, Andrew Reddie, Chris Rohlf, Colin Shea-Blymyer, Weiyan Shi, Toby Shevlane, Thomas Woodside, and participants at the NLP SoDaS Conference 2023.

© 2025 by the Center for Security and Emerging Technology. This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

To view a copy of this license, visit /licenses/by-nc/4.0/.

Document Identifier: doi: 10.51593/20240042


Endnotes

1. Taylor Orth and Carl Bialik, “Majorities of Americans Are Concerned about the Spread of AI Deepfakes and Propaganda,” YouGov, September 12, 2023, /technology/articles/46058-majorities-americans-are-concerned-about-spread-ai; Office of Congressman Don Beyer, “Ross, Beyer Introduce Legislation to Regulate Artificial Intelligence Security, Mitigate Risk Incidents,” news release, September 23, 2024, /news/documentsingle.aspx?DocumentID=6304.

2. Sayash Kapoor et al., “On the Societal Impact of Open Foundation Models,” arXiv preprint arXiv:2403.07918 (2024), /abs/2403.07918; Yoshua Bengio et al., “International Scientific Report on the Safety of Advanced AI (Interim Report),” arXiv preprint arXiv:2412.05282 (2024), /abs/2412.05282.

3. Irene Solaiman, “The Gradient of Generative AI Release: Methods and Considerations,” in FAccT: Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency (New York: Association for Computing Machinery, 2023), /10.1145/3593013.3593981; Yoshua Bengio et al., “Managing Extreme AI Risks Amid Rapid Progress,” Science 384, no. 6698 (2024): 842–845, /doi/10.1126/science.adn0117; Yoshua Bengio et al., “Pause Giant AI Experiments: An Open Letter,” Future of Life Institute, March 22, 2023, /open-letter/pause-giant-ai-experiments/.

4. Josh A. Goldstein and Girish Sastry, “The PPOu Framework: A Structured Approach for Assessing the Likelihood of Malicious Use of Advanced AI Systems,” Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society 7, no. 1 (2024): 503–518, /10.1609/aies.v7i1.31653.

5. “Potential Bioterrorism Agents,” Department of Molecular Virology and Microbiology, Baylor College of Medicine, accessed November 22, 2024, /departments/molecular-virology-and-microbiology/emerging-infections-and-biodefense/potential-bioterrorism-agents.

6. “Issue Brief: What Is Red Teaming?” (Frontier Model Forum, October 27, 2023), /updates/red-teaming/.

7. Chengrun Yang et al., “Large Language Models as Optimizers,” 12th International Conference on Learning Representations (ICLR 2024), /pdf?id=Bb4VGOWELI.

8. Mary Phuong et al., “Evaluating Frontier Models for Dangerous Capabilities,” arXiv preprint arXiv:2403.13793 (2024), /abs/2403.13793.

9. “OpenAI Red Teaming Network,” OpenAI, September 19, 2023, /index/red-teaming-network/.

10. Alex Beutel et al., “Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning,” arXiv preprint arXiv:2412.18693 (2024), /abs/2412.18693.

11. “Browse State-of-the-Art,” Papers with Code, accessed November 22, 2024, /sota.

12. See, for instance, Manish Bhatt et al., “CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models,” arXiv preprint arXiv:2404.13161 (2024), /abs/2404.13161.

13. Andy K. Zhang et al., “Language Model Developers Should Report Train-Test Overlap,” arXiv preprint arXiv:2410.08385 (2024), /abs/2410.08385.

14. Thomas H. Costello, G
