AI PRACTICE
Overcoming the Hard Problems to Advance AI Practice
Taking data analytics to a more advanced level with AI tools means confronting the risks and pitfalls of machine learning algorithms.
Sponsored by:
Real learning
Real impact
SUMMER 2024
SPECIAL REPORT
[Special Report]
Overcoming the Hard Problems to Advance AI Practice
As excitement around large language models (LLMs) spurs spending on AI, the salient question for business leaders remains, What is the return on our data science investments? In the near term, advanced analytics and machine learning are the workhorse technologies for creating significant value from data assets. Not that doing so is easy; companies face numerous challenges along the way.
Much AI risk becomes apparent when systems are in production, so truly responsible AI isn’t just a concern at the front end of the development process. Cathy O’Neil, who posed hard questions about the unintended consequences of algorithmic decision-making in her 2016 book, Weapons of Math Destruction, has pioneered the practice of algorithmic auditing. O’Neil and coauthors Jake Appel and Sam Tyner-Monroe walk readers through their approach and discuss how it can be applied to generative AI tools as well.
The trade-off between using data for insights and protecting customers’ personal data grows only more difficult as bad actors improve their techniques for re-identifying anonymized data sets. Gregory Vial, Julien Crowe, and Patrick Mesana explain why dealing with this challenge will require data scientists to gain a more sophisticated understanding of data protection and compel cybersecurity staffs to learn a wider range of protection techniques. They draw lessons from emerging practices at National Bank of Canada, where data scientists, data owners, and cybersecurity teams are collaborating to apply data protection practices that don’t render data unusable for analytics.
When machine learning projects do get the go-ahead, however, too many initiatives fail upon adoption because data scientists didn’t thoroughly understand the original business problem. To find out where such efforts are going wrong, Dusan Popovic, Shreyas Lakhtakia, Will Landecker, and Melissa Valentine studied data science projects that were shelved. They found that convincing data scientists to drop their assumptions and start asking more fundamental questions of their business counterparts is key to avoiding machine learning project failures.
Finally, just as corporations are experimenting with LLMs to figure out where they can add value at relatively low risk, advanced analytics teams can be looking at how they might incorporate generative AI into practice. Pedro Amorim and João Alves see promise for LLMs to take on some data science drudgery, and for their natural language interfaces to make it easier for business managers to collaborate in the development process and understand results.
- The MIT SMR Editors
Auditing Algorithmic Risk
Avoid ML Failures by Asking the Right Questions
How Generative AI Can Support Advanced Analytics Practice
Managing Data Privacy Risk in Advanced Analytics
Sponsor’s Viewpoint: From Numbers to Narratives: Using Language to Enhance Generative AI
Paul Garland
[Responsible AI]
Auditing Algorithmic Risk
How do we know whether algorithmic systems are working as intended? A set of simple frameworks can help even nontechnical organizations check the functioning of their AI tools.
By Cathy O’Neil, Jake Appel, and Sam Tyner-Monroe
Paul Garland • Special Report • “Overcoming the Hard Problems to Advance AI Practice” • MIT Sloan Management Review
Artificial intelligence, large language models (LLMs), and other algorithms are increasingly taking over bureaucratic processes traditionally performed by humans, whether it’s deciding who is worthy of credit, a job, or admission to college, or compiling a year-end review or hospital admission notes.
But how do we know that these systems are working as intended? And who might they be unintentionally harming?
Given the highly sophisticated and stochastic nature of these new technologies, we might throw up our hands at such questions. After all, not even the engineers who build these systems claim to understand them entirely or to know how to predict or control them. But given their ubiquity and the high stakes in many use cases, it is important that we find ways to answer questions about the unintended harms they may cause. In this article, we offer a set of tools for auditing and improving the safety of any algorithm or AI tool, regardless of whether those deploying it understand its inner workings.
Algorithmic auditing is based on a simple idea: Identify failure scenarios for people who might get hurt by an algorithmic system, and figure out how to monitor for them. This approach relies on knowing the complete use case: how the technology is being used, by and for whom, and for what purpose. In other words, each algorithm in each use case requires separate consideration of the ways it can be used for (or against) someone in that scenario.
This applies to LLMs as well, which require an application-specific approach to harm measurement and mitigation. LLMs are complex, but it’s not their technical complexity that makes auditing them a challenge; rather, it’s the myriad use cases to which they are applied. The way forward is to audit how they are applied, one use case at a time, starting with those in which the stakes are highest.
The auditing frameworks we present below require input from diverse stakeholders, including affected communities and domain experts, through inclusive, nontechnical discussions to address the critical questions of who could be harmed and how. Our approach works for any rule-based system that affects stakeholders, including generative AI, big data risk scores, or bureaucratic processes described in a flowchart. This kind of flexibility is important, given how quickly new technologies are being developed and applied.
Finally, while our notion of audits is broad in that respect, it is narrow in scope: An algorithmic audit raises alerts only to problems. It then falls to experts to attempt to solve those problems once they’ve been identified, although it may not be possible to fully resolve them all. Addressing the problems highlighted by algorithmic auditing will spur innovation as well as safeguard society from unintended harms.
Ethical Matrix: Identifying the Worst-Case Scenarios
In a given use case, how could an algorithm fail, and for whom? At O’Neil Risk Consulting & Algorithmic Auditing (ORCAA), we developed the Ethical Matrix framework to answer this question.¹
The Ethical Matrix identifies the stakeholders of the algorithm in the context of its intended use and how they are likely to be affected by it. Here, we take a broad approach: Anybody affected by the algorithm, including its builders and deployers, users, and other communities potentially impacted by its adoption, are stakeholders. When subgroups have distinct concerns, they can be considered separately; for example, if lighter- and darker-skinned people have different concerns about a facial recognition algorithm, they will have separate rows in the Ethical Matrix.
Next, we ask representatives of each stakeholder group what their concerns are, both positive and negative, about the intended use of the algorithm. It’s a nontechnical conversation: We describe the system as simply as possible and ask, “How could this system fail for you, and how would you be harmed if this happened? On the other hand, how could it succeed for you, and how would you benefit?” Their answers become the columns of the Ethical Matrix.
To illustrate, imagine that a payments company has a fraud detection algorithm reviewing all transactions and flagging those most likely to be fraudulent. If a transaction is flagged, it gets blocked, and that customer’s account gets frozen. False flags are therefore a major headache for customers, and the lost business from blocks and freezes (and complaints from annoyed customers) is a moderate worry for the company. Conversely, if a fraudulent transaction goes undetected, the company is harmed but nonfraudulent customers are indifferent. Below is a simplified Ethical Matrix for this scenario.
Figure: A Simplified Ethical Matrix
Each cell of the matrix represents how a certain concern applies to a particular stakeholder group. Cells that indicate where a stakeholder could be gravely harmed or the algorithm violates a hard constraint are shaded red. Cells that raise some ethical worries for the stakeholder are highlighted yellow, and cells that satisfy the stakeholder’s objectives and raise no worries are highlighted green.
Concerns (columns): False positive (transaction gets flagged but isn’t truly fraud); False negative (transaction is truly fraud but does not get flagged)
Stakeholders (rows): Company; Nonfraudulent customers; Fraudsters
Legend: Serious concern; Moderate concern; Minimal/no concern; Benefit
Each cell of the Ethical Matrix represents how a particular concern applies to a particular stakeholder group.
To judge the severity of a given risk, we consider the likelihood that it will be realized, how many people would be harmed, and how badly. Where possible, we use existing data to develop these estimates. We also consider legal or procedural constraints (for instance, whether there is a law prohibiting discrimination on the basis of certain characteristics). We then color-code the cells to highlight the biggest, most pressing risks. Cells that constitute “existential risks,” where a stakeholder could be gravely harmed or the algorithm violates a hard constraint, are shaded red. Cells that raise some ethical worries for the stakeholder are highlighted yellow, and cells that satisfy the stakeholder’s objectives and raise no worries are highlighted green.
Finally, zooming out on the whole Ethical Matrix, we consider how to balance the competing concerns of the algorithm’s stakeholders, usually in the form of balancing the different kinds and consequences of errors that fall on different stakeholder groups.
The Ethical Matrix should be a living document that tracks an ongoing conversation among stakeholders. Ideally, it is first drafted during the design and development phase of an algorithmic application or, at minimum, as the algorithm is deployed, and it should continue to be revised thereafter. It is not always obvious at the outset who all of the stakeholder groups are, nor is it feasible to find representatives for every perspective; additionally, new concerns emerge over time. We might hear from people experiencing indirect effects from the algorithm, or a subgroup with a new worry, and need to revise the Ethical Matrix.
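For readers who keep audit artifacts in code, the fraud detection matrix described above could be held in a small data structure. This is only a sketch, not ORCAA's tooling: the `Severity` names, the `cells_rated` and `add_stakeholder` helpers, and every rating in the table are illustrative assumptions (the article's text fixes only some of the cells).

```python
from enum import Enum


class Severity(Enum):
    SERIOUS = "red"       # grave harm or a violated hard constraint
    MODERATE = "yellow"   # some ethical worry
    MINIMAL = "green"     # objectives satisfied, no worry
    BENEFIT = "benefit"   # stakeholder gains from this outcome


# Keys are (stakeholder, concern). All ratings are illustrative
# assumptions, not values read from the published figure.
ETHICAL_MATRIX = {
    ("Company", "False positive"): Severity.MODERATE,
    ("Company", "False negative"): Severity.SERIOUS,
    ("Nonfraudulent customers", "False positive"): Severity.SERIOUS,
    ("Nonfraudulent customers", "False negative"): Severity.MINIMAL,
    ("Fraudsters", "False positive"): Severity.MINIMAL,
    ("Fraudsters", "False negative"): Severity.BENEFIT,
}


def cells_rated(matrix, severity):
    """Return the (stakeholder, concern) cells with the given rating, sorted."""
    return sorted(cell for cell, rating in matrix.items() if rating is severity)


def add_stakeholder(matrix, stakeholder, ratings):
    """Add a row for a newly identified stakeholder group.

    `ratings` must rate every existing concern, so that revising the
    matrix never leaves a cell silently unrated.
    """
    concerns = {concern for _, concern in matrix}
    missing = concerns - set(ratings)
    if missing:
        raise KeyError(f"unrated concerns: {sorted(missing)}")
    for concern in concerns:
        matrix[(stakeholder, concern)] = ratings[concern]


# The red cells are the ones an audit would monitor first.
red_cells = cells_rated(ETHICAL_MATRIX, Severity.SERIOUS)
```

Keeping the matrix as data makes the "living document" idea concrete: a new subgroup becomes one `add_stakeholder` call, and the list of red cells can be recomputed after each revision.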
Explainable Fairness: Metrics and Thresholds
Many of the stakeholder concerns identified in the Ethical Matrix refer to some contextual notion of fairness.
At ORCAA, we developed a framework called Explainable Fairness to measure how groups are treated by algorithmic systems.² It is an approach to understanding exactly what is meant by “fairness” in a given narrow context.
For example, female candidates might worry that an AI-based resume-screening tool gave lower scores for women than men. It’s not as simple as comparing scores between men and women. After all, if the male candidates for a given job have more experience and qualifications than the female candidates, their higher scores might be justified. This would be considered legitimate discrimination.
The real worry is that, among equally qualified candidates, men are receiving higher scores than women. The definition of “equally qualified” depends on the context of the job. In academia, relevant qualifications might include degrees and publications; in a logging operation, they might involve physical strength and agility. They are factors one would legitimately take into account when assessing a candidate for a specific role. Two candidates for a job are considered equally qualified if they look the same according to these legitimate factors.
Explainable Fairness controls for legitimate factors when we examine the outcome in question. For an AI resume-screening tool, this could mean comparing average scores by gender while controlling for years of experience and level of education. A critical part of Explainable Fairness is the discussion of legitimacy.
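One simple way to "control for" legitimate factors is stratification: compare scores only among candidates who look the same on those factors. The sketch below uses that technique rather than a regression, and everything in it is hypothetical: the `CANDIDATES` rows, the experience bands, and the choice to weight each stratum by its size are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical screening results: (gender, experience band, education, score).
CANDIDATES = [
    ("M", "0-5 yrs", "BA", 60), ("M", "0-5 yrs", "BA", 62),
    ("F", "0-5 yrs", "BA", 58), ("F", "0-5 yrs", "BA", 60),
    ("M", "6+ yrs", "MS", 80), ("M", "6+ yrs", "MS", 82),
    ("M", "6+ yrs", "MS", 84), ("F", "6+ yrs", "MS", 80),
]


def raw_gap(rows):
    """Mean score of men minus mean score of women, ignoring qualifications."""
    men = [score for gender, _, _, score in rows if gender == "M"]
    women = [score for gender, _, _, score in rows if gender == "F"]
    return mean(men) - mean(women)


def adjusted_gap(rows):
    """Average within-stratum gap, weighting each stratum by its size."""
    strata = defaultdict(list)
    for gender, experience, education, score in rows:
        strata[(experience, education)].append((gender, score))

    weighted_sum, total = 0.0, 0
    for members in strata.values():
        men = [s for g, s in members if g == "M"]
        women = [s for g, s in members if g == "F"]
        if not men or not women:
            continue  # no like-for-like comparison is possible here
        weighted_sum += len(members) * (mean(men) - mean(women))
        total += len(members)
    if total == 0:
        raise ValueError("no stratum contains both groups")
    return weighted_sum / total
```

On this toy data the raw gap is about 7.6 points, but the gap among equally qualified candidates is 2.0 points. Most of the raw difference is explained by the chosen factors; the remainder is what needs scrutiny. Whether experience and education are the right factors is exactly the legitimacy discussion the framework calls for.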
This approach is already used implicitly in other domains, including credit. In a Federal Reserve Board analysis of mortgage denial rates across race and ethnicity, the researchers ran regressions that included controls for the loan amount, the applicant’s FICO score, their debt-to-income ratio, and the loan-to-value ratio.³ In other words, to the extent that differences in mortgage denial rates can be explained by these factors, it’s not race discrimination. In the language of Explainable Fairness, these are accepted as legitimate factors for mortgage underwriting. What is missing is the explicit conversation about why the legitimate factors are, in fact, legitimate.
What would such a conversation look like? In the U.S., mortgage lenders consider applicants’ FICO credit scores in their decision-making. FICO scores are lower, on average, for Black and Hispanic people than for White and Asian people, so it’s no surprise that mortgage applications from Black and Hispanic applicants are denied more often.⁴ Lenders would likely argue that FICO score is a legitimate factor because it measures an applicant’s creditworthiness, which is exactly what a lender should care about. Yet FICO scores encode unfairness in important ways. For instance, mortgage payments have long counted toward FICO scores, while rent payments started being counted only in 2014, and only in some versions of the scores.⁵ This practice favors homeowners over renters, and it is known that decades of racist redlining practices contributed to today’s race disparities in homeownership rates. Should FICO scores that reflect the vestiges of these practices be used to explain away differences in mortgage denial rates today?
We will not settle this debate here; the point is that it’s a question of ethics and policy, not a math problem. Explainable Fairness surfaces difficult questions like these and assigns them to the right parties for consideration.
When looking at disparate outcomes that are not explained by legitimate factors, we must define threshold values or limits that trigger a response or intervention.
These limits could be fixed values, such as the four-fifths rule used to measure adverse impact in hiring.⁶ Or they could be relative: Imagine a regulation requiring companies with a gender pay gap above the industry average to take action to reduce the gap. Explainable Fairness does not insist on a certain type of limit but prompts the algorithmic risk manager to define each one for each potential stakeholder harm.
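As an example of a fixed limit, the four-fifths rule compares each group's selection rate with the highest group's rate and flags any ratio below 0.8. The sketch below shows that arithmetic; the function names and the applicant counts are made up for illustration.

```python
def selection_rates(selected, applicants):
    """Selection rate per group: number selected divided by number who applied."""
    return {group: selected[group] / applicants[group] for group in applicants}


def impact_ratios(rates):
    """Each group's rate relative to the group with the highest rate."""
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group was selected at all")
    return {group: rate / best for group, rate in rates.items()}


def flagged_groups(rates, threshold=0.8):
    """Groups whose impact ratio falls below the four-fifths threshold."""
    return sorted(g for g, ratio in impact_ratios(rates).items() if ratio < threshold)


# Hypothetical hiring funnel: 50 of 100 men and 35 of 100 women advance.
rates = selection_rates({"men": 50, "women": 35}, {"men": 100, "women": 100})
flags = flagged_groups(rates)  # women: 0.35 / 0.50 = 0.70, below 0.80
```

A relative limit, such as the industry-average pay gap mentioned above, would replace the constant `threshold` with a value computed from peer data.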
Judging Fairness in Insurers’ Algorithms
Let’s consider a real example where the Ethical Matrix and Explainable Fairness were used to audit the use of an algorithm. In 2021, Colorado passed Senate Bill (SB) 21-169, which protects Colorado consumers from unfair discrimination in insurance, particularly from insurers’ use of algorithms, predictive models, and big data.⁷ As part of the law’s implementation, which ORCAA assisted with, the Colorado Division of Insurance (DOI) released an initial draft regulation for informal comment that described quantitative testing requirements and laid out how insurers could demonstrate that their algorithms and models were not unfairly discriminating. Although the law applies to all lines of insurance, the division chose to start with life insurance.
The Ethical Matrix is straightforward here because the stakeholder groups and concerns are defined explicitly by the law. Its prohibition of discrimination on the basis of “race, color, national or ethnic origin, religion, sex, sexual orientation, disability, gender identity, or gender expression” means each group within each of those classes got a row in the matrix. As for concerns, algorithms could cause consumers to be treated unfairly at various stages of the insurance life cycle, including marketing, underwriting, pricing, utilization management, reimbursement methodologies, and claims management. The DOI chose to start with underwriting (that is, which applicants are offered coverage, and at what price) and focus initially on race and ethnicity.
In subsequent conversations with stakeholders, however, the DOI grappled with issues related to the Explainable Fairness framework: Are similar applicants of different races denied at different rates, or charged different prices for similar coverage? What makes two life insurance applicants “similar,” and what factors could legitimately explain differences in denials or prices? This is the domain of life insurance experts, not data scientists.
The DOI ultimately suggested considering factors broadly considered relevant to estimating the price of a given life insurance policy: the policy type (such as term versus permanent); the dollar amount of the death benefit; and the applicant’s age, gender, and tobacco use.
The division’s draft quantitative testing regulation for SB 21-169 instructs insurers to do regression analyses of approval/denial and price across races, and it explicitly permits them to include those factors (such as policy type and death benefit amount) as control variables.⁸ Moreover, the regulation defines limits that trigger a response: If the regressions find statistically significant and substantial differences in denial rates or prices, the insurer must do further testing to investigate the disparity and, pending the results, may have to remediate the differences.⁹
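The two-part trigger described here, statistically significant and substantial, can be written as a simple decision rule once a regression has produced a coefficient for the race variable and its standard error. The sketch below is only an illustration of that logic. The function name, the 1.96 critical value (a conventional two-sided 5% test) and the `min_effect` values are assumptions, not figures from the Colorado draft regulation.

```python
def triggers_further_testing(coefficient, std_error, min_effect, z_critical=1.96):
    """Return True when a disparity is both significant and substantial.

    coefficient: estimated difference for a group after controls
                 (for example, a gap in denial rate); hypothetical input.
    std_error:   standard error of that estimate.
    min_effect:  smallest difference treated as substantial; an assumed
                 policy choice, not a number from the regulation.
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    significant = abs(coefficient / std_error) >= z_critical
    substantial = abs(coefficient) >= min_effect
    return significant and substantial
```

For instance, a 6-point denial-rate gap with a 1-point standard error clears both bars when `min_effect` is 5 points, while a tightly estimated 1-point gap is significant but not substantial and would not trigger further testing.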
Having looked at how we would audit simpler algorithms, let us now turn to how we would evaluate LLMs.
Evaluating Large Language Models
LLMs have taken the world by storm, largely due to their wide appeal and applicability. But it is exactly the diversity of uses of these models that makes them hard to audit. Two approaches to evaluating LLMs, namely benchmarking and red teaming, present a way forward.
The Benchmarking Approach to LLM Evaluation. Benchmarking measures the performance of an LLM across one or more predefined, quantifiable tasks in order to compare its performance with that of other models. In the simplest terms, a benchmark is a data set consisting of inputs and corresponding desired outputs. To evaluate an LLM for a particular benchmark, simply provide the input set to the LLM and record its outputs. Then choose a metric set to quantitatively compare the outputs from the LLM to the desired set of outputs from the benchmark data set. Possible metrics include accuracy, calibration, robustness, counterfactual fairness, and bias.¹⁰
Consider the input and desired output shown below from a benchmark data set designed to test LLM capabilities:¹¹
Input:
The following is a multiple choice question about microeconomics.
One of the reasons that the government discourages and regulates monopolies is that
(A) producer surplus is lost and consumer surplus is gained.
(B) monopoly prices ensure productive efficiency but cost society allocative efficiency.
(C) monopoly firms do not engage in significant research and development.
(D) consumer surplus is lost with higher prices and lower levels of output.
Answer:
Desired Output:
(d) consumer surplus is lost with higher prices and lower levels of output.
In this example, the accuracy of the model is measured by computing the proportion of correctly answered multiple-choice questions in the benchmark data set. In benchmarking LLM evaluations, metrics are defined according to the type of response elicited from the model. For example, accuracy is very simple to calculate when all of the questions are multiple choice and the model simply has to choose the correct response, whereas determining the accuracy of a summarization task involves counting up matching n-grams between the desired and model outputs.¹² There are dozens of benchmark data sets and corresponding metrics available for LLM evaluation, and it is important to choose the most appropriate evaluations, metrics, and thresholds for a given use case.
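The two kinds of metric mentioned above can be sketched in a few lines. The first function scores multiple-choice answers by comparing option letters, ignoring case so that "(d)" and "(D)" match. The second counts matching n-grams, in the spirit of recall-oriented overlap metrics such as ROUGE-N. The function names and the whitespace tokenization are assumptions, and the sketch is not necessarily how the metric cited in the article is computed.

```python
import re
from collections import Counter


def choice_letter(text):
    """Return the first parenthesized option letter, upper-cased, or None."""
    match = re.search(r"\(([A-Da-d])\)", text)
    return match.group(1).upper() if match else None


def multiple_choice_accuracy(model_outputs, desired_outputs):
    """Proportion of questions where the model picked the desired option."""
    pairs = list(zip(model_outputs, desired_outputs))
    if not pairs:
        raise ValueError("benchmark is empty")
    correct = 0
    for got, want in pairs:
        letter = choice_letter(got)
        if letter is not None and letter == choice_letter(want):
            correct += 1
    return correct / len(pairs)


def ngram_recall(model_output, desired_output, n=2):
    """Share of the desired output's n-grams that also appear in the model output."""
    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    want, got = ngrams(desired_output), ngrams(model_output)
    total = sum(want.values())
    # Counter intersection keeps the minimum count of each shared n-gram.
    return sum((want & got).values()) / total if total else 0.0
```

A use-case-specific threshold, for example "accuracy of at least 0.9 on our custom question set," would then be applied to these numbers; the threshold itself is a policy decision rather than something the code can supply.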
Creating a custom benchmark is a labor-intensive process, but an organization may find that it is worth the effort in order to evaluate LLMs in exactly the right way for its use cases.
Benchmarking does have some drawbacks. If the benchmark data happened to be in the model’s training data, it would have “memorized” the responses in its parameters. The frequency of this ouroboros-like outcome will only increase as more benchmark data sets are published. LLM benchmarking is also not immune to Goodhart’s law, that is, “when a measure becomes a target, it ceases to be a good measure.” In other words, if a specific benchmark becomes the primary focus of model optimization, the model will be over-fitted at the expense of its overall performance and usefulness.
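When an organization has access to a sample of the text a model was trained or fine-tuned on, one rough check for this kind of leakage is to look for long word sequences shared between benchmark items and that text. The sketch below assumes such a sample is available (often it is not); the function names and the default window of eight words are arbitrary choices for illustration.

```python
def word_ngrams(text, n):
    """Set of n-word sequences in the text, lower-cased and split on whitespace."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def contamination_rate(benchmark_items, corpus_docs, n=8):
    """Return (share of flagged items, flagged items).

    An item is flagged when it shares at least one n-word sequence
    with any document in the corpus sample.
    """
    if not benchmark_items:
        raise ValueError("benchmark is empty")
    corpus_grams = set()
    for doc in corpus_docs:
        corpus_grams |= word_ngrams(doc, n)
    flagged = [item for item in benchmark_items if word_ngrams(item, n) & corpus_grams]
    return len(flagged) / len(benchmark_items), flagged
```

A high rate suggests the benchmark score partly reflects memorization. A low rate is weaker evidence, since paraphrased or reformatted copies would slip past an exact-match check like this one.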
In addition, there is evidence that as models advance, they become able to detect when they are being evaluated, which also threatens to make benchmarking obsolete. Consider Anthropic’s Claude 3 series of models, released in March 2024, which stated, “I suspect this ... ‘fact’ may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all,” in response to a needle-in-a-haystack evaluation prompt.¹³ As models increase in complexity and ability, the benchmarks used to evaluate them must also evolve. It is unlikely that the benchmarks used today to evaluate LLMs will be the same ones in use just two years from now.
It is therefore not enough to evaluate LLMs with benchmarking alone.
The Red-Teaming Approach to LLM Evaluation. Red teaming is the exercise of testing a system for robustness by using an adversarial approach. An LLM red-teaming exercise is designed to elicit unwanted responses from the model.
LLMs’ flexibility in the generation of content presents a wide variety of potential risks. LLM red teams may try to make the model produce violent or dangerous content, reveal its training data, infringe on copyrighted materials, or hack into the model provider’s network to steal customer data. Red teaming can take a highly technical path, where, for example, nonsensical characters are systematically injected into the prompts to induce problematic behavior; or a social engineering path, whereby red teamers try to “trick” the model using natural language to produce unwanted output.¹⁴
Robust red teaming requires a multidisciplinary approach, diverse perspectives, and the engagement of all stakeholders, from developers to end users. The red team should be designed to assess the risks associated with at least each red cell in the Ethical Matrix. This results in a collaborative, sociotechnical approach that ensures a more comprehensive evaluation of the model, thus enhancing the rigor of the evaluation and the safety of the model. Other LLMs can also be used to generate red-teaming prompts.
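The technical path described above, systematically injecting nonsensical characters into prompts, can be sketched as a small test harness. Everything here is a stand-in: `toy_model` imitates a system whose safety filter only recognizes an exact phrase, `is_unwanted` is a placeholder for whatever check the red team agrees on, and a real exercise would call the actual model and derive its prompts from the red cells of the Ethical Matrix.

```python
import random
import string


def inject_noise(prompt, n_chars, rng):
    """Insert n_chars random punctuation characters at random positions."""
    chars = list(prompt)
    for _ in range(n_chars):
        position = rng.randrange(len(chars) + 1)
        chars.insert(position, rng.choice(string.punctuation))
    return "".join(chars)


def red_team(model, is_unwanted, base_prompts, variants_per_prompt=5, n_chars=3, seed=0):
    """Try noisy variants of each prompt and record those that elicit unwanted output."""
    rng = random.Random(seed)  # seeded so that findings can be reproduced
    findings = []
    for prompt in base_prompts:
        for _ in range(variants_per_prompt):
            attack = inject_noise(prompt, n_chars, rng)
            response = model(attack)
            if is_unwanted(response):
                findings.append((prompt, attack, response))
    return findings


def toy_model(prompt):
    """Hypothetical stand-in for an LLM call: refuses only on an exact phrase."""
    return "REFUSED" if "harmful request" in prompt else "UNSAFE COMPLETION"


def is_unwanted(response):
    """Placeholder check; a real audit would define this per use case."""
    return response != "REFUSED"


findings = red_team(toy_model, is_unwanted, ["harmful request"])
```

Each finding pairs the original prompt with the variant that slipped through, which gives developers a concrete case to fix and retest.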
Red teaming helps LLM developers better protect models against potential misuse, thereby enhancing the overall safety and efficacy of the model. It can also uncover issues that might not be visible under normal operating conditions or during standard testing procedures. A collaborative approach to red teaming built on the Ethical Matrix ensures a thorough and rigorous evaluation, bolstering the robustness of the model and the validity of its outcomes.
A significant limitation of red teaming is its inherent subjectivity: The value and effectiveness of a red-teaming exercise can vary greatly depending on the creativity and risk appetite of the individual stakeholders involved. And because there are no established standards or thresholds for red-teaming LLMs, it can be difficult to determine when enough red teaming has been done or whether the evaluation has been comprehensive enough. This can leave some vulnerabilities undetected.
Another obvious limitation of red teaming is its inability to evaluate for risks that have not been anticipated or imagined. Risks that are unforeseen will not be included in red teaming, making the model uniquely vulnerable to unanticipated scenarios.
Therefore, while red teaming plays a vital role in the testing and development of LLMs, it should be complemented with other evaluation strategies and continuous monitoring to ensure the safety and robustness of the model.
How Would We Audit Tessa, the Eating Disorder Chatbot?
Figure: Sketch of the Ethical Matrix for Tessa in Our Thought Experiment
The National Eating Disorders Association (NEDA) released a chatbot named Tessa that was taken down after it gave out harmful advice. Here we visualize the exercise that may have anticipated such outcomes.
Concerns (columns):
Negative: What if Tessa gives toxic information or advice in chats?
Negative: What if Tessa misfires and erodes community trust in NEDA?
Positive: What if Tessa gives accurate, evidence-based advice?
Positive: What if Tessa eases the resource demands of the old helpline?
Stakeholders (rows): “Chatbot users with eating disorders”; “Chatbot users, other”; NEDA; X2AI; Psychologists and other practitioners
Legend: Serious concern; Moderate concern; Minimal/no concern; Benefit
The nonprofit National Eating Disorders Association (NEDA) is one of the largest organizations in the U.S. dedicated to supporting p