sjtul matlab8 v02: Bioinformatics, Lesson 8

Life Science Necessities: Flexibility and
Data Gathering – Breadth and
Easily load common file
Excel, CSV and other
Image (jpeg, tiff, gif, bmp, png,
Access to many specialized
Sequence data (fasta, embl, genbank, etc.)
Microarray (Affymetrix, GenePix, GEO,
BLAST Reports, Mass Spec, Phylogenetic Trees,
Complete integration to SQL and ODBC
Direct Access to External
Video Cameras, Medical Equipment,

Example: Seamless Database
Visual Query
Access data without knowing
Scroll through tables and
Customize your
Built-in visualization
Plotting and
Creating HTML
Handling date
Reuse SQL statements in your own

Problems with insufficiently automated computational
Lack of
Inadequate metrics for quantification,
Slow,
Human error, transcription
Limited scientific

Perform a spectrum of analyses including nonlinear mixed-effects, sequence, microarray, phylogenetic tree, mass spectrometry, and gene ontology
Import data from multiple sources, such as databases, file formats, or
Share results with automatically generated HTML reports, data visualizations, or stand-alone tools
Parallelize data analysis to decrease computation
Automate analyses to implement batch processing of contiguous

Explore Products for Computational
Data Acquisition

The advantage of automated computational
Obtain objective
Reduce costs and
Decrease processing and analysis
Alleviate human errors and transcription

Consider this image from National Cancer
Goal: To quantify the amount of
Initial method: Post-doc sits behind microscope and counts the number of metastatic spots
not too time consuming for one image

Not a very convincing
Goal: To quantify the amount of tissue metastasis for
Initial method: Post-doc sits behind microscope and counts the number of metastatic spots

How automated computing
Obtain objective
Reduce costs and
Decrease processing and analysis
Alleviate human errors and transcription

Tectorial Membrane
Goal: Determine elasticity of Tectorial Membrane
Atomic Force Microscope
Initial method to
Analysis of 1 AFM file took 30-40
A realistic goal was to analyze 10 files in one
With automated computing, the obtainable amount of data increased
Analysis of 1 AFM now took 3-4
Now we could analyze 100s of files in a portion of a

Analysis of Fluorescein
Goal: Determine mean circulation time (MCT) and retinal blood flow
Fit intensity-vs-time to lognormal parameterized by Io, Ip, tp, b (shape
[Equations for MCT, RBF, and the lognormal intensity curve: not recoverable]
Manually track vessels, collecting time-intensity data (40 minutes in a dark room!)
Manually identify arteries,
Transfer intensity information to statistics package to calculate fit parameters
Determine
Manually measure vessel
Calculate
Log results in lab
Perfect application for neural networks
Automated the analysis with MATLAB and
Code currently used in labs

Let's take a
Goal: Determine mean circulation time (MCT) and retinal blood flow
Previous
Time
Very
Automated computing allowed us
Obtain objective
Decrease processing and analysis
Reduce costs and

Typical
Access
Analyze
Share

SimBiology, Systems

SimBiology® provides an app and programmatic tools to model, simulate, and analyze dynamic systems, focusing on pharmacokinetic/pharmacodynamic (PK/PD) and systems biology applications. It provides a block diagram editor for building models, or you can create models programmatically using the MATLAB® language. SimBiology includes a library of common PK models, which you can customize and integrate with mechanistic systems biology models. A variety of model exploration techniques let you identify optimal dosing schedules and putative drug targets in cellular pathways. SimBiology uses ordinary differential equations (ODEs) and stochastic solvers to simulate the time course profile of drug exposure, drug efficacy, and enzyme and metabolite levels. You can investigate system dynamics and guide experimentation using parameter sweeps and sensitivity analysis. You can also use single subject or population data to estimate model

SimBiology
User interface to facilitate building, simulating, and analyzing dynamic
Import, build, and export mechanistic or PKPD representation of system
Simulate responses to biological variability or different dosing conditions, scan parameter ranges, calculate sensitivities
Least-squares estimation of grouped or pooled data, and maximum likelihood estimation of population parameters
Deploy SimBiology models for standalone

Questions to
What is the value of modeling

Why Create Quantitative Biochemical Reaction
Biochemical pathways start out simple and quickly grow in
Testing pathways via experiment is expensive in both time and money. Quantitative modeling narrows the range of
Once created and validated with experiments, the quantitative model can be used as an in-silico sandbox to test new ideas dramatically faster than through experimentation.

Challenges within computing biochemical
Integrating knowledge from experimental data, intuition, literature, and other models is difficult
Modelers and scientists have difficulty communicating knowledge and sharing work
The mathematics for solving these models is evolving faster than the tools
Many different tools are needed to complete entire workflow

Model created by
Enter in chemical
Estimate parameters using experimental data
Isolate relevant parameters using sensitivity analysis

Introduction to
Provides one environment for both graphical and programmatic
Provides one tool for modeling, simulating, and analyzing pathways
Used by modelers or programmers to gain insight into their pathway and to communicate their pathway with

Key
Building a
Tabular
Via MATLAB
Import SBML
Running a
Analyzing a
Sensitivity

Let's build a simple
A simple gene regulation model with translation, and negative feedback to suppress

Transcription: the process through which a DNA sequence is enzymatically copied by an RNA polymerase to produce a complementary RNA; the transfer of genetic information from DNA into RNA.
Translation: the second part of protein biosynthesis, in which an mRNA sequence is converted to a chain of amino acids to form a protein.
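As a rough illustration of the gene regulation model described above, the following is a minimal sketch in plain MATLAB (not SimBiology). It assumes a two-state system, mRNA and protein, in which the protein represses its own transcription through a Hill term. The equations, the parameter names, and every numeric value are assumptions made for this example; none of them come from the slides.

```matlab
% Hypothetical two-state model: mRNA (m) and protein (p). The protein
% represses its own transcription (negative feedback) via a Hill term.
k_tx = 1.0;   % maximal transcription rate (assumed)
k_tl = 0.5;   % translation rate per mRNA (assumed)
d_m  = 0.2;   % mRNA degradation rate (assumed)
d_p  = 0.1;   % protein degradation rate (assumed)
K    = 5.0;   % repression threshold (assumed)
n    = 2;     % Hill coefficient (assumed)

% State vector y = [m; p].
rhs = @(t, y) [k_tx / (1 + (y(2) / K)^n) - d_m * y(1);
               k_tl * y(1) - d_p * y(2)];

% Integrate from an empty cell (no mRNA, no protein).
[t, y] = ode45(rhs, [0 200], [0; 0]);

plot(t, y);
legend('mRNA', 'protein');
xlabel('Time');
ylabel('Amount');
```

With these assumed values both species settle to a steady level, which is the qualitative behavior expected from negative feedback. A SimBiology model of the same pathway would be built from species and reactions instead of a hand-written right-hand side.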

Pharmacokinetics. The study of what the body does to a drug after administration. (Chinese note, truncated in the source: "refers to antibiot...")
The study of Absorption, Distribution, Metabolism, and Excretion (ADME) of drugs in the body

Pharmacodynamics. The study of what the drug does to the body. (Chinese note, truncated in the source: "refers to the antibiotic reaching the corresponding concentration at the site of infection...")
The study of the biochemical and physiological effects of drugs
mechanisms of drug action
relationship between drug concentration and effect

PROBLEM: The effect of a drug is calculated from the amount in the biophase, which, unfortunately, cannot be directly measured. PK knowledge is needed to model transfer of drug from blood to effect site

Challenges in PK/PD
Many tools
NONMEM, Basic, Fortran, C: Building and maintaining models can be difficult.
Organ Specific or niche
Simulation tools are too complex and/or black box
Organ models not editable, methods are not
Flexibility is
Workflow is manual, not
Modelling, simulation, statistics, and visualization all require different tools
Manual integration is time

PK Example
Transdermal Input
Nicotine patch is applied to the skin for 16
Overlapping zero-order input
Drug concentration monitored for 24
Single compartment

Rapid decrease in concentration when infusion rates drop
Total dose – Dose slow
dC/dt = (Ffast + Fslow – [remainder of the equation is missing]

PK Example
Fast infusion runs for time
Slow infusion runs for time
Initial nicotine concentration = 2
V = 140
[Further slide values 78, 6, and 17: labels not recoverable]
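The single-compartment model with two overlapping zero-order inputs can be sketched in plain MATLAB as follows. Because the slide's rate equation is cut off after "Ffast + Fslow –", the first-order elimination term CL*C used here is an assumption. V = 140 and the initial concentration of 2 are taken from the slide; the clearance, the input rates, and the input durations are hypothetical placeholders, and units are not stated in the source.

```matlab
V     = 140;    % volume of distribution (from the slide; units not stated)
C0    = 2;      % initial nicotine concentration (from the slide)
CL    = 78;     % clearance: hypothetical placeholder
Ffast = 1500;   % fast zero-order input rate: hypothetical
Fslow = 500;    % slow zero-order input rate: hypothetical
Tfast = 6;      % duration of the fast input in hours: hypothetical
Tslow = 16;     % duration of the slow input in hours: assumed equal to patch time

% Total input rate: each zero-order input is on until its duration ends.
inputRate = @(t) Ffast * (t < Tfast) + Fslow * (t < Tslow);

% Assumed mass balance: dC/dt = (Ffast + Fslow - CL*C) / V.
rhs = @(t, C) (inputRate(t) - CL * C) / V;

% A small MaxStep keeps the solver from stepping over the input switch-offs.
opts = odeset('MaxStep', 0.05);
[t, C] = ode45(rhs, [0 24], C0, opts);

plot(t, C);
xlabel('Time (h)');
ylabel('Concentration');
```

Under these assumptions the concentration rises while both inputs are active, drops quickly once the fast input stops, and decays toward zero after the patch is removed, which matches the "rapid decrease in concentration when infusion rates drop" noted on the slide.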

Generic PBPK model of
From Poulin and Thiel; J Pharmaceutical Sciences. 91:5, May

PK Example – Let's show how we might implement this in

Read, analyze, and visualize genomic and proteomic

Bioinformatics Toolbox™ provides algorithms and apps for Next Generation Sequencing (NGS), microarray analysis, mass spectrometry, and gene ontology. Using toolbox functions, you can read genomic and proteomic data from standard file formats such as SAM, FASTA, CEL, and CDF, as well as from online databases such as the NCBI Gene Expression Omnibus and GenBank®. You can explore and visualize this data with sequence browsers, spatial heatmaps, and clustergrams. The toolbox also provides statistical techniques for detecting peaks, imputing values for missing data, and selecting

Bioinformatics Toolbox -- Key
Next Generation Sequencing analysis and
Sequence analysis and visualization, including pairwise and multiple sequence alignment and peak detection
Microarray data analysis, including reading, filtering, normalizing, and
Mass spectrometry analysis including classification, and marker
Phylogenetic tree
Graph theory functions, including interaction maps, hierarchy plots, and pathways
Data import from genomic, proteomic, and gene expression files, including SAM, FASTA, CEL, and CDF, and from databases such as NCBI and GenBank

The microarray data for this example is from DeRisi, J.L., Iyer, V.R., and Brown, P.O. (Oct 24, 1997). Exploring the metabolic and genetic control of gene expression on a genomic scale. Science, 278(5338), 680–686. PMID: 9381177.

The authors used DNA microarrays to study temporal gene expression of almost all genes in Saccharomyces cerevisiae during the metabolic shift from fermentation to respiration. Expression levels were measured at seven time points during the diauxic shift. The full data set can be downloaded from the Gene Expression Omnibus Web site at:

1. Load data into the MATLAB environment.
    load yeastdata.mat
2. Get the size of the data by
    ans =
3. Access the entries using cell array
    % This displays the 15th row of the variable yeastvalues, which contains
    % expression levels for the open reading frame (ORF) YAL054C.
    ans =
4. Use the function web to access information about this ORF in the Saccharomyces Genome Database (SGD).
    url =
5. A simple plot can be used to show the expression profile for this ORF (open reading frame).
    xlabel('Time (Hours)');
6. Plot the actual values.
    plot(times, 2.^yeastvalues(15,:))
    xlabel('Time (Hours)');
    ylabel('Relative Expression Level');
The MATLAB software plots the figure. The gene associated with this ORF appears to be strongly up-regulated during the diauxic shift.
7. Compare other genes by plotting multiple lines on the same figure.
    hold
    xlabel('Time (Hours)');
    ylabel('Relative Expression Level');
    title('Profile Expression Levels');
The MATLAB software plots the

This procedure illustrates how to filter the data by removing genes that are not expressed or do not change. The data set is quite large and a lot of the information corresponds to genes that do not show any interesting changes during the experiment. To make it easier to find the interesting genes, reduce the size of the data set by removing genes with expression profiles that do not show anything of interest. There are 6400 expression profiles. You can use a number of techniques to reduce the number of expression profiles to some subset that contains the most significant genes.

1. Remove the spots labeled 'EMPTY':
    emptySpots = strcmp('EMPTY',genes);
    yeastvalues(emptySpots,:) = [];
    genes(emptySpots) = [];
2. Use the isnan function to identify the genes with missing data and then use indexing commands to remove the genes.
    nanIndices = any(isnan(yeastvalues),2);
    yeastvalues(nanIndices,:) = [];
    genes(nanIndices) = [];
    ans
3. Use the function genevarfilter to filter out genes with small variance over time. The function returns a logical array of the same size as the variable genes, with ones corresponding to rows of yeastvalues with variance greater than the 10th percentile and zeros corresponding to those below the threshold.
    mask =
    % Use the mask as an index into the values to remove filtered genes.
    yeastvalues = yeastvalues(mask,:);
    genes = genes(mask);
    ans
4. The function genelowvalfilter removes genes that have very low absolute expression values. Note that the gene filter functions can also automatically calculate the filtered data and names.
    [mask,yeastvalues,genes] =
    ans
5. Use the function geneentropyfilter to remove genes whose profiles have low entropy:
    [mask,yeastvalues,genes] =
    ans

H = −ln(1/30) = ln 30 ≈ 3.40
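The first three filtering steps above can be tried without the yeast data set. The sketch below is an illustration on synthetic data: the gene names, the values, and the variables `values`, `filteredValues`, and `filteredGenes` are made up for this example. The variance filter is a hand-rolled stand-in written in base MATLAB, not the toolbox function genevarfilter, so its percentile convention may differ slightly from the toolbox's.

```matlab
rng(0);                                   % reproducible synthetic data
nGenes = 50;
genes  = arrayfun(@(k) sprintf('G%03d', k), (1:nGenes)', 'UniformOutput', false);
values = randn(nGenes, 7);                % 7 time points, like the yeast set
genes([5 10]) = {'EMPTY'};                % simulate two empty spots
values(20, 3) = NaN;                      % simulate one missing measurement

% 1. Remove empty spots.
emptySpots = strcmp('EMPTY', genes);
values(emptySpots, :) = [];
genes(emptySpots)     = [];

% 2. Remove genes with missing data.
nanIndices = any(isnan(values), 2);
values(nanIndices, :) = [];
genes(nanIndices)     = [];

% 3. Stand-in variance filter: keep rows whose variance across time is
%    above the value found at the 10th-percentile rank.
v         = var(values, 0, 2);
sortedVar = sort(v);
threshold = sortedVar(ceil(0.10 * numel(v)));
mask      = v > threshold;
filteredValues = values(mask, :);
filteredGenes  = genes(mask);

fprintf('%d of %d genes kept\n', numel(filteredGenes), nGenes);
```

Keeping the gene names and the value matrix in step, by applying the same logical index to both, is the design point that carries over to the real data set.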
uniform

Now that you have a manageable list of genes, you can look for relationships between the profiles using some different clustering techniques from the Statistics and Machine Learning Toolbox™.

1. For hierarchical clustering, the function pdist calculates the pairwise distances between profiles, and the function linkage creates the hierarchical cluster tree.
    corrDist = pdist(yeastvalues,'corr');
    clusterTree = linkage(corrDist,'average');
2. The function cluster calculates the clusters based on either a cutoff distance or a maximum number of clusters. In this case the 'maxclust' option is used to identify 16 distinct clusters.
    clusters = cluster(clusterTree,'maxclust',
3. The profiles of the genes in these clusters can be plotted together using a simple loop and the function subplot.
    for c = 1:16
        plot(times,yeastvalues((clusters == c),:)');
        axis tight
    suptitle('Hierarchical Clustering of
4. The Statistics and Machine Learning Toolbox software also has a K-means clustering function. Again, 16 clusters are found, but because the algorithm is different these are not necessarily the same clusters as those found by hierarchical clustering.
    for c =
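The hierarchical and K-means steps described above can be compared on a small synthetic data set. This is a sketch under stated assumptions: the three profile shapes, the noise level, and the choice of 3 clusters (instead of the 16 used for the yeast data) are made up for illustration. The functions pdist, linkage, cluster, and kmeans come from the Statistics and Machine Learning Toolbox.

```matlab
rng(1);                                    % reproducible synthetic data
t = 1:7;
% Three synthetic profile shapes: rising, falling, and peaked.
base = [t - 4; 4 - t; -abs(t - 4) + 2];
X = [];
for k = 1:3
    % 20 noisy copies of each shape.
    X = [X; repmat(base(k, :), 20, 1) + 0.3 * randn(20, 7)]; %#ok<AGROW>
end

% Hierarchical clustering on correlation distance with average linkage.
corrDist    = pdist(X, 'correlation');
clusterTree = linkage(corrDist, 'average');
hClusters   = cluster(clusterTree, 'maxclust', 3);

% K-means with the same distance and several random restarts.
kClusters = kmeans(X, 3, 'Distance', 'correlation', 'Replicates', 5);

% Cross-tabulate the two labelings: rows are hierarchical labels,
% columns are K-means labels.
agreement = accumarray([hClusters, kClusters], 1, [3 3]);
disp(agreement);
```

The cluster numbers themselves are arbitrary in both methods, so the two results are compared through the cross-tabulation, not by checking whether the labels are equal.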

The MATLAB software displays the K-means progress (the first entries are incomplete in the source):
    iterations, total sum of distances =
    iterations, total sum of distances = 8.62674
    26 iterations, total sum of distances = 8.86066
    22 iterations, total sum of distances = 9.77676
    26 iterations, total sum of distances = 9.01035

5. Instead of plotting all of the profiles, you can plot just the
    for c = 1:16
        axis tight
        axis off    % turn off the axis
    suptitle('K-Means Clustering of
6. You can use the function clustergram to create a heat map and dendrogram from the output of the hierarchical

Principal-component analysis (PCA) is a useful technique you can use to reduce the dimensionality of large data sets, such as those from microarray analysis. You can also use PCA to find signals in noisy data.

1. Use the pca function in the Statistics and Machine Learning Toolbox software to calculate the principal components of a data set.
    [pc, zscores, pcvars] = pca(yeastvalues)

The MATLAB software displays:
    pc =
    Columns 1 through
2. You can use the function cumsum to see the cumulative sum of the variances.
    cumsum(pcvars./sum(pcvars) *
This shows that almost 90% of the variance is accounted for by the first two principal
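The PCA and cumulative-variance steps above can be sketched on synthetic data. The data here are an assumption made for illustration: seven columns driven mostly by two hidden factors, so the first two components should carry most of the variance. The call to pca requires the Statistics and Machine Learning Toolbox, and the percentages it prints are for this synthetic set, not for the yeast data.

```matlab
rng(2);                                    % reproducible synthetic data
n = 200;
% Two hidden factors with different strengths, mixed into 7 columns.
factors  = randn(n, 2) * [3 0; 0 2];
loadings = randn(2, 7);
X = factors * loadings + 0.2 * randn(n, 7);

% pc: principal component coefficients, zscores: projected data,
% pcvars: variance explained by each component.
[pc, zscores, pcvars] = pca(X);

% Cumulative percentage of variance explained.
cumPercent = cumsum(pcvars ./ sum(pcvars) * 100);
disp(cumPercent');

% Scatter plot of the first two component scores.
scatter(zscores(:, 1), zscores(:, 2));
xlabel('First Principal Component');
ylabel('Second Principal Component');
```

Reading the second entry of `cumPercent` answers the same question as the text: how much of the total variance the first two components account for.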

The MATLAB software displays:
    ans
3. A scatter plot of the scores of the first two principal components shows that there are two distinct regions. This is not unexpected, because the filtering process removed many of the genes with low variance or low information. These genes would have appeared in the middle of the scatter plot.
    xlabel('First Principal Component');
    ylabel('Second Principal Component');
    title('Principal Component Scatter Plot');
4. The gname function from the Statistics and Machine Learning Toolbox software can be used to identify genes on a scatter plot. You can select as many points as you like on the scatter plot.
5. An alternative way to create a scatter plot is with the gscatter function from the Statistics and Machine Learning Toolbox software. gscatter creates a grouped scatter plot where points from each group have a different color or marker. You can use clusterdata, or any other clustering function, to group the
    pcclusters = clusterdata(zscores(:,1:2),6);
    xlabel('First Principal Component');
    ylabel('Second Principal Component');
    title('Principal Component Scatter Plot with Colored
    gname(genes) % Press enter when you finish selecting genes.

Supported Data
BLAST

Gene Expression
Other Data

Design of Primers for Automated DNA
Calculate properties of
Filter primers based on GC content or
Check for dimerization and hairpin
Retrieve primer
Find restriction enzyme that cut inside
Isolate primers lacking a GC
[Table residue, column headers not recoverable: Pos 50 cacatagcccttgccataag 11375054.37]

Applied Biosystems Develops a Crucial DNA Sequencing Algorithm in MATLAB®
The
To develop a robust yet flexible calibration algorithm to be included in a high-throughput DNA analysis instrument
The
Use MATLAB to test ideas and code a prototype, and then use the MATLAB
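The "filter primers based on GC content" step from the primer design slide above can be sketched in base MATLAB. The first sequence is the primer that survives in the slide's table residue; the other two candidates, the 45–55% acceptance window, and the 3'-end check are assumptions made for illustration and are not the slide's criteria. The sketch does not use the Bioinformatics Toolbox.

```matlab
% First entry: primer from the slide. The other two are made-up candidates.
primers = {'cacatagcccttgccataag', ...
           'atatatatatatatatatat', ...
           'gcgcggccgcgggccgcgcc'};

% GC content in percent for each primer.
gcPercent = cellfun(@(s) 100 * sum(s == 'g' | s == 'c') / numel(s), primers);

% Hypothetical acceptance window for GC content.
gcMin = 45;
gcMax = 55;
keep = gcPercent >= gcMin & gcPercent <= gcMax;

% Illustrative 3'-end check: does the primer end in g or c?
endsInGC = cellfun(@(s) any(s(end) == 'gc'), primers);

for k = 1:numel(primers)
    fprintf('%s  GC=%.1f%%  keep=%d  3''-end g/c=%d\n', ...
        primers{k}, gcPercent(k), keep(k), endsInGC(k));
end
```

Checks for dimerization and hairpins need sequence self-alignment and are left out of this sketch.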
