UC Berkeley Data Science Program Setup

Master's Program in Data Science: Program Setup
俞梦怡 (Yu Mengyi)

Program (track) name: Data Science
Degree: Professional Master of Information and Data Science (MIDS)
Level: Master's
School: The UC Berkeley School of Information (I School)
University: University of California, Berkeley
Website: /

Program introduction: Designed by I School faculty, our curriculum is multidisciplinary. You will bring together a range of methods to define a research question; to gather, store, retrieve, and analyze data; to interpret results; and to convey findings effectively. Using the latest tools and practices, you will identify patterns in and gain insights from complex data sets.

Program goal: Train leaders in the ever-evolving field of data science.

Program approach: The program focuses on problem solving, preparing you to creatively apply methods of data collection, analysis, and presentation to solve the world's most challenging problems.

Admission requirements:
1. A bachelor's degree
2. Test scores (GRE/GMAT/TOEFL)
3. A high level of quantitative ability
4. A problem-solving mindset
5. A working knowledge of fundamental concepts
6. The ability to communicate effectively
7. Programming proficiency

Credits: 27 credits (nine courses)

Time to completion: 5 terms, 20 months

Mode of delivery: The UC Berkeley School of Information's Master of Information and Data Science (MIDS) is a web-based program featuring immersive coursework and live, online classes you can attend from anywhere in the world. Delivered on a state-of-the-art learning platform, datascience@berkeley facilitates collaboration and discussion to help you build a professional network of faculty and peers from the start. Students can access all datascience@berkeley content 24 hours a day, 7 days a week.

Curriculum structure: Below is a sample course schedule and the expected path through the degree program. Students who are interested in taking the program on an accelerated basis can complete their coursework in 3 or 4 terms, with approval from the School, by taking up to 3 courses in one or more terms.

Course descriptions:

1. Research Design and Application for Data and Analysis
Skills: Research design / Question formulation / Data and decision making / Understanding cognitive bias / Data for persuasion and action / Integrating data and domain knowledge / Storytelling with data
Course description: This course introduces students to the burgeoning data sciences landscape, with a particular focus on learning how to apply data science techniques to uncover, enrich, and answer questions facing industries today. After an introduction to data sciences and an overview of the program, students will explore how organizations make decisions and the emerging role of big data in guiding both tactical and strategic decisions. Lectures, readings, discussions, and assignments will teach how to apply disciplined, creative methods to ask better questions, gather data, interpret results, and convey findings to various audiences in ways that change minds and change behaviors. The emphasis throughout is on making practical contributions to real decisions that organizations will and should make. Industries and domains that we will explore include sports management, finance, energy, journalism, intelligence, health care, and media entertainment.

2. Exploring and Analyzing Data
Skills: Research design / Statistical analysis
Tools: R
Course description: The goal of this course is to provide students with an introduction to many different types of quantitative research methods and statistical techniques for analyzing data. We begin with a focus on measurement, inferential statistics, and causal inference. Then, we will explore a range of statistical techniques and methods using the open-source statistics language, R. We will use many different statistics and techniques for analyzing and viewing data, with a focus on applying this knowledge to real-world data problems. Topics in quantitative techniques include: descriptive and inferential statistics, sampling, experimental design, parametric and non-parametric tests of difference, ordinary least squares regression, and logistic regression.

3. Storing and Retrieving Data
Skills: Data acquisition / Data cleaning and normalization / Building databases / Data classification and indexing / Data warehousing
Tools: Python / Relational databases / Hadoop / MapReduce / Spark / Cloud Computing (AWS)
Course description: This course prepares students to deal with large-scale collections of data as objects to be stored, searched over, selected, and transformed for use. We examine both the background theory and practical application of information retrieval, database design and management, data extraction, transformation and loading for data warehouses, and operational applications. We will examine traditional methods of information retrieval and database management as well as new approaches that use massively parallel computation (MapReduce/Hadoop). Through readings, discussion, and hands-on experimentation, students will be prepared to discuss, plan, and implement storage, search and retrieval systems for large-scale structured and unstructured information systems using a variety of software tools. They will also be able to evaluate large-scale information storage and retrieval systems in terms of both efficiency and effectiveness in providing timely, accurate, and reliable access to needed information.

4. Applied Machine Learning
Skills: Experimental design / Working with machine learning algorithms / Feature engineering / Prediction vs. explanation / Network analysis / Collaborative filtering
Tools: Python / Python libraries for linear algebra, plotting, machine learning: numpy, matplotlib, scikit-learn / GitHub for submitting project code
Course description: Machine learning is a rapidly growing field at the intersection of computer science and statistics concerned with finding patterns in data. It is responsible for tremendous advances in technology, from personalized product recommendations to speech recognition in cell phones. This course provides a broad introduction to the key ideas in machine learning. The emphasis will be on intuition and practical examples rather than theoretical results, though some experience with probability, statistics, and linear algebra will be important.

5. Visualizing and Communicating Data
Skills: Exploratory data analysis / Effective written communication / Effective visual presentation of data / Design for human perception
Tools: Tableau / JavaScript / D3 / Illustrator / R/ggplot2 / Highcharts / Visit
Course description: Communicating clearly and effectively about the patterns we find in data is a key skill for a successful data scientist. This course focuses on the design and implementation of complementary visual and verbal representations of patterns and analyses in order to convey findings, answer questions, drive decisions, and provide persuasive evidence supported by data. Assignments will give students hands-on experience with designing and building data visualizations as well as reporting their findings in prose.

6. Field Experiments
Skills: Experimental design / Statistical analysis / Communicating results / Cleaning data / Mining and exploring data
Tools: R
Course description: This course introduces students to experimentation in the social sciences. This topic has increased considerably in importance since 1995, as researchers have learned to think creatively about how to generate data in more scientific ways, and developments in information technology have facilitated better data gathering. Key to this area of inquiry is the insight that correlation does not necessarily imply causality. In this course, we learn how to use experiments to establish causal effects, and how to be appropriately skeptical of findings from observational data.

7. Legal, Policy, and Ethical Considerations for Data Scientists
Skills: Ethical and legal frameworks / Policy analysis / Oral and written presentation
Course description: This course provides an introduction to the legal, policy, and ethical implications of data. The course will examine legal, policy, and ethical issues that arise throughout the full life cycle of data science, from collection to storage, processing, analysis, and use, including privacy, surveillance, security, classification, discrimination, decisional autonomy, and duties to warn or act. Case studies will be used to explore these issues across various domains such as criminal justice, national security, health, marketing, politics, education, automotive, employment, athletics, and development. Attention will be paid to legal and policy constraints and considerations that attach to specific domains as well as particular data types, collection methods, and institutions. Technical, legal, and market approaches to mitigating and managing discrete and compound sets of concerns will be introduced, and the strengths and benefits of competing and complementary approaches will be explored.

8. Scaling Up! Really Big Data
Skills: Working with data at scale
Tools: D-Streams / Apache Pig / OpenStack components and OpenStack Heat specifically / CloudSoft Brooklyn / Apache Storm
Course description: This course provides a hands-on introduction to very large-scale data and the practical issues surrounding how the data is stored, processed, and analyzed. Students will work with cloud computing systems, large data collections, and high-velocity data streams. The class material will be introduced gradually as it helps students accomplish their projects and assignments throughout the course. Hands-on activities will enable the students to learn the practical toolkit of a big data specialist, e.g. Hadoop, Apache Spark, NoSQL databases, distributed file systems, large-scale object storage systems, and many others.

9. Synthetic Capstone Course
Skills: Project scoping, planning and management / Data acquisition and analysis / Communication / Teamwork / Influence in organizations / Design thinking for data science
Course description: In the capstone class, students will synthesize technical, analytic, interpretive, and social dimensions to design and execute a full data science project in which they develop and demonstrate their skills at synthesis. The final project is designed to integrate all of the core skills and concepts learned throughout the program and prepare students to compete in the professional data science job market. It provides experience in formulating and carrying out a sustained, coherent, and significant course of work resulting in a tangibl
