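Data Science (Reprint Edition, English) [by Schutt and O'Neil (US)], 2014 edition
About This Resource
Data Science (Reprint Edition, English)
Authors: Schutt and O'Neil (US)
Publication date: 2014
Synopsis
People now recognize that data can make the difference in an election or a business model, and data science as a profession keeps growing. But how do you get started in such a broad and intricate interdisciplinary field? Data Science (Reprint Edition), by Schutt and O'Neil, tells you everything you need to know. Rich in insight, it is compiled from the lecture notes of Columbia University's data science course.
Table of Contents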
Preface
1. Introduction: What Is Data Science?
Big Data and Data Science Hype
Getting Past the Hype
Why Now?
Datafication
The Current Landscape (with a Little History)
Data Science Jobs
A Data Science Profile
Thought Experiment: Meta-Definition
OK, So What Is a Data Scientist, Really?
In Academia
In Industry
2. Statistical Inference, Exploratory Data Analysis, and the Data Science Process
Statistical Thinking in the Age of Big Data
Statistical Inference
Populations and Samples
Populations and Samples of Big Data
Big Data Can Mean Big Assumptions
Modeling
Exploratory Data Analysis
Philosophy of Exploratory Data Analysis
Exercise: EDA
The Data Science Process
A Data Scientist's Role in This Process
Thought Experiment: How Would You Simulate Chaos?
Case Study: RealDirect
How Does RealDirect Make Money?
Exercise: RealDirect Data Strategy
3. Algorithms
Machine Learning Algorithms
Three Basic Algorithms
Linear Regression
k-Nearest Neighbors (k-NN)
k-means
Exercise: Basic Machine Learning Algorithms
Solutions
Summing It All Up
Thought Experiment: Automated Statistician
4. Spam Filters, Naive Bayes, and Wrangling
Thought Experiment: Learning by Example
Why Won't Linear Regression Work for Filtering Spam?
How About k-nearest Neighbors?
Naive Bayes
Bayes Law
A Spam Filter for Individual Words
A Spam Filter That Combines Words: Naive Bayes
Fancy It Up: Laplace Smoothing
Comparing Naive Bayes to k-NN
Sample Code in bash
Scraping the Web: APIs and Other Tools
Jake's Exercise: Naive Bayes for Article Classification
Sample R Code for Dealing with the NYT API
5. Logistic Regression
Thought Experiments
Classifiers
Runtime
You
Interpretability
Scalability
M6D Logistic Regression Case Study
Click Models
The Underlying Math
6. Time Stamps and Financial Modeling
7. Extracting Meaning from Data
8. Recommendation Engines: Building a User-Facing Data Product at Scale
9. Data Visualization and Fraud Detection
10. Social Networks and Data Journalism
11. Causality
12. Epidemiology
13. Lessons Learned from Data Competitions: Data Leakage and Model Evaluation
14. Data Engineering: MapReduce, Pregel, and Hadoop
15. The Students Speak
16. Next-Generation Data Scientists, Hubris, and Ethics
Index
Related Materials
- 基于工业互联网的SSM项目实战 物料订单管理系统 天津滨海迅腾科技集团有限公司 主编 2018年版
- 数据产品经理高效学习手册 产品设计、技术常识与机器学习 张威 2020年版
- 智慧中国 中国IT产业投资路线图 [尹沿技 著] 2012年版
- 最新数字媒体技术丛书 手机游戏产业与产品 [吴起 著] 2010年版
- 源码中国 全球IT外包新原点 [(瑞)埃尔钦汗 著] 2011年版
- 疯狂的站长 从穷站长到富站长 [温世豪 著] 2011年版
- 电竞简史 徐丽 2020年版
- 码链 大变局中遇见未来 徐蔚 2021年版
- 认识编程:以Python语言讲透编程的本质 郭屹 2021年版
- ChatGPT:读懂AI爆发背后的技术和产业逻辑 项立刚 2023年版