R for Data Science: Import, Tidy, Transform, Visualize, and Model Data

by Hadley Wickham (Author), Garrett Grolemund (Author)

Description

"This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience"--Page 4 of cover.

Recommendations

Practical Statistics for Data Scientists: 50 Essential Concepts by Peter Bruce

Hands-On Programming with R: Write Your Own Functions and Simulations by Garrett Grolemund

Advanced R. 2nd edition by Hadley Wickham

Mastering Regular Expressions by Jeffrey E. F. Friedl

Data Science from Scratch: First Principles with Python by Joel Grus

Learning R: A Step-by-Step Function Guide to Data Analysis by Richard Cotton

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython by Wes McKinney

An Introduction to Statistical Learning: with Applications in R by Gareth James

R in Action by Robert Kabacoff

ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham

R Cookbook by Paul Teetor

Python Data Science Handbook: Essential Tools for Working with Data by Jake VanderPlas

Doing Data Science: Straight Talk from the Frontline by Cathy O'Neil

The C Programming Language (2nd Edition) by Brian W. Kernighan

R Graphics Cookbook: Practical Recipes for Visualizing Data by Winston Chang

The Art of R Programming: A Tour of Statistical Software Design by Norman Matloff

The Visual Display of Quantitative Information by Edward R. Tufte

Data Science at the Command Line: Facing the Future with Time-Tested Tools by Jeroen Janssens

Data Mining: Practical Machine Learning Tools and Techniques by Ian H. Witten

The R Book by Michael J. Crawley

Introduction to Algorithms by Thomas H. Cormen

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron

Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma

The Pragmatic Programmer: From Journeyman to Master by Andrew Hunt

The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie

R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics by J. D. Long

Code Complete: A Practical Handbook of Software Construction by Steve McConnell

Text Mining with R: A Tidy Approach by Julia Silge

Discovering Statistics Using R by Andy Field

Practical Data Science with R by Nina Zumel

Applied Predictive Modeling by Max Kuhn

Programming Collective Intelligence: Building Smart Web 2.0 Applications by Toby Segaran

R in a Nutshell: A Desktop Quick Reference by Joseph Adler

Think Stats by Allen B. Downey

Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking by Foster Provost

R for Everyone: Advanced Analytics and Graphics by Jared P. Lander

Naked Statistics: Stripping the Dread from the Data by Charles Wheelan

Code: The Hidden Language of Computer Hardware and Software by Charles Petzold

The Art of Statistics: How to Learn from Data by David Spiegelhalter

Data Analysis Using Regression and Multilevel/Hierarchical Models by Andrew Gelman

Machine Learning for Hackers by Drew Conway

The Art of Computer Programming, Volume 1: Fundamental Algorithms by Donald E. Knuth

Fluent Python: Clear, Concise, and Effective Programming by Luciano Ramalho

Statistical Rethinking: A Bayesian Course with Examples in R and Stan by Richard McElreath

The Model Thinker: What You Need to Know to Make Data Work for You by Scott E. Page

Data Visualization: A Practical Introduction by Kieran Healy

Interactive Data Visualization for the Web by Scott Murray

Deep Learning by Ian Goodfellow

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

The Signal and the Noise: Why So Many Predictions Fail-But Some Don't by Nate Silver

Member Recommendations

ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham

sergiouribe Data science + data visualization

Member Reviews

5 reviews

scottjpearson 906 reviews

If the above quote is the mission of this book, consider the task accomplished. Where most books in computer science fall down in trying to be cute while communicating an educational message, this book addresses the task of education about R squarely, and it does so in a manner that engages the mind with interesting problems.

Usually, I skip the exercises sections of most computer books because, well, they offer challenges that are underwhelming. Recall is all that is required to answer them. Usually, I can figure them out in the confines of my mind so that I don't have to waste my time looking up the answers or coding example code to check whether I'm right or where I err.

Not so for Hadley Wickham. Many of his questions awakened my curiosity and had me applying my new knowledge in RStudio immediately. In fact, the only way I could answer my burning curiosity was to write code in order to test my hypotheses.

Rare is the computer book that is a page turner. This book qualifies as just that if one has the aptitude in statistics to embrace the challenges. R is an ideal language to handle these challenges in statistics, and Wickham and Grolemund fill the role of ideal apostles/evangelists to share this free fruit.

The fun part about R is that it is free, creative, and well-supplied with packages to solve interesting statistical problems. This book carries that message squarely to my lap (and then to my brain) in an engaging manner.

Jan 25, 2020

encephalical 334 reviews

This is one of the best O'Reilly books I've read. For context, I'm a graphics programmer that fell into sci vis, e.g., visualizing fluid simulations, and is now pivoting into info vis.

Part I: Explore gives an overview of using R+ggplot2+some tidyverse to do exploratory data analysis. It is one of the best intro overview dives I've come across for any type of programming. Most dives of this sort have at least one or two gaps in material or unclear motivation or try to do too much. This was perfectly crafted to lead someone into the tidyverse.

Part II: Wrangle is a more thorough look at the tidyverse. I recommend supplementing this by reading Wickham's original paper on tidy data.

Part III: Program was a little tedious because I already have decades of programming experience, though the coverage of purrr is interesting.

Part IV: Model covers building linear and nonlinear models. I don't have a statistics background but even so found this easy to follow and very clear.

Part V: Communicate is a smorgasbord of R Markdown and options building on top of it. I thought this section had a bit of a conflicting message to end on, because after 400 some pages of doing work in RStudio with .R script files, the authors all of a sudden seem to say to forget all that and do everything as R Markdown. Which is fine, but if that's their recommendation I think introducing that earlier would have been better.

There are some copy editing issues; luckily, Wickham has an updated online edition with corrections. Some of the exercises weren't entirely clear as to intent, but that could entirely be due to my lacking stats background. (Plenty of people have posted solutions online if you get stuck.)

Mar 1, 2019 (Edited)

markm2315 726 reviews

Like a week-long workshop with the authors, this book presents data analysis in terms of the R packages in the tidyverse. I don't think you can read it and fail to learn a lot. It has an especially nice organized approach to data import and non-tidy data. I think I would recommend it to almost anyone who does some data analysis. My only caveat would be that although you could start learning R with this book, it might be a difficult and non-traditional path for some complete beginners.

Jul 1, 2023

sergiouribe 100 reviews

The best book for data science, by Wickham, creator of a whole new language for reshaping, visualizing, and summarizing data in order to extract information from it.

I have taken several of Grolemund's courses, and what stands out is that he goes from the simple to the complex. For example, the HarvardX course starts with... FUNCTIONS. Some of us use R to process small amounts of data, as in epidemiological or clinical studies, compared with those who process data from Facebook or Google, which runs to terabytes of information. In this book, functions come in part 15. In other words, this book really teaches from less to more, starting with the easy and simple and working up to the difficult and complex, which is usually more useful.

For example, in base R, sorting would look something like

df[order(df$recuento,decreasing=TRUE), ]

while with dplyr it would be
arrange(df, desc(recuento))

which a human can read: arrange (the data frame, in descending order by the variable recuento).
Being able to do without the [] makes writing any code much faster.

The quality of the book is perfect, with several colors highlighting different parts of the code to show how they work.

It is an indispensable book for anyone who has to analyze data.

Oct 29, 2017

sashame 306 reviews

if ur gonna use R, its probably the best resource out there, aside from wickham's advanced R guide; but the language is so antiquated and outdated, with so many issues in its fundamental data structures, that its frustrating to pretend it can b an elegant front end for research development; it seems like wickham's energy would b better directed towards developing an R2 (couldn't u just ship a wrapper to make R1 packages compatible?) rather than trying to patch R as it is w more and more packages to try to smooth things over

Jun 8, 2019

Members

Recently Added By: gertjw, Echang, icrbooks, bbtp, dfo_1906, palm_ojl, bdkl, yoyomel

Author Information

Hadley Wickham

Author

6+ Works 529 Members

Hadley Wickham is Chief Scientist at RStudio, an Adjunct Professor at Stanford University and the University of Auckland, and a member of the R Foundation. He is the lead developer of the tidyverse, a collection of R packages, including ggplot2 and dplyr, designed to support data science. He is also the author of R for Data Science (with Garrett Grolemund), R Packages, and ggplot2: Elegant Graphics for Data Analysis.

Garrett Grolemund

Author

2 Works 358 Members

Garrett Grolemund is a statistician, teacher, and R developer who works as a data scientist and Master Instructor at RStudio. Garrett received his PhD at Rice University, where his research traced the origins of data analysis as a cognitive process and identified how attentional and epistemological concerns guide every data analysis.

Work Relationships

Is expanded in

R for Data Science: Import, Tidy, Transform, Visualize, and Model Data [2nd edition] by Hadley Wickham

Common Knowledge

Original publication date: 2017
Original language: English

Statistics

Members: 294
Popularity: 109,253
Reviews: 5
Rating: 4.56

Languages: English, German
Media: Paper, Ebook
ISBNs: 12
ASINs: 4
