Data Wrangling in R Using dplyr and tidyr

This workshop is intended for people who are comfortable with the material covered in our Introductory Statistics Using R workshop. The dplyr and tidyr packages are versatile tools for efficiently manipulating datasets in R, the free statistical software environment. These packages provide a small set of “verbs”: functions that correspond to the most common data manipulation tasks. The verbs are designed to be intuitive and to help users translate their thoughts into code. In this workshop we will use a series of hands-on examples to cover:

  • Subsetting by observations (rows) or by variables (columns)
  • Creating new variables as functions of existing variables
  • Aggregating/summarizing observations
  • Reshaping (pivoting) from wide to tall or from tall to wide formats
  • Concatenating data from several tables
  • Merging/joining data from several tables
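
As a rough preview of how the topics above map onto dplyr and tidyr verbs, here is a minimal sketch. The data (scores, new_students, instructors) and all column names are hypothetical and invented for illustration; they are not the workshop's actual materials. It assumes dplyr and a version of tidyr that provides pivot_longer() and pivot_wider() (1.0.0 or later).

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per student, one column per exam (wide format)
scores <- tibble(
  student = c("Ana", "Ben", "Cy", "Dee"),
  section = c("A", "A", "B", "B"),
  exam1   = c(80, 65, 90, 70),
  exam2   = c(85, 75, 95, 60)
)

# Subset by observations (filter) and by variables (select)
section_a <- scores %>%
  filter(section == "A") %>%
  select(student, exam1)

# Create a new variable as a function of existing variables
scores <- scores %>%
  mutate(average = (exam1 + exam2) / 2)

# Aggregate: collapse to one summary row per section
section_means <- scores %>%
  group_by(section) %>%
  summarize(mean_average = mean(average))

# Reshape from wide to tall, then from tall back to wide
tall <- scores %>%
  pivot_longer(cols = c(exam1, exam2), names_to = "exam", values_to = "score")
wide <- tall %>%
  pivot_wider(names_from = exam, values_from = score)

# Concatenate rows from a second table that has the same columns
new_students <- tibble(
  student = "Eli", section = "A", exam1 = 88, exam2 = 92, average = 90
)
all_scores <- bind_rows(scores, new_students)

# Join: look up each student's instructor by matching on section
instructors <- tibble(
  section    = c("A", "B"),
  instructor = c("Ruiz", "Okafor")
)
joined <- left_join(all_scores, instructors, by = "section")
```

Each step uses one verb per task, so the code reads close to the plain-language description of what is being done. A left join is only one of several join types dplyr offers; it is used here because it keeps every row of the student table.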

This workshop is designed to follow the Data Wrangling workshop, a lecture covering conceptual methods that can be applied in any software.