
Reproducible Research and Automatic Reports

A major principle of the scientific method is replication: the ability of an experiment or study to be duplicated. Closely related is the idea of "reproducible research," in which the results of a study can be reproduced from the raw data and the analysis protocols or methods (usually in code, script, or syntax form). This transparency lets the reader evaluate every analysis decision that led to the study's conclusions.

Reproducible research has recently gained attention in the research world because of the growing number of published articles, the increasing complexity of data analysis, and an alarming rise in retraction rates.

A well-maintained, reproducible data analysis workflow is also a major productivity boost. A syntax-based approach permanently ties the protocol documentation to the actions performed to produce the results, reducing confusion about which steps were taken. Figures and tables can be recreated easily, so they can be modified without starting from scratch. Identical analyses can also be run on similar data sets with little effort, reducing the time needed to repeat analyses on updated data or on data that are produced regularly (e.g. daily, weekly, or annually).
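As a rough illustration of the scripted approach described above, here is a minimal Python sketch. It is not taken from the workshop materials: the function name build_report, the column names group and value, and the sample data are all hypothetical, and the same idea applies in any scripted statistical package. The point is only that one function documents every analysis step and can be rerun unchanged on each new batch of data.

```python
import csv
import io
import statistics


def build_report(csv_text, title):
    """Return a plain-text summary for CSV text with 'group' and 'value' columns.

    The column names are assumptions made for this sketch.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))

    # Collect the numeric values for each group.
    groups = {}
    for row in rows:
        groups.setdefault(row["group"], []).append(float(row["value"]))

    # Build the report: a title, an observation count, and one table row per group.
    lines = [title, "=" * len(title), f"Observations: {len(rows)}", ""]
    lines.append(f"{'group':<10}{'n':>5}{'mean':>10}")
    for name in sorted(groups):
        values = groups[name]
        lines.append(f"{name:<10}{len(values):>5}{statistics.mean(values):>10.2f}")
    return "\n".join(lines)


# Hypothetical weekly data; in practice each batch would be read from a file.
week_1 = "group,value\ncontrol,4.0\ncontrol,6.0\ntreated,9.0\n"
week_2 = "group,value\ncontrol,5.0\ntreated,8.0\ntreated,10.0\n"

# The identical analysis is applied to each batch with no manual steps.
for label, data in [("Week 1", week_1), ("Week 2", week_2)]:
    print(build_report(data, f"Summary report: {label}"))
    print()
```

Because the table is generated by code rather than assembled by hand, changing the summary (for example, adding a standard deviation column) means editing one line and rerunning the script on every data set.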

This workshop will cover:

  • an overview of reproducible research
  • how to plan and execute data management and analysis workflows
  • the advantages of working within a scripted or syntax environment
  • tools for producing automatic reports in several software packages

Fee: None for members of the Cornell community, but registration is required. Space is limited, so early registration is encouraged.

For times and locations of upcoming workshops, please see the Workshop Schedule.