Statistics and R for the Life Sciences from Harvard University
An introduction to basic statistical concepts and R programming skills necessary for analyzing data in the life sciences.
We will learn the basics of statistical inference in order to understand and compute p-values and confidence intervals. We will provide examples by programming in R in a way that helps connect concepts to their implementation. Problem sets requiring R programming will be used to test understanding and the ability to implement basic data analyses. We will use visualization techniques to explore new data sets and determine the most appropriate approach. We will describe robust statistical techniques as alternatives when data do not fit the assumptions required by the standard approaches. We will also introduce the basics of using R scripts to conduct reproducible research.
Topics:
- Distributions
- Exploratory Data Analysis
- Inference
- Non-parametric statistics
This 5-week course is the first in an eight-part series on Data Analysis for Genomics (version 2):
PH525.1x: Statistics and R for the Life Sciences
PH525.2x: Introduction to Linear Models and Matrix Algebra
PH525.3x: Advanced Statistics for the Life Sciences
PH525.4x: Introduction to Bioconductor
PH525.5x: Case study: RNA-seq data analysis
PH525.6x: Case study: Variant Discovery and Genotyping
PH525.7x: Case study: ChIP-seq data analysis
PH525.8x: Case study: DNA methylation data analysis
Version 2 of Data Analysis for Genomics is based on the book from version 1 and on a lot of feedback, including this one.