Introduction to genome-wide association studies (GWAS)
https://www.physalia-courses.org/courses-workshops/course49/
Berlin, 2-6 March 2020
General Topic: Bioinformatics pipeline for GWAS analysis
OVERVIEW
This course will introduce students, researchers and professionals to the steps needed to build an analysis pipeline for Genome-Wide Association Studies (GWAS). The course will describe all the steps involved in a typical GWAS, which will then be used to build a reusable and reproducible bioinformatics pipeline.
FORMAT
The course is structured in modules over five days. Each day will include introductory lectures with class discussions of key concepts. The remainder of each day will consist of practical hands-on sessions. These sessions will combine mirroring exercises, in which you follow the instructor as a skill is demonstrated, with individual exercises in which you apply these skills on your own. During and after each exercise, results will be interpreted and discussed as a group.
TARGET AUDIENCE & ASSUMED BACKGROUND
The course is aimed at students, researchers and professionals interested in learning the different steps involved in a GWAS and in using them to build a structured pipeline for semi-automated and reproducible GWAS analyses. It will include information useful for both beginners and more advanced users. We will start by introducing general concepts of GWAS and bioinformatics pipeline building, then progressively describe each step and put them together seamlessly in a general workflow. Attendees should have a background in biology, specifically genetics; previous exposure to GWAS experiments would also be beneficial. There will be a mix of lectures and hands-on practical exercises using R, the Linux command line and custom software. Some basic understanding of R programming and Unix will be advantageous. Attendees should also have some basic familiarity with genomic data such as those arising from NGS experiments.
LEARNING OUTCOMES
* Understanding the different steps involved in a typical GWAS analysis and how to combine them into a general workflow / bioinformatics pipeline
CURRICULUM
Monday 9:30-17:30
9:30 Lecture 0 General Introduction / Overview of the course
10:00 Lecture 1 Introduction to GWAS: Linkage disequilibrium and Linear Regression
11:00 coffee break
11:15 Lecture 2 Introduction to GWAS: Linkage disequilibrium and Linear Regression
12:00 Lecture 3 GWAS: case studies / examples from literature
Lab 2 - part 1 basic Linux and R
13:00 lunch break
14:00 Lab 2 - part 2 Practicalities and set-up (server, GitHub repo, conda envs, etc.) and description of datasets
15:00 Lab 2 - part 3 basic Linux and R
15:30 coffee break
16:00 Lab 3 GWAS: basic models
16:30 Lab 3 (demonstration) GWAS: basic models (linear and logistic regression, population structure, etc.)
Tuesday 9:30-17:30
9:30 Lecture 4 EDA: theory
10:00 Lab 4 EDA practice
11:00 coffee break
11:15 Lecture 5 data preprocessing: theory
12:00 Lab 5 data preprocessing: practice
13:00 lunch break
14:00 Lecture 6 Imputation of missing genotypes: theory
15:00 Lab 6 - part 1 practical session on imputation (Beagle)
15:30 coffee break
16:00 Lab 6 - part 2 practical session on imputation (Beagle)
16:30 Lab 7 (demonstration) KNNI imputation
Wednesday 9:30-17:30
9:30 Lecture 7 GWAS, the full model (all SNPs)
11:00 coffee break
11:15 Lab 9 (demonstration) a few steps in the past (GenABEL)
12:00 Lab 10 GWAS: the stand-alone script(s) for the full model
13:00 lunch break
14:00 Lecture 8 GWAS: experimental design and statistical power
15:00 coffee break
15:30 Lecture 9 The multiple testing issue?
16:30 Lab 10 revising the steps involved in GWAS
Thursday 9:30-17:30
9:30 Lecture 10 Bioinformatics pipelines: a super-elementary introduction
10:30 Lab 11 Building a pipeline with Snakemake
11:15 coffee break
11:30 Lab 12 The GWAS pipeline for continuous phenotype
Lab 13 The GWAS pipeline for binary phenotype
13:00 lunch break
14:00 Lab 14 Introducing the assignment (+ light touch on RMarkdown)
14:30 Assignment (groups) Build your own GWAS pipeline on new data
15:30 coffee break
16:00 Assignment (groups) Build your own GWAS pipeline on new data
Friday 9:30-17:30
9:30 Assignment (groups) Build your own GWAS pipeline on new data
11:00 coffee break
11:30 Assignment (groups) Build your own GWAS pipeline on new data
12:30 Assignment (groups) Presenting and discussing results
13:00 lunch break
14:00 Assignment (groups) Presenting and discussing results
15:30 coffee break
16:00 Lecture 11 A light touch on post-GWAS analysis
17:00 Lecture 12 A glimpse at ROH-based alternatives