RNA-Seq for DE analysis
8.7 years ago
debitboro ▴ 270

Hi all,

I have paired-end RNA-Seq data obtained from Illumina sequencing of 40 tumor tissues and their corresponding normal tissues (so I have 2 x 2 x 40 = 160 fastq.gz files). I want to perform a DE analysis to detect differences in expression between the normal and tumor tissues. Could you suggest a suitable pipeline for this situation?
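For reference, the file arithmetic above (2 reads x 2 tissues x 40 patients) can be written out as a sample sheet. This is only a sketch: the patient IDs and file names are hypothetical placeholders, not the real ones. Keeping a patient column is useful because each tumor has a matched normal, so a DE tool can treat the design as paired (for example, a DESeq2 design along the lines of ~ patient + condition).

```python
from itertools import product


def build_sample_sheet(n_patients=40):
    """Return one row per sample with its patient ID, condition, and FASTQ pair."""
    rows = []
    for patient, condition in product(range(1, n_patients + 1), ("normal", "tumor")):
        patient_id = f"P{patient:02d}"  # hypothetical ID scheme
        sample = f"{patient_id}_{condition}"
        rows.append({
            "sample": sample,
            "patient": patient_id,
            "condition": condition,
            # Hypothetical file names; substitute the real ones.
            "fastq_1": f"{sample}_R1.fastq.gz",
            "fastq_2": f"{sample}_R2.fastq.gz",
        })
    return rows


sheet = build_sample_sheet()
n_files = 2 * len(sheet)  # two FASTQ files (R1 and R2) per sample
print(len(sheet), "samples,", n_files, "FASTQ files")
```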

Thanks to all.

RNA-Seq Differential Expression Analysis • 5.1k views
ADD COMMENT
0

Maybe this paper will help you understand the whole picture.

http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004393

ADD REPLY
1
8.7 years ago
fernardo ▴ 180

I assume you already know how to get from alignment to gene counts.

Then use the following R packages to do DE analysis:

1- DESeq2

or

2- edgeR

This pipeline/paper would definitely be a great help. Check it out.

Hope it helps.

ADD COMMENT
1
8.7 years ago
Pam ▴ 30

Hi debitboro,

I assume that you only have the raw data and need help analyzing it right from the beginning?

OK, first you need to quantify expression. I personally use Sailfish, which is very fast.

The counts in the Sailfish output file can then be used in DE pipelines/packages (like DESeq2), as fernardo suggested.

ADD COMMENT
2

I would suggest Salmon here, which is a more recent tool made by the same lab and gives you estimated raw counts for all your samples. Within a few hours you will have both the expression values (TPM) and the estimated raw counts for each sample. Build one matrix for the TPM values and one for the raw counts, round the raw counts to the nearest integer in R, and then use your preferred tool for the DE analysis, be it edgeR or DESeq2. That choice is entirely up to the user.
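The matrix-building step described above can be sketched as follows. The post does the rounding in R; this sketch uses Python instead and relies only on the standard `quant.sf` columns `Name`, `TPM` and `NumReads`. The function names and the sample-to-path mapping are illustrative assumptions, not part of Salmon.

```python
import csv


def read_quant(path):
    """Read one Salmon quant.sf file into {transcript: (tpm, num_reads)}."""
    values = {}
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            values[row["Name"]] = (float(row["TPM"]), float(row["NumReads"]))
    return values


def build_matrices(quant_files):
    """Combine per-sample quant.sf files into TPM and rounded-count matrices.

    quant_files maps sample name -> path to that sample's quant.sf.
    Returns (transcripts, samples, tpm, counts); tpm and counts are dicts
    keyed by transcript, each holding one value per sample (in `samples` order).
    """
    samples = sorted(quant_files)
    per_sample = {s: read_quant(quant_files[s]) for s in samples}
    transcripts = sorted(set().union(*[set(v) for v in per_sample.values()]))
    missing = (0.0, 0.0)  # transcripts absent from a sample are treated as zero
    tpm = {t: [per_sample[s].get(t, missing)[0] for s in samples]
           for t in transcripts}
    # Estimated counts are fractional, so round them to integers for DE tools.
    counts = {t: [int(round(per_sample[s].get(t, missing)[1])) for s in samples]
              for t in transcripts}
    return transcripts, samples, tpm, counts


def write_matrix(path, samples, matrix):
    """Write a transcript-by-sample matrix as a tab-separated file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["transcript"] + samples)
        for name in sorted(matrix):
            writer.writerow([name] + matrix[name])


# Example usage (paths are hypothetical):
# _, samples, tpm, counts = build_matrices({"P01_tumor": "P01_tumor/quant.sf"})
# write_matrix("counts.tsv", samples, counts)
```

Note that these are transcript-level matrices; a gene-level analysis needs a further summarization step.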

ADD REPLY
2

Thanks for the mention, vchris_ngs! I should note here that, though Salmon includes features (and models certain types of bias like non-uniform read start distributions) that are not available in Sailfish, I still actively maintain Sailfish and backport the most relevant improvements from Salmon. This means that both Sailfish and Salmon should give highly accurate estimates very quickly. I intend to support and update both pieces of software as long as there is a user-base interested in me doing so, though I generally expect fancy new features to hit Salmon before Sailfish ;P.

ADD REPLY
1

I am always interested in lightning-fast methods that let me get the DE analysis done and then focus mostly on the downstream analysis of the DE genes, and your methods serve that purpose by giving me both expression values and raw read counts. If the OP needs a helper script for creating a matrix file from all samples, they can write to me here and I can provide one.

ADD REPLY
0

Actually, you should not use Sailfish output directly for DESeq2 (see discussion here: https://support.bioconductor.org/p/63103/ ). You need to do some additional processing with something like tximport ( http://bioconductor.org/packages/devel/bioc/vignettes/tximport/inst/doc/tximport.html ).
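To illustrate the kind of processing involved, here is a minimal sketch of the simplest part of that step: summing transcript-level estimated counts up to gene level using a transcript-to-gene table. It is written in Python for illustration and is not a replacement for tximport, which runs in R and can additionally account for differences in average transcript length between samples. The function name, the transcript IDs and the gene IDs are made up.

```python
from collections import defaultdict


def summarize_to_genes(tx_counts, tx2gene):
    """Sum transcript-level estimated counts per gene.

    tx_counts maps transcript ID -> estimated count for one sample.
    tx2gene maps transcript ID -> gene ID.
    Returns (gene_counts, unmapped), where unmapped lists the transcripts
    that have no entry in tx2gene and were therefore skipped.
    """
    gene_counts = defaultdict(float)
    unmapped = []
    for tx, count in tx_counts.items():
        gene = tx2gene.get(tx)
        if gene is None:
            unmapped.append(tx)
            continue
        gene_counts[gene] += count
    return dict(gene_counts), unmapped


# Toy data: tx1 and tx2 belong to geneA, tx3 to geneB, tx4 is unannotated.
tx_counts = {"tx1": 12.5, "tx2": 7.5, "tx3": 3.0, "tx4": 1.5}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
gene_counts, unmapped = summarize_to_genes(tx_counts, tx2gene)
print(gene_counts)
print(unmapped)
```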

ADD REPLY
1
8.7 years ago
CandiceChuDVM ★ 2.5k

A conventional start would be to work through the Tuxedo suite, following the instructions in the paper "Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks". It will guide you from read mapping through to visualizing the data.
[Figure: the Tuxedo suite]

However, please be aware of the updated Tuxedo suite tools (e.g. Bowtie2, Hisat2, StringTie, Ballgown).

If you don't mind, I have collected some online courses and papers in my regularly updated post:
Up-to-date RNA Sequence Analysis Training/Courses/Papers?

ADD COMMENT
2

As an author of Cufflinks I strongly recommend that you switch to kallisto http://www.nature.com/nbt/journal/vaop/ncurrent/full/nbt.3519.html and sleuth for differential analysis. You can start here http://pachterlab.github.io/sleuth/ with an introduction and example here https://rawgit.com/pachterlab/sleuth/master/inst/doc/intro.html

ADD REPLY
0

Isn't kallisto/sleuth doing everything on the transcript level? Most biologists expect the results on gene level. That's the big caveat.

ADD REPLY
0

With sleuth it is straightforward to examine quantification at the gene level. An imminent release will also add the ability to perform differential analysis directly at the gene level.

ADD REPLY
0

Great! Do you have an estimate of when that will be available?

ADD REPLY
0
8.7 years ago
BioRyder ▴ 220

Hello,

The Biostars post below will help you learn about RNA-Seq and DE analysis: A: Up-to-date Online RNA Sequence Analysis Training/Courses/Papers?

ADD COMMENT
0
8.7 years ago
phil.chapman ▴ 100

I would recommend reading the F1000Research article below by Mike Love (author of DESeq2) and Simon Anders (DESeq). It is a detailed workflow for analysing RNA-Seq data, written by two of the leaders in the field:

http://f1000research.com/articles/4-1070/v1

ADD COMMENT
