DGE analysis using stranded and unstranded RNA-seq libraries.
8.7 years ago
Sentinel156 ▴ 190

Hi all,

I am working with Illumina HiSeq 2000 100 bp single-end RNA-seq data. Some of my samples originate from unstranded libraries and some from stranded libraries. I'm trying to understand the best way to do read summarisation for these libraries using featureCounts for eventual DGE analysis. To date I have treated all datasets as unstranded for both mapping (TopHat) and counting (featureCounts).

However, I am concerned that read counts for my unstranded libraries will be biased for genes that have antisense transcripts, since reads originating from the antisense transcript will be merged into the counts for the gene on the sense strand wherever the two features overlap. So what is the recommended course of action here? I'm not interested in antisense transcripts, so should I continue to treat everything as unstranded for the featureCounts run? I have seen some other threads here that suggest incorporating strandedness into the DGE calculation as a factor in a multi-factor design, but I was hoping for a more thorough explanation of why this is the better workaround for this problem.
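To make the concern above concrete, here is a minimal toy sketch in Python. The locus, the read numbers, and the `count_gene` helper are all made up for illustration; it only shows how unstranded counting folds antisense reads into the sense gene's count, and it is not how featureCounts works internally.

```python
# Hypothetical toy locus: a gene annotated on the '+' strand, with an
# antisense transcript overlapping it on the '-' strand.
GENE_STRAND = "+"

# Each read is represented only by the strand of the transcript it came from.
# All reads are assumed to fall inside the overlapping region.
reads = ["+"] * 80 + ["-"] * 20  # 80 sense reads, 20 antisense reads


def count_gene(reads, gene_strand, stranded):
    """Count reads assigned to the gene under stranded or unstranded counting."""
    if stranded:
        # Stranded counting keeps only reads matching the gene's strand.
        return sum(1 for strand in reads if strand == gene_strand)
    # Unstranded counting cannot tell the two apart, so every overlapping read counts.
    return len(reads)


stranded_count = count_gene(reads, GENE_STRAND, stranded=True)
unstranded_count = count_gene(reads, GENE_STRAND, stranded=False)
print(stranded_count, unstranded_count)
```

In this toy case the unstranded count is inflated by exactly the number of antisense reads, which is the bias I am worried about for genes with overlapping antisense features.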

Thank you in advance.

8.7 years ago

Do you really think mapping stranded libraries as if they're unstranded and then doing the counting in an unstranded fashion gets rid of all possible bias? I expect not. That's why you'll see everyone suggesting that you align each sample as appropriate (stranded or not, depending on the sample), do the counting as appropriate (stranded or not, depending on the sample), and then add a batch effect to the model (with an interaction term if you're really concerned; have a look at a PCA plot).
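As a rough sketch of the "count each sample as appropriate" step, the Python snippet below builds one featureCounts command per sample. The sample sheet, file names, and the `featurecounts_cmd` helper are hypothetical. The `-s` option of featureCounts takes 0 (unstranded), 1 (stranded) or 2 (reversely stranded); which of 1 or 2 is right depends on the library prep protocol, and 2 is only assumed here for illustration.

```python
# Hypothetical sample sheet: file names, conditions and library types are made up.
samples = [
    {"bam": "ctrl_1.bam", "condition": "control", "library": "unstranded"},
    {"bam": "ctrl_2.bam", "condition": "control", "library": "stranded"},
    {"bam": "trt_1.bam", "condition": "treated", "library": "unstranded"},
    {"bam": "trt_2.bam", "condition": "treated", "library": "stranded"},
]

# featureCounts -s: 0 = unstranded, 1 = stranded, 2 = reversely stranded.
# Assumption: the stranded libraries here are reversely stranded (-s 2).
STRAND_FLAG = {"unstranded": 0, "stranded": 2}


def featurecounts_cmd(sample, annotation="genes.gtf"):
    """Build the featureCounts argument list for one sample (does not run it)."""
    out = sample["bam"].replace(".bam", ".counts.txt")
    return [
        "featureCounts",
        "-s", str(STRAND_FLAG[sample["library"]]),  # per-sample strandedness
        "-a", annotation,                           # annotation file (hypothetical name)
        "-o", out,                                  # output count table
        sample["bam"],
    ]


commands = [featurecounts_cmd(s) for s in samples]
for cmd in commands:
    print(" ".join(cmd))
```

The library type recorded in the sample sheet is then the same variable you would put into the DGE model as the batch factor.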


+1 for your answer. By the way, I think the interaction term [batch:condition] is really needed here, since antisense transcripts usually have the opposite expression dynamics to their sense counterparts. That is, if a gene is overexpressed in a condition, there is a good chance that its antisense will be underexpressed. So the batch effect is expected to vary across conditions, especially for the genes you are interested in, i.e., those that are differentially expressed across conditions.
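A toy Python sketch of why the batch effect would then differ between conditions. All counts are invented, and the anti-correlation between sense and antisense is taken as an assumption of this example rather than an established fact; `expected_count` and `design_row` are hypothetical helpers.

```python
# Toy expected read counts for one gene; all numbers are made up.
sense = {"control": 100, "treated": 200}
antisense = {"control": 50, "treated": 10}  # assumed anti-correlated with sense


def expected_count(condition, library):
    """Stranded libraries count sense reads only; unstranded ones also pick up antisense reads."""
    extra = antisense[condition] if library == "unstranded" else 0
    return sense[condition] + extra


# Batch effect (unstranded minus stranded) within each condition.
batch_effect = {
    c: expected_count(c, "unstranded") - expected_count(c, "stranded")
    for c in sense
}

# If the batch effect were the same in both conditions, this would be zero
# and an additive model (batch + condition) would be enough.
interaction = batch_effect["treated"] - batch_effect["control"]


def design_row(condition, library):
    """One row of a design matrix: intercept, batch, condition, batch:condition."""
    cond = 1 if condition == "treated" else 0
    batch = 1 if library == "unstranded" else 0
    return [1, batch, cond, batch * cond]


print(batch_effect, interaction)
print(design_row("treated", "unstranded"))
```

Here the batch effect is 50 reads in control but only 10 in treated, so the difference of -40 is what the batch:condition column has to absorb. If sense and antisense moved together instead, the sign of that difference would change, but it would generally still not be zero.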


Hi, I do not think these statements are true: "since antisense transcripts usually have opposite expression dynamics than their sense counterparts" and "in a condition, if a gene is overexpressed, there is a good chance that its antisense will be underexpressed". If you are talking about natural antisense transcripts (NATs) or non-coding antisense transcripts, anti-correlated expression is not a general phenomenon that you always find. The expression concordance between sense and antisense is context dependent (tissue, cell type, etc.).

Examples:

The landscape of antisense gene expression in human cancers

A cautionary tale of sense-antisense gene pairs: independent regulation despite inverse correlation of expression

Genome-wide Identification and Characterization of Natural Antisense Transcripts

Genome-wide analysis of expression modes and DNA methylation status at sense–antisense transcript loci in mouse

Sense-Antisense lncRNA Pair Encoded by Locus 6p22.3 Determines Neuroblastoma Susceptibility

Conserved expression of natural antisense transcripts in mammals.

This is not an answer to the main question; rather, it is a reply to the statement made in this post.


Well, the situation is perhaps more complex in higher eukaryotes, but I think that in simpler systems the anti-correlation between sense and antisense transcription is rather well established. There is, for instance, this recent paper:

Native elongating transcript sequencing reveals global anti-correlation between sense and antisense nascent transcription in fission yeast.


Hi,

My point was that there is evidence for both positive and negative correlation, backed by good publications. So there is no general rule that sense and antisense are globally anti-correlated or positively correlated. Many factors contribute to that (sometimes it is species dependent too).

Sorry, one more reference: Antisense Transcription in the Mammalian Transcriptome
