RNA-seq's strange genes
4 days ago
SeoG • 0

Hello, this is my first bulk RNA-seq analysis.

I have multiple replicates of the same sample, and there are genes where the read counts vary by more than 10-fold. Should I filter out these genes?

Thank you

Rna-seq • 282 views

More context is needed. What are the genes for? What is the range of read counts? Expression from 10 reads to 100 reads among replicates could be explained biologically. What is the variation in total number of reads per replicate?

Generally, I would say removing data because the variation doesn't agree with your a priori assumptions is a bad start, but it's impossible to say in this example without more information.

3 days ago

The post made me think about how often one should expect to see a 10-fold change in a realistic RNA-seq experiment.

So I went ahead and simulated realistic RNA-seq counts with the PROPER package:

Published as: Wu H, Wang C, Wu Z (2014). "PROPER: Comprehensive Power Evaluation for Differential Expression using RNA-seq." Bioinformatics.

In my run with 3 replicates per group, out of 20K genes, 24 genes had a fold change of more than 10x, defined as abs(Avg(A)/Avg(B)).

9 out of 24 were false positives, and the rest were true positives.

In conclusion, I would let the statistical method sort it out and then investigate the results for those that seem unexpected rather than filtering out a priori.
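For anyone who wants to try a similar experiment without PROPER, here is a minimal Python sketch. It is not the PROPER simulation and will not reproduce the numbers above: it draws negative binomial counts with NumPy, and the baseline expression distribution, dispersion, fraction of truly differential genes, effect sizes, and pseudocount are all assumed values chosen for illustration. It counts how many genes show a more than 10-fold difference between group averages, in either direction, and how many of those were simulated as truly differential.

```python
import numpy as np

def simulate_large_fold_changes(n_genes=20000, n_reps=3, frac_de=0.05,
                                dispersion=0.2, threshold=10.0, seed=0):
    """Count genes whose group averages differ by more than `threshold`-fold.

    All parameter defaults are illustrative assumptions, not PROPER's settings.
    """
    rng = np.random.default_rng(seed)

    # Assumed baseline expression: log-normal spread of mean counts.
    base_mean = rng.lognormal(mean=4.0, sigma=1.5, size=n_genes)

    # A random subset of genes is truly differential between groups A and B.
    is_de = rng.random(n_genes) < frac_de
    log2_fc = np.where(is_de, rng.normal(0.0, 1.5, size=n_genes), 0.0)
    mean_a = base_mean
    mean_b = base_mean * 2.0 ** log2_fc

    # Negative binomial with mean mu and variance mu + dispersion * mu**2.
    nb_size = 1.0 / dispersion

    def draw(mu):
        p = nb_size / (nb_size + mu)
        return rng.negative_binomial(nb_size, p[:, None], size=(n_genes, n_reps))

    counts_a = draw(mean_a)
    counts_b = draw(mean_b)

    # Small pseudocount (assumed) so zero averages do not divide by zero.
    avg_a = counts_a.mean(axis=1) + 0.5
    avg_b = counts_b.mean(axis=1) + 0.5

    # Symmetric test: more than threshold-fold up or down.
    large = np.abs(np.log10(avg_a / avg_b)) > np.log10(threshold)

    return {
        "large_fc": int(large.sum()),
        "true_de": int((large & is_de).sum()),
        "not_de": int((large & ~is_de).sum()),
    }

result = simulate_large_fold_changes()
print(result)
```

Changing `dispersion` or `n_reps` shows how quickly the number of large fold changes among non-differential genes grows when replicates are few or noisy, which is the reason to let the statistical method weigh the evidence rather than filtering by fold change alone.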


Thank you for your answer.
