GC content for RNA-Seq data
7.4 years ago
sukesh1411 ▴ 30

Hi

I am new to RNA-seq data analysis. I have RNA-seq data from leaf and would like to perform transcriptome assembly using Trinity. Before doing so, I removed the adapters and low-quality bases. When I check the quality of the reads in FastQC, it reports errors for per base sequence content, per base GC content, and sequence duplication levels. I don't know how to normalize this data before performing de novo assembly.

FastQC results

Thanks

RNA-Seq • 8.0k views

FastQC is probably overreacting again; in my opinion, this doesn't look like something to worry about.

7.4 years ago
GenoMax 147k

Don't do anything. That is a very characteristic signature of RNA-seq data in FastQC. If you were overcautious in removing low-quality bases, dial back on that as well. You probably need to filter at Q20 or below if you truly have bad-quality data.
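To make the Q20 cutoff concrete, here is a minimal sketch of 3' quality trimming. It assumes Phred+33 encoded quality strings (the usual encoding in current Illumina FASTQ files) and a simple "drop trailing bases below the threshold" rule. The function names are hypothetical, and this is not the algorithm of any particular trimming tool.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores.

    Assumes Phred+33 encoding unless another offset is given.
    """
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(sequence, quality_string, threshold=20, offset=33):
    """Remove trailing bases whose Phred score is below the threshold."""
    scores = phred_scores(quality_string, offset)
    end = len(scores)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return sequence[:end], quality_string[:end]


# 'I' decodes to Q40, '#' to Q2, '!' to Q0, so the last two bases are dropped.
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGT", "IIIIII#!")
print(trimmed_seq, trimmed_qual)
```

Lowering `threshold` keeps more of each read, which is what "dialing back" on the trimming amounts to.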


Hi GenoMax, just wondering: is it necessary to account for GC content when doing differential expression analysis? I am asking based on what this article says.


In general, no, it's not. Because you are comparing the expression of gene X with gene X across samples, both sides of the comparison have the same GC content.

If you were to compare gene X (GC = 35%) with gene Y (GC = 76%), then that would matter, but that's outside the scope of differential expression analysis.
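For reference, GC content is a property of the sequence itself, which is why it is identical for the same gene in every sample. A minimal sketch of the calculation follows; the function name and the toy sequences are hypothetical and are not taken from any real gene.

```python
def gc_content(sequence):
    """Return the percentage of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc = sequence.count("G") + sequence.count("C")
    return 100.0 * gc / len(sequence)


# Toy sequences standing in for two different genes.
gene_x = "ATATGCATAT"   # AT-rich
gene_y = "GCGCGCATGC"   # GC-rich

print(gc_content(gene_x))
print(gc_content(gene_y))
```

The value for `gene_x` does not depend on which sample it was sequenced in, so it is constant within a gene-versus-itself comparison.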


Thank you for your answer Wouter :)


I guess it depends on whether there is a GC bias between the different sequenced samples.
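A toy calculation can illustrate both points of view. It assumes a simple multiplicative bias model (observed count = true count × a GC-dependent factor), which is an illustration only and not something stated in this thread. All numbers are made up.

```python
def observed_count(true_count, gc_bias_factor):
    """Toy model: the GC effect scales the true count by a constant factor."""
    return true_count * gc_bias_factor


def fold_change(count_treated, count_control):
    """Ratio of counts for the same gene in two samples."""
    return count_treated / count_control


true_control, true_treated = 100, 200   # true fold change is 2.0

# Case 1: the gene's GC bias factor is the same in both samples, so it cancels.
same = fold_change(observed_count(true_treated, 0.5),
                   observed_count(true_control, 0.5))

# Case 2: the bias factor differs between samples, so it no longer cancels.
different = fold_change(observed_count(true_treated, 0.25),
                        observed_count(true_control, 0.5))

print(same, different)
```

Under this model, the within-gene comparison is only protected from GC effects when the bias behaves the same way in every sample.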
