What GC content should I expect?
8 months ago
BioinfGuru ★ 2.1k

Hi all,

I am analysing a large pig RNA-seq dataset of over 200 samples across 5 tissues. According to NCBI, the GC content of the pig genome is 42%. I understand that RNA-seq data should return a higher GC content than whole-genome data.

During testing of the pipeline on a few random samples, the FastQC GC content graphs showed a normal distribution peaking around 50%, with 1 sample shifted slightly left to around 47%. The only over-represented sequences reported were adapters.

So I have 2 questions:

1) What should I expect the RNA-seq GC content to be? My guess is ~47% (genome GC + 5%). Should I be concerned if all samples return a GC content of 50%?

2) If a small number of samples show a GC content a few percent lower than the rest, should they be removed from the analysis? How should that be handled? Or is a difference of a few percent nothing to be concerned about?
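
For reference, the comparison behind question 2 could be sketched roughly as below. This is a minimal Python sketch, not part of the actual pipeline: the function names, the toy reads, and the 5-point `max_shift` cutoff are all assumptions made up for illustration, and the cutoff in particular is arbitrary rather than a recommended value.

```python
from statistics import median


def gc_percent(seq):
    """Percent of G or C among the unambiguous bases (A, C, G, T) of one read."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Reads made only of ambiguous bases (e.g. all N) are reported as 0.0
    return 100.0 * gc / total if total else 0.0


def mean_gc(reads):
    """Mean per-read GC percent over a list of read sequences."""
    values = [gc_percent(read) for read in reads]
    return sum(values) / len(values) if values else 0.0


def flag_gc_outliers(sample_gc, max_shift=5.0):
    """Names of samples whose mean GC% is more than max_shift points from the median.

    max_shift is a hypothetical cutoff chosen for this example only.
    """
    centre = median(sample_gc.values())
    return sorted(
        name for name, gc in sample_gc.items() if abs(gc - centre) > max_shift
    )


# Toy reads for illustration only, not real data
samples = {
    "sample_a": ["GGCCAATT", "GCGCATAT"],
    "sample_b": ["GGCCAATT", "GGCCATAT"],
    "sample_c": ["ATATATAT", "ATATGCAT"],
}
sample_gc = {name: mean_gc(reads) for name, reads in samples.items()}
outliers = flag_gc_outliers(sample_gc)
```

Comparing each sample against the median of all samples, rather than against the genome-wide 42%, keeps the check relative to the dataset itself.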

Thanks in advance,

Kenneth

RNA-seq GC-content • 444 views

My recommendation is to ignore meaningless metrics such as GC content and focus on relevant QC: the mapping rate and how the samples look downstream, e.g. in a PCA to assess group separation.
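
As a rough illustration of the mapping-rate check, here is a minimal Python sketch. The read counts are invented, the function names are hypothetical, and the 70% `min_rate` cutoff is only a placeholder assumption; a sensible value depends on the organism, library prep, and aligner.

```python
def mapping_rate(mapped, total):
    """Percent of reads that mapped; 0.0 if the sample has no reads."""
    return 100.0 * mapped / total if total else 0.0


def low_mapping_samples(counts, min_rate=70.0):
    """Names of samples whose mapping rate falls below min_rate.

    counts maps sample name -> (mapped_reads, total_reads).
    min_rate is a placeholder cutoff, not a recommended value.
    """
    return sorted(
        name
        for name, (mapped, total) in counts.items()
        if mapping_rate(mapped, total) < min_rate
    )


# Invented counts for illustration only
counts = {
    "sample_a": (9_000, 10_000),
    "sample_b": (8_500, 10_000),
    "sample_c": (4_000, 10_000),
}
rates = {name: mapping_rate(m, t) for name, (m, t) in counts.items()}
flagged = low_mapping_samples(counts)
```

In practice the mapped and total counts would come from the aligner's own summary output rather than being typed in by hand.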


Appreciate it, thanks. This did feel rather pedantic.
