Very high GC content in RNA-seq
7.7 years ago
gsuarezm ▴ 10

Hi all, this is my first time posting, and I have more experience with the wet-lab side of things, so please let me know how I can improve this post. I have checked the forum archives for an answer but have not found one.

We are trying to sequence cell-specific RNA from mouse spleen and compare it to whole spleen. Because these samples are low in amount (50 ng total) and quality (RIN 3.5), we sent one of each (cell-specific and whole spleen) for pilot library construction and a 150 bp paired-end MiSeq run. We do not expect the data quality to be great, but we are trying to determine whether it is good enough to justify deeper sequencing. FastQC showed decent read quality but very high GC content throughout the reads (60-80%), which is most pronounced at the beginning of the read. T in particular is depleted throughout. There are also several overrepresented sequences, and the overall mapping rate with TopHat was only 45%.

My questions are:

1. What is the likely cause of the high GC content?
2. Is it worth sequencing deeper?

Thank you,

RNA-Seq rna-seq • 5.0k views

Please use a different mapper than TopHat.

HISAT2 (if you want to stay with the same developers), STAR, and BBMap are all current options. For the reads that don't align (AFTER you scan/trim the data for adapters), take a sample and BLAST it at NCBI to make sure you don't have unforeseen contamination.
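As a rough illustration of the "take a sample and BLAST" step, here is a minimal Python sketch that draws a uniform random sample of reads from a FASTQ stream and writes them out as FASTA, which can be pasted into the NCBI BLAST web form. It assumes plain 4-line FASTQ records; the function names (`read_fastq`, `reservoir_sample`, `to_fasta`) and the toy reads are made up for this example and do not come from any particular tool.

```python
import io
import random


def read_fastq(handle):
    """Yield (read_name, sequence) from a 4-line-per-record FASTQ stream.

    Assumption: no wrapped sequence lines; parsing stops at the first
    blank header line or at end of file.
    """
    while True:
        header = handle.readline()
        if not header.strip():
            break
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality string (not needed for BLAST)
        yield header.strip()[1:].split()[0], seq


def reservoir_sample(records, k, seed=0):
    """Uniform random sample of up to k items from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            # Keep the new record with probability k / (i + 1).
            j = rng.randint(0, i)
            if j < k:
                sample[j] = rec
    return sample


def to_fasta(records):
    """Format (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Toy in-memory FASTQ standing in for a file of unaligned reads.
demo = io.StringIO(
    "@r1\nACGT\n+\nIIII\n"
    "@r2\nGGCC\n+\nIIII\n"
    "@r3\nTTAA\n+\nIIII\n"
)
picked = reservoir_sample(read_fastq(demo), k=2)
print(to_fasta(picked))
```

For a real run you would pass an open file handle for the unaligned reads instead of the `io.StringIO` object. A sample of a few dozen to a few hundred reads is usually enough to see whether one organism or rRNA dominates the hits.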


Hi genomax2, thanks for your suggestions; we will use a different mapper. However, the really high GC content remains. I am trying to determine whether the problem lies with the sample, the library construction, or the sequencing. Any ideas?

7.7 years ago
Ben ▴ 60

First you should trim the adaptors, then check the read quality again. Maybe you will find something different.


Thanks for the comment, Ben. We did trim the adaptors, but the high GC content was present throughout the read.
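To put a number on "throughout the read", one option is to compute the GC fraction at each read position after trimming, which is roughly what the FastQC per-base content plot summarizes. The sketch below is only an illustration: `gc_by_position` is a hypothetical helper, and the four short reads are invented.

```python
def gc_by_position(seqs):
    """Return the GC fraction at each read position across all reads.

    N (or any non-ACGT character) is ignored; reads may differ in length.
    """
    gc = []
    total = []
    for seq in seqs:
        for i, base in enumerate(seq.upper()):
            if i == len(gc):
                # First time we see this position: extend both counters.
                gc.append(0)
                total.append(0)
            if base in "ACGT":
                total[i] += 1
                if base in "GC":
                    gc[i] += 1
    return [g / t if t else 0.0 for g, t in zip(gc, total)]


# Invented toy reads: GC-rich at the first position, AT-only at the second.
toy_reads = ["GA", "CA", "GT", "AT"]
for pos, frac in enumerate(gc_by_position(toy_reads), start=1):
    print(f"position {pos}: GC = {frac:.2f}")
```

A profile that is flat and high along the whole read points away from leftover adapter sequence, which would normally show up at one end only.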


Do you see the high GC content in the 45% of reads that mapped, or in all the reads? If it is in the unmapped reads, then it probably comes from contamination.
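A minimal sketch of that comparison in Python, assuming the mapped and unmapped read sequences have already been separated into two collections (how you split them depends on your aligner and tools). The helper names `gc_fraction` and `mean_gc` and the toy sequences are invented for illustration.

```python
def gc_fraction(seq):
    """GC fraction of one read, ignoring N and other non-ACGT characters."""
    seq = seq.upper()
    called = sum(seq.count(b) for b in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def mean_gc(seqs):
    """Mean per-read GC fraction over a collection of sequences."""
    values = [gc_fraction(s) for s in seqs]
    return sum(values) / len(values) if values else 0.0


# Invented toy data standing in for the two groups of reads.
mapped = ["ATGCATTA", "TTGACATA"]
unmapped = ["GGCGCCGA", "CCGGGCGC"]

print(f"mapped reads:   mean GC = {mean_gc(mapped):.2f}")
print(f"unmapped reads: mean GC = {mean_gc(unmapped):.2f}")
```

If the mapped reads sit near the GC content you would expect for mouse transcripts while the unmapped reads are much higher, that supports contamination or an artifact rather than a genuine property of the sample.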
