Hi all, this is my first time posting, and I have more experience with the wet-lab side of things, so please let me know how I can improve this post. I have checked the forum archives for an answer but have not found one.
We are trying to sequence cell-specific RNA from mouse spleen and compare it to whole spleen. Because these samples are low in both amount (50 ng total) and quality (RIN 3.5), we sent one of each (cell-specific and whole spleen) for pilot library construction and a 150 bp paired-end MiSeq run. We do not expect the data quality to be great, but we are trying to determine whether it is good enough to justify deeper sequencing.

FastQC showed decent read quality but very high GC content throughout (60-80%), which is most pronounced at the beginning of the reads. T is particularly depleted throughout. There are also several overrepresented sequences, and the overall read mapping rate with TopHat is only 45%.

My questions are:
1. What is the likely cause of the high GC content?
2. Is it worth sequencing deeper?
Thank you,
Please use a different mapper than TopHat.
HISAT2 (if you want to stay with the same developers), STAR, and BBMap are all current options. For the reads that don't appear to align (AFTER you scan/trim the data for adapters), take a sample and BLAST it at NCBI to make sure you don't have unforeseen contamination.
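The BLAST spot-check can be sketched roughly as follows. This is a minimal Python sketch, not part of any aligner: it assumes the unaligned reads have already been written to a FASTQ file with four lines per record, and the function names (`read_fastq`, `sample_reads`, `to_fasta`) and the file name in the usage note are made up for illustration. It draws a uniform random subset with reservoir sampling and formats it as FASTA, which can be pasted into the NCBI BLAST web form.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (read name, sequence)


def read_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (name, sequence) pairs from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between or after records
        seq = next(it, "").rstrip("\n")
        next(it, "")  # '+' separator line, ignored
        next(it, "")  # quality line, not needed for BLAST
        # Drop the leading '@' and anything after the first whitespace.
        yield header[1:].split()[0], seq


def sample_reads(records: Iterable[Record], n: int, seed: int = 0) -> List[Record]:
    """Reservoir-sample up to n records without loading the whole file."""
    rng = random.Random(seed)
    reservoir: List[Record] = []
    for i, rec in enumerate(records):
        if i < n:
            reservoir.append(rec)
        else:
            # Replace an existing entry with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                reservoir[j] = rec
    return reservoir


def to_fasta(records: Iterable[Record]) -> str:
    """Format records as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)
```

For example, with a hypothetical file `unmapped.fastq`, `to_fasta(sample_reads(read_fastq(open("unmapped.fastq")), 20))` would give 20 reads to paste into BLAST. A few dozen reads are usually enough to see whether one organism or sequence type dominates the hits.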
Hi genomax2, thanks for your suggestions; we will use a different mapper. However, the really high GC content remains. I am trying to determine whether the problem is at the sample, library construction, or sequencing level. Any ideas?
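One possible way to look at this more closely is to compute base composition separately for subsets of reads (for instance mapped versus unmapped reads, or the overrepresented sequences versus everything else) and see where the GC excess and the T depletion are concentrated. The sketch below is only an illustration in Python of the kind of per-position summary FastQC reports; the function names are made up, and it takes plain sequence strings as input rather than reading any particular file format.

```python
from collections import Counter
from typing import Dict, Iterable, List


def per_position_composition(seqs: Iterable[str], bases: str = "ACGT") -> List[Dict[str, float]]:
    """Return, for each read position, the fraction of each base.

    Bases outside `bases` (e.g. N) are counted but excluded from the
    denominator, so the fractions at each position sum to 1 when any
    called base is present.
    """
    counts: List[Counter] = []
    for seq in seqs:
        seq = seq.upper()
        # Grow the per-position table to fit the longest read seen so far.
        while len(counts) < len(seq):
            counts.append(Counter())
        for i, base in enumerate(seq):
            counts[i][base] += 1

    result: List[Dict[str, float]] = []
    for c in counts:
        total = sum(c[b] for b in bases)
        result.append({b: (c[b] / total if total else 0.0) for b in bases})
    return result


def gc_fraction(seq: str) -> float:
    """GC fraction of one read, ignoring anything that is not A, C, G or T."""
    seq = seq.upper()
    called = sum(seq.count(b) for b in "ACGT")
    return (seq.count("G") + seq.count("C")) / called if called else 0.0
```

If the skew turns out to be confined to the unmapped or overrepresented reads, that would suggest a specific contaminating or artifactual sequence rather than a shift across the whole library, though this summary alone cannot say at which step it arose.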