Question

GC content in DNA sequence

6.9 years ago

jaafari.omid ▴ 80

Hello all. I am working on GBS data. I have trimmed my data, but for some samples FastQC reports an error for %GC content. For example, the GC content was 46% before trimming and rose to 47% after trimming. Before trimming FastQC did not flag an error, even though the %GC was lower than it is after trimming. What should the GC content be for DNA sequences such as GBS data? Thanks in advance. Best regards, Omid

genome • 2.5k views

ADD COMMENT • link 6.9 years ago by jaafari.omid ▴ 80

So the conclusion is that your adapter was rich in GC? :D %GC depends on your target and your organism.

ADD REPLY • link 6.9 years ago by Titus ▴ 910

Don't be too worried about FastQC errors; they often don't make sense. The difference you see in GC content before and after trimming is very small. You also shouldn't expect the percentage to be exactly 50%, as it depends on the genome.
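To see how small a one-point shift in mean GC is, it can help to compute the per-read GC percentage yourself before and after trimming. Below is a minimal Python sketch, assuming plain four-line FASTQ records; the function names and the toy reads are made up for illustration. As far as I understand the FastQC documentation, the per-sequence GC module judges the shape of the per-read GC distribution against a fitted normal curve rather than the mean, so the mean alone may not explain the flag.

```python
import io
from statistics import mean


def gc_fraction(seq):
    """Fraction of G/C among unambiguous A/C/G/T bases (0.0 if none)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def read_fastq_sequences(handle):
    """Yield the sequence line of each four-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def gc_summary(handle):
    """Return (mean per-read GC %, list of per-read GC % values)."""
    per_read = [100.0 * gc_fraction(s) for s in read_fastq_sequences(handle)]
    return (mean(per_read) if per_read else 0.0), per_read


# Toy input standing in for a real FASTQ file (hypothetical reads).
toy = io.StringIO(
    "@r1\nGGCC\n+\nIIII\n"
    "@r2\nATAT\n+\nIIII\n"
    "@r3\nACGT\n+\nIIII\n"
)
mean_gc, per_read = gc_summary(toy)
print(mean_gc, per_read)
```

For a real file, pass `open("sample.fastq")` (a placeholder name) instead of the toy handle, run it on the raw and the trimmed reads, and compare the two lists of per-read values rather than only the two means.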

ADD REPLY • link 6.9 years ago by Martombo ★ 3.1k

Thank you all for your answers to my question.

ADD REPLY • link 6.9 years ago by jaafari.omid ▴ 80

Could you also give me some guidance about k-mer content? As far as I know, k-mer content is not important for RNA-seq data, but I don't know how important it is for DNA sequences like GBS. I trimmed the raw data using the Stacks pipeline, but FastQC still flags k-mer content afterwards.
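One way to look into this is to count which k-mers pile up at which read positions. The Python sketch below is only an illustration: the function names, the threshold, and the toy reads are assumptions, and the idea that GBS reads share a common start (for example a restriction cut-site remnant left after barcode removal) is a hypothesis to check against your own data, not a statement about what Stacks does.

```python
from collections import Counter, defaultdict


def kmer_counts_by_position(reads, k=5):
    """Map each read position to a Counter of the k-mers starting there."""
    counts = defaultdict(Counter)
    for read in reads:
        read = read.upper()
        for pos in range(len(read) - k + 1):
            counts[pos][read[pos:pos + k]] += 1
    return counts


def start_enriched_kmers(reads, k=5, min_fraction=0.5):
    """Return k-mers that begin at least min_fraction of the reads."""
    reads = list(reads)
    if not reads:
        return []
    first = kmer_counts_by_position(reads, k)[0]
    return sorted(
        kmer for kmer, n in first.items() if n / len(reads) >= min_fraction
    )


# Hypothetical reads: three share the same first five bases.
toy_reads = ["TGCAGAAGT", "TGCAGCCTA", "TGCAGTTGA", "ACGTTGCAA"]
print(start_enriched_kmers(toy_reads))  # -> ['TGCAG']
```

If the enriched k-mers all sit at the first few positions and match the cut site of the enzyme used for the library, the k-mer flag probably reflects the library design rather than a problem with the trimming.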

ADD REPLY • link 6.9 years ago by jaafari.omid ▴ 80

Please use ADD COMMENT or ADD REPLY to respond to previous posts, so that the thread remains logically structured and easy to follow. I have now moved your post, but as you can see it's not optimal. Adding an answer should only be used for providing a solution to the question asked.

ADD REPLY • link 6.9 years ago by WouterDeCoster 47k

Actually, no one really answered the question; people only suggested not treating it as a big deal. I am having the same problem. The sequence "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC" appeared in the overrepresented sequences and was flagged as a possible adapter, but when I trimmed it the quality report just got worse, as also happened to jaafari.omid. Also, if it's an adapter, why does the "adapter content" plot show that there's no adapter present? I also BLASTed the sequence and it has a 93.5% match with Staphylococcus phage Andhra (I leave this info here in case it helps), which might mean that the sample has been contaminated with that phage. On the other hand, this sequence does appear in adapter catalogs (TruSeq adapter). All of this is a bit confusing, so if someone has a well-founded explanation I would appreciate it. Thanks!
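A quick string comparison can at least narrow this down. In the Python sketch below, the adapter stem is the TruSeq indexed adapter prefix as it appears in common adapter catalogs, and the 12-base probe is what I believe FastQC's Adapter Content module searches for by default; both are assumptions to verify against your own adapter list and FastQC configuration, and the function name is made up.

```python
# Overrepresented sequence quoted in the post above.
OVERREPRESENTED = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGC"

# Assumption: TruSeq indexed adapter prefix, as listed in adapter catalogs.
TRUSEQ_INDEXED_STEM = "GATCGGAAGAGCACACGTCTGAACTCCAGTCAC"

# Assumption: 12-base probe used by FastQC's Adapter Content module.
ASSUMED_FASTQC_PROBE = "AGATCGGAAGAG"


def describe_match(read_seq, adapter_stem, probe):
    """Report how a sequence relates to an adapter stem and a search probe."""
    return {
        "starts_with_adapter_stem": read_seq.startswith(adapter_stem),
        "contains_probe": probe in read_seq,
        "contains_probe_without_first_base": probe[1:] in read_seq,
    }


result = describe_match(
    OVERREPRESENTED, TRUSEQ_INDEXED_STEM, ASSUMED_FASTQC_PROBE
)
print(result)
```

With these inputs the sequence starts with the adapter stem but lacks the leading A of the probe, so an exact search for the full probe would miss it. If the probe assumption holds, that is one possible reason for an overrepresented adapter sequence not showing up in the adapter content plot; it does not by itself explain why the report got worse after trimming.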

ADD REPLY • link 5.8 years ago by msimmer92 ▴ 310

Hi, it should be an adapter if it appears in the adapter list :) I don't really understand what you mean by "the quality report just got worse". Best

ADD REPLY • link 5.8 years ago by Titus ▴ 910