Strange overrepresented sequence
1 day ago
georomano • 0

Hi everyone, I recently ran FastQC after adapter trimming with TrimGalore and found a strange overrepresented sequence in read 2 for most of my samples: 'GTAAAAGGTAGCAATAGCTTTAAGCCAAGAAATTGTTCTCAGAAATGGCT'

Overrepresented sequence in fastQC analysis

Has anyone come across it before, or does anyone know what it corresponds to?

Background info: I used the following parameters for trim_galore: ' --cores 4 --trim-n --length 36 --paired --retain_unpaired '

Thank you!

fastQC adapter rnaseq • 229 views

Hi, could you please provide further information on the type of data your reads originate from? Is it RNA-Seq or DNA-Seq, and which species/genus does it come from?

I just BLASTed it and found a Nicotiana attenuata chloroplast-based predicted protein. Would that fit your data?


It would, thanks for your input!

1 day ago
GenoMax 150k

It is 0.1% of the total sequences, so it is likely nothing you need to worry about at this point.
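
For anyone who wants to double-check a percentage like this outside FastQC, here is a minimal Python sketch that counts the reads starting with the reported sequence. It assumes a plain four-lines-per-record FASTQ. The function name `fraction_with_prefix` and the tiny `demo` records are made up for illustration and are not part of FastQC or TrimGalore. FastQC computes its own figure somewhat differently, so the number need not match exactly.

```python
from itertools import islice

SEQ = "GTAAAAGGTAGCAATAGCTTTAAGCCAAGAAATTGTTCTCAGAAATGGCT"


def fraction_with_prefix(fastq_lines, prefix):
    """Return (matching, total) for reads whose sequence starts with prefix.

    fastq_lines is any iterable of text lines (an open file or a list).
    Assumes four lines per record, with the sequence on the second line.
    """
    matching = total = 0
    # Take lines 2, 6, 10, ... which hold the read sequences.
    for seq in islice(fastq_lines, 1, None, 4):
        total += 1
        if seq.strip().upper().startswith(prefix):
            matching += 1
    return matching, total


# Made-up records standing in for a real R2 file (hypothetical data).
demo = [
    "@read1", SEQ + "ACGT", "+", "I" * 54,
    "@read2", "ACGTACGTACGT", "+", "I" * 12,
]

matching, total = fraction_with_prefix(demo, SEQ)
print(f"{matching} of {total} reads ({100 * matching / total:.1f}%)")
```

For a real gzipped file, something like `fraction_with_prefix(gzip.open("sample_R2.fastq.gz", "rt"), SEQ)` should work after `import gzip`; the filename here is a placeholder.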


Seconding this. Please don't overinterpret these FastQC results. It's really just a crude QC check. Go ahead with your analysis.


I did see that one. I do wonder why these seqs would be over-represented in R2 and not R1... Anyhow, as all of you mention, I will go ahead, now that I know where they come from. I thought they were some sort of weird adapter-derived sequence, but they actually come from the chloroplast...
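
One way to explore the R1 versus R2 question would be to look for the reverse complement of the sequence in the R1 reads, since the two mates of a pair are read from opposite strands. The sketch below is only an illustration of that check, not an explanation of the result: `reverse_complement`, `count_containing`, and the `r1_reads` list are hypothetical names and made-up data.

```python
SEQ = "GTAAAAGGTAGCAATAGCTTTAAGCCAAGAAATTGTTCTCAGAAATGGCT"

# Translation table for complementing bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_containing(seqs, motif):
    """Count how many sequences contain motif anywhere, ignoring case."""
    return sum(motif in s.upper() for s in seqs)


# Made-up R1 sequences (hypothetical data): the first one carries the
# reverse complement of SEQ after a short prefix.
r1_reads = ["TTT" + reverse_complement(SEQ), "ACGTACGTACGT"]

forward_hits = count_containing(r1_reads, SEQ)
revcomp_hits = count_containing(r1_reads, reverse_complement(SEQ))
print("forward hits in R1:", forward_hits)
print("reverse-complement hits in R1:", revcomp_hits)
```

With real data, the sequences would come from the second line of every four-line FASTQ record in the R1 file.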
