Overrepresented sequences in GBS data
2.2 years ago
S ▴ 10

I ran Trim Galore! on my GBS data, and the FastQC report still lists ~30 overrepresented sequences. Each sequence starts with the restriction enzyme's cut site (PstI; TGCA). No possible source is identified for any of them, and the percentages range from 10-77% in the R1 and R2 outputs. Is this to be expected, or is it something I should be concerned about? Thanks.

UPDATE: I just realized that FastQC already reports these values as percentages (out of 100), not as fractions of 1, so 0.77 means 0.77%, i.e. less than 1% of reads. That makes these results seem more normal.
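To make the unit conversion concrete, here is a minimal sketch of the arithmetic. The helper names and the total read count of 1,000,000 are made up for illustration; they do not come from the report in question.

```python
def percent_to_fraction(percent_value):
    """Convert a value from FastQC's Percentage column to a fraction of reads."""
    return percent_value / 100.0


def approx_read_count(percent_value, total_reads):
    """Estimate how many reads a given percentage corresponds to."""
    return round(total_reads * percent_value / 100.0)


# Hypothetical numbers: 0.77 in the Percentage column, 1,000,000 total reads.
fraction = percent_to_fraction(0.77)
reads = approx_read_count(0.77, 1_000_000)
print(f"0.77% of reads is a fraction of {fraction:.4f}, roughly {reads} reads")
```

Read this way, a value of 0.77 is a small minority of reads rather than 77% of the library.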

sequences genotype-by-sequencing overrepresented fastqc GBS • 757 views

Are these unrecognized primers?

Since your method uses restriction enzymes, it is not unusual for reads to begin with the recognition site. This looks like what you would expect from a restriction digest.
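If you want to check this beyond eyeballing the table, a rough sketch along these lines may help. It assumes the tab-separated layout of the "Overrepresented sequences" module in FastQC's fastqc_data.txt (Sequence, Count, Percentage, Possible Source); the function names and the example rows are invented for illustration and are not from the original report.

```python
def parse_overrepresented(fastqc_text):
    """Return (sequence, count, percentage, source) tuples from fastqc_data.txt text."""
    rows = []
    in_module = False
    for line in fastqc_text.splitlines():
        if line.startswith(">>Overrepresented sequences"):
            in_module = True
            continue
        if in_module and line.startswith(">>END_MODULE"):
            break
        # Skip everything outside the module, the '#' header line, and blank lines.
        if not in_module or line.startswith("#") or not line.strip():
            continue
        fields = line.split("\t")
        source = fields[3] if len(fields) > 3 else ""
        rows.append((fields[0], int(float(fields[1])), float(fields[2]), source))
    return rows


def fraction_with_prefix(rows, prefix="TGCA"):
    """Fraction of overrepresented sequences that start with the given prefix."""
    if not rows:
        return 0.0
    hits = sum(1 for row in rows if row[0].startswith(prefix))
    return hits / len(rows)


# Made-up example data, only to show the expected input shape.
EXAMPLE = "\n".join([
    ">>Overrepresented sequences\twarn",
    "#Sequence\tCount\tPercentage\tPossible Source",
    "TGCAGAAACTTGGCATCC\t7700\t0.77\tNo Hit",
    "TGCAGCTTTGACCGATGA\t1000\t0.10\tNo Hit",
    "AGATCGGAAGAGCACACG\t1500\t0.15\tNo Hit",
    ">>END_MODULE",
])

rows = parse_overrepresented(EXAMPLE)
print(f"{fraction_with_prefix(rows):.2f} of listed sequences start with TGCA")
```

If nearly all of the listed sequences start with the cut-site remnant and each accounts for well under 1% of reads, that fits the restriction-digest explanation; sequences without the prefix are the ones worth checking against adapter or primer sequences.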
