Hey everyone
I need some help interpreting the sequencing data I have!
I generated a ddRADseq library for over 200 samples and sent it for paired-end sequencing. When I got the data back, I ran FastQC to generate a report for each sample and then compiled everything with MultiQC.
While inspecting the data I came across the following GC plot, which concerned me a bit because there is a clear shift in the GC distribution. On closer inspection, all the flagged files (yellow) turned out to be reverse reads, while the forward reads for the same samples were fine.
I'm wondering whether this is something to be concerned about and whether trimming is necessary, or whether it is simply due to the nature of ddRADseq data (high read duplication).
Please let me know what you guys think! Any guidance is appreciated.
Lemonhope
I would say go on with the analysis, and come back to investigate further if you find something odd in the results. The shift could easily be caused by the enzyme cut sites, since in a double digest the forward and reverse reads start at different restriction sites, or by something similar.
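If you want a quick sanity check before moving on, here is a rough Python sketch for comparing the per-read GC distribution and the most common leading bases between the forward and reverse files of one sample. It is only an illustration, not what FastQC does internally: the function names and the 5% bin width are my own choices, and it assumes plain 4-line FASTQ records (optionally gzipped) with no wrapped sequence lines.

```python
import gzip
from collections import Counter
from itertools import islice


def read_sequences(path, limit=100_000):
    """Yield up to `limit` sequences from a FASTQ file (plain or .gz).

    Assumes strict 4-line records, so the sequence is every 4th line
    starting from the 2nd line of the file.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in islice(handle, 1, limit * 4, 4):
            yield line.strip()


def gc_percent(seq):
    """GC percentage of one read, ignoring N and other non-ACGT calls."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / called


def gc_histogram(seqs, bin_width=5):
    """Count reads per GC bin; keys are the lower edge of each bin."""
    counts = Counter()
    for seq in seqs:
        lower = int(gc_percent(seq) // bin_width) * bin_width
        # Put reads with exactly 100% GC into the last bin.
        counts[min(lower, 100 - bin_width)] += 1
    return dict(sorted(counts.items()))


def leading_kmers(seqs, k=6, top=3):
    """Most common first-k bases, e.g. to spot a cut-site remnant."""
    return Counter(s[:k] for s in seqs if len(s) >= k).most_common(top)
```

You could run `gc_histogram(read_sequences("sample_R1.fastq.gz"))` and the same for the matching R2 file (both file names are placeholders), then look at `leading_kmers` for each. If nearly all reverse reads start with the same few bases, that would be consistent with the cut-site explanation rather than a contamination problem.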