Hi,
I'm currently performing RNA-seq on bacterial samples (Pseudomonas aeruginosa) and I'm getting the per-sequence GC content curve shown below for around 10 of 30 samples. The other samples (not shown here) all gravitate around 65% GC, but in the affected samples there is a bump around the 55% mark, larger in some samples than in others, and I'm struggling to determine what it is. The samples:
- Are of great quality (also pictured below)
- Have been trimmed with fastp, have a minimum length of 75 bp and no apparent adapter contamination (according to FastQC anyway)
- Have a higher proportion of sequences at duplication levels above ~500
- All map to the Pseudomonas aeruginosa reference (a self-assembly by the group I'm doing this work for) at >=97%, including those with the questionable GC content curve
Does anyone know what it could be?
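In case it helps anyone look at the reads under the bump directly, here is a minimal Python sketch of how the per-sequence GC distribution could be recomputed and the reads near 55% GC pulled out for inspection. It assumes an uncompressed, four-lines-per-record FASTQ; the function names and the 50-60% window are my own illustrative choices, and the binning will not necessarily match what FastQC or fastp report.

```python
import io
from collections import Counter


def read_fastq(handle):
    """Yield (header, sequence) pairs from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # '+' separator line
        handle.readline()  # quality line
        yield header, seq


def gc_percent(seq):
    """GC content of one read as a percentage (0.0 for an empty read)."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_histogram(records):
    """Count reads per GC percentage, rounded to the nearest integer."""
    return Counter(round(gc_percent(seq)) for _, seq in records)


def reads_in_window(records, low=50.0, high=60.0):
    """Return reads whose GC% lies in [low, high] (assumed window around the bump)."""
    return [(h, s) for h, s in records if low <= gc_percent(s) <= high]


# Tiny in-memory demo; a real run would open a trimmed FASTQ file instead.
demo = io.StringIO("@r1\nGGCC\n+\nIIII\n@r2\nGCAT\n+\nIIII\n@r3\nAATT\n+\nIIII\n")
demo_records = list(read_fastq(demo))
print(sorted(gc_histogram(demo_records).items()))
print(reads_in_window(demo_records))
```

The reads returned by `reads_in_window` could then be checked by hand to see which sequences they correspond to. For gzip-compressed files, the handle would need to come from `gzip.open(path, "rt")` rather than `open(path)`.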
Good samples GC%
Bad samples GC%
Good samples sequence dup
Bad samples sequence dup
Good samples fastp GC%
Bad samples fastp GC%
Good samples quality
Bad samples quality
Thanks,
Matt
That's a very convincing answer, thanks very much!