Wobbly average GC profile spanning the entire R1 read length: what is the cause, and is it bad?
3.4 years ago
William ★ 5.3k

The average GC profile is wobbly across the entire R1 read length. What is the cause, and is it bad?

See this MultiQC summary plot of average GC content across many samples. fastp was used to compute the average GC content per sample.

What you would expect is a flat line: roughly the same average GC content at every base position of the read.

Also interesting: the R1 GC profile stays wobbly after fastp trimming (adapters and low-quality bases), whereas the R2 GC profile is higher (as is R1) but stable.

Each line in the plot represents one sample (out of many hundreds) and is based on the average GC content across the many millions of sequencing reads of that sample.
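
For readers unfamiliar with this kind of plot, here is a minimal sketch of how a per-position average GC profile can be computed from the reads of one sample. It is only an illustration of the idea, not fastp's actual implementation; the function name `gc_profile` and the choice to leave N bases out of the denominator are assumptions.

```python
from typing import Iterable, List


def gc_profile(reads: Iterable[str]) -> List[float]:
    """Fraction of G/C bases at each read position.

    Assumption: N bases are ignored, so the denominator at each position
    counts only A, C, G and T calls from reads that reach that position.
    """
    gc_counts: List[int] = []
    totals: List[int] = []
    for read in reads:
        for i, base in enumerate(read.upper()):
            # Grow the per-position counters when a longer read shows up.
            if i == len(totals):
                totals.append(0)
                gc_counts.append(0)
            if base == "N":
                continue
            totals[i] += 1
            if base in "GC":
                gc_counts[i] += 1
    # Positions with no called bases are reported as 0.0.
    return [g / t if t else 0.0 for g, t in zip(gc_counts, totals)]


if __name__ == "__main__":
    toy_reads = ["GCAT", "GGAT", "AC"]
    print(gc_profile(toy_reads))
```

With random fragments from one genome, each position should land close to the genome-wide GC fraction, which is why a flat line is the expectation.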

[Figure: R1 before filtering. Wobbly R1 average GC profile.]

[Figure: R1 after filtering. Wobbly R1 average GC profile.]

[Figure: R2 before filtering. Higher but stable R2 average GC profile.]

[Figure: R2 after filtering. Higher but stable R2 average GC profile.]

This QC Fail page mentions that GC bias at the beginning or end of a read is not a big deal, because those parts are just clipped during alignment.

https://sequencing.qcfail.com/articles/positional-sequence-bias-in-random-primed-libraries/

I could not find a QC Fail page about this kind of GC bias. Is this also because of adapters? That would require read-through lengths to be evenly distributed between 0 and 150 bp to produce this pattern over the entire read. Or did something go wrong with the sequencing (chemistry), or is it maybe contamination?
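
One way to test the adapter idea: adapter read-through only happens when the insert is shorter than the read, so after trimming those reads end up shorter than the full read length. The sketch below summarises trimmed read lengths for one sample. The function name `trimmed_length_summary` is hypothetical, and the 150 bp default is taken from the read length mentioned above.

```python
from collections import Counter
from typing import Iterable, Tuple


def trimmed_length_summary(
    reads: Iterable[str], full_length: int = 150
) -> Tuple[Counter, float]:
    """Return a read-length histogram and the fraction of shortened reads."""
    lengths = Counter(len(read) for read in reads)
    total = sum(lengths.values())
    # Reads shorter than the full cycle count were trimmed for some reason
    # (adapter, low quality, or both).
    shortened = sum(n for length, n in lengths.items() if length < full_length)
    return lengths, (shortened / total if total else 0.0)


if __name__ == "__main__":
    toy_reads = ["A" * 150, "A" * 150, "A" * 100, "A" * 40]
    histogram, fraction = trimmed_length_summary(toy_reads)
    print(sorted(histogram.items()), fraction)
```

If nearly all trimmed R1 reads are still full length, adapters are an unlikely explanation for a wobble that spans the whole read.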

In general, I know that alignment and variant calling against a reference genome are tolerant of bad input sequencing data. Bad reads get trimmed during alignment and/or skipped during variant calling.

So I am wondering whether this data can still be used for downstream analysis. I am also looking into downstream QC metrics for the samples with the wobbly GC profile (e.g. percentage mapped, coverage, percentage of proper pairs).
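
To pick out the wobbly samples for that comparison, one option is to score each sample by the spread of its per-position GC profile. This is a rough sketch, not an established QC metric: the names `wobble_score` and `flag_wobbly`, the use of the population standard deviation, and the threshold value are all assumptions that would need tuning on real data.

```python
from statistics import pstdev
from typing import Dict, List


def wobble_score(profile: List[float], skip_ends: int = 0) -> float:
    """Population standard deviation of a per-position GC profile.

    skip_ends drops that many positions from each end, since bias at the
    read ends is usually clipped during alignment anyway.
    """
    core = profile[skip_ends:len(profile) - skip_ends] if skip_ends else profile
    if len(core) < 2:
        return 0.0
    return pstdev(core)


def flag_wobbly(profiles: Dict[str, List[float]], threshold: float) -> List[str]:
    """Names of samples whose wobble score exceeds the (hypothetical) threshold."""
    return sorted(name for name, p in profiles.items() if wobble_score(p) > threshold)


if __name__ == "__main__":
    profiles = {
        "flat_sample": [0.40] * 10,
        "wobbly_sample": [0.35, 0.45] * 5,
    }
    print(flag_wobbly(profiles, threshold=0.01))
```

The flagged samples can then be compared with the rest on mapping rate, coverage and proper-pair percentage to see whether the wobble has any downstream effect.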

fastq qc • 987 views

Is this also because of adapters?

Isn't fastp removing them? Are you plotting billions of reads (multiple samples) in this profile or is this just one sample?


See updated post opening.
