FASTQC Per base sequence content failed WES
1
0
23 months ago
tanbiswas6 ▴ 10

Hi

I am analysing WES data, and FastQC reports a failure for "Per base sequence content". The data also shows some sequence duplication. Below is a snapshot of my data.

[Image: FastQC snapshot, not included]

Please let me know how to process this file.

Thank you.

DNA-seq QC WES FASTQC • 2.3k views
ADD COMMENT
0

You have no adenine at the 4th base position across all sequences. In general, the first 7 to 8 bases of each read are bad. If that's an option for you, just trim them off or ignore them.

ADD REPLY
0

Thanks for the suggestion. Can you please suggest how to remove the first 7-8 bases of each read without disturbing anything else in the file?

Thanks.

ADD REPLY
0

There is likely no need to do any processing at this point. If a problem with the data shows up in downstream analysis, you can come back and dig into this further. FastQC's pass/fail limits are designed for plain genomic sequencing. Depending on the kind of experiment, there may be "failures" on one or more tests. This does not automatically mean that the data has a problem or is bad.

While it is a bit odd to have a majority of T's at cycle 4, the data may still be fine.
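
If you want to see how strong the skew at cycle 4 is across every read in the file, a short script can tally the base composition per cycle. This is only a rough Python sketch, not part of FastQC: the function names (open_fastq, base_composition, print_composition) and the file name in the usage comment are made up for illustration, and it assumes standard 4-line FASTQ records with no wrapped sequences.

```python
import gzip
from collections import Counter
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def base_composition(path, n_cycles=10):
    """Return one Counter of base counts for each of the first n_cycles positions."""
    counts = [Counter() for _ in range(n_cycles)]
    with open_fastq(path) as handle:
        # The sequence is the 2nd line of every 4-line FASTQ record.
        for seq in islice(handle, 1, None, 4):
            for cycle, base in enumerate(seq.strip()[:n_cycles]):
                counts[cycle][base] += 1
    return counts


def print_composition(counts):
    """Print the fraction of A, C, G, T and N at each cycle (1-based)."""
    for cycle, counter in enumerate(counts, start=1):
        total = sum(counter.values()) or 1  # avoid division by zero on empty input
        fractions = "  ".join(f"{b}={counter[b] / total:.2f}" for b in "ACGTN")
        print(f"cycle {cycle:>2}: {fractions}")


# Hypothetical usage (file name is a placeholder):
# print_composition(base_composition("your.fastq.gz", n_cycles=10))
```

A cycle where one base sits near 0 or 1 in every read usually points to something systematic in the library prep or the read structure rather than to random sequencing error.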

ADD REPLY
0

Yes, that is exactly my concern. I know that other reads are fine, but if I use this file without removing those bases, won't there be some problem during data analysis or publishing?

ADD REPLY
1

Most likely not, but if you want to be absolutely safe you can trim away the first 7 bases of all reads; tools like seqtk can do that.

ADD REPLY
0

I know that other reads are fine

How do you know that? Since FastQC sub-samples your data (it does not look at every read in your file), you at least have enough reads with that pattern in the sample it takes.

You can use reformat.sh from the BBMap suite to trim the first 7-8 bases (7 in this example) like so:

reformat.sh -Xmx2g in=your.fastq.gz out=trimmed.fastq.gz forcetrimleft=7
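
If you would rather see exactly what such a left trim does to each record, here is a rough Python sketch of the same idea. It is not how reformat.sh or seqtk are implemented; the function names (open_fastq, trim_left) and the file names in the usage comment are placeholders, and it assumes 4-line FASTQ records. Note that the quality string has to be trimmed by the same amount as the sequence.

```python
import gzip


def open_fastq(path, mode="rt"):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, mode) if str(path).endswith(".gz") else open(path, mode)


def trim_left(in_path, out_path, n=7):
    """Drop the first n bases and quality scores of every read; return the read count."""
    n_reads = 0
    with open_fastq(in_path) as src, open_fastq(out_path, "wt") as dst:
        while True:
            header = src.readline()
            if not header:  # end of file
                break
            seq = src.readline().rstrip("\r\n")
            plus = src.readline()
            qual = src.readline().rstrip("\r\n")
            dst.write(header)
            dst.write(seq[n:] + "\n")   # sequence without its first n bases
            dst.write(plus)
            dst.write(qual[n:] + "\n")  # matching quality scores
            n_reads += 1
    return n_reads


# Hypothetical usage (file names are placeholders):
# trim_left("your.fastq.gz", "trimmed.fastq.gz", n=7)
```

Reads shorter than n bases come out empty with this sketch, so a length filter may be needed afterwards. For paired-end data, the same trim would have to be applied to both files so the mates stay in sync.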
ADD REPLY
1
23 months ago
Prash ▴ 280

As this is WES, you are less likely to get deleterious/pathogenic variants from this, assuming that your sequences still carry adapters (not trimmed). You can use fastp to trim your adapters automatically; otherwise, please find out the chemistry from your service provider, trim the reads accordingly, and go ahead. - Prash

ADD COMMENT
