FastQC quality
23 months ago
bestone ▴ 30

[Image: FastQC report for the sample before trimming]

quality trimmomatic FastQC FASTQC • 1.8k views

Hello, I have attached the picture you see above. FastQC flags several modules, including Per sequence GC content and Overrepresented sequences. I used Trimmomatic, but some of these problems remain. What would you recommend to solve them?

Do the sequencing again.

You need to add more details. Is this before or after trimming? Is it RNA-seq, genomic DNA, or something else, and what is your goal? It might be OK for RNA-seq, but the quality doesn't look great for something like SNP calling, where higher quality is expected.

This is one of the samples from a whole-genome analysis of 32 apricot varieties. The picture I added is from before trimming; I will add the version after trimming. My goal is to clean up the raw data and then do a whole-genome analysis.

[Image: FastQC report for the same sample after trimming]

Could these be different samples? You should post the charts you want to ask about.

You can also consult the FastQC documentation on how to interpret these charts: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/Help/3%20Analysis%20Modules/

No, sir, they are the same sample. The last picture is after trimming.

How can I fix the other three modules shown in the picture: Per base sequence content, Per sequence GC content, and Sequence Length Distribution?

Thank you for replying. I added the picture after the trimming.

23 months ago

You don't need to "fix" them; they are just warnings. Your trimmed data don't look perfect, but they are pretty good (in the per-base quality plot the blue line is the mean quality, and the red line in each box is the median). You can proceed with this.
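
To see at a glance which modules raised a warning or a failure, you can read the `summary.txt` file that FastQC writes into its output archive. Below is a minimal sketch, assuming the usual three-column, tab-separated layout (status, module name, file name). The function name `group_fastqc_summary` and the sample content are made up for illustration and are not taken from the report in this thread.

```python
from collections import defaultdict


def group_fastqc_summary(text):
    """Group FastQC module names by their status (PASS, WARN, FAIL)."""
    by_status = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        # Each line is: STATUS <tab> module name <tab> file name
        status, module, _filename = line.split("\t", 2)
        by_status[status].append(module)
    return dict(by_status)


# Illustrative content only (hypothetical file name and statuses).
example = (
    "PASS\tBasic Statistics\tsample_R1.fastq.gz\n"
    "PASS\tPer base sequence quality\tsample_R1.fastq.gz\n"
    "WARN\tPer base sequence content\tsample_R1.fastq.gz\n"
    "WARN\tPer sequence GC content\tsample_R1.fastq.gz\n"
    "FAIL\tOverrepresented sequences\tsample_R1.fastq.gz\n"
)

result = group_fastqc_summary(example)
for status in ("FAIL", "WARN"):
    print(status, result.get(status, []))
```

A WARN or FAIL here only means the module deviated from FastQC's default expectations; whether it matters depends on the library type and the downstream analysis.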

I used to work in a core unit, and most datasets did not pass every quality check (green tick), depending on the type of dataset (e.g. miRNA or amplicon libraries had lots of sequence duplication), even though their base quality was generally quite a bit better.

Do some alignments, call some SNPs against the reference genome, and check the data in a genome browser. The quality of your reference genome will likely have a much bigger impact on the final results than this slightly lower data quality.
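
As a rough idea of what "align and call SNPs" can look like, here is a sketch that only builds and prints the shell commands without running anything. The tool choice (BWA-MEM, samtools, bcftools) is my assumption for illustration, one common combination among several, and all file names and the helper `build_commands` are hypothetical.

```python
import shlex


def build_commands(ref, r1, r2, sample, threads=4):
    """Return shell command strings for a basic align-and-call workflow."""
    ref, r1, r2 = (shlex.quote(p) for p in (ref, r1, r2))
    bam = shlex.quote(f"{sample}.sorted.bam")
    vcf = shlex.quote(f"{sample}.vcf.gz")
    return [
        # Index the reference once before aligning.
        f"bwa index {ref}",
        # Align the paired reads and sort the alignments by coordinate.
        f"bwa mem -t {threads} {ref} {r1} {r2} | samtools sort -o {bam} -",
        # Index the sorted BAM so it can be opened in a genome browser.
        f"samtools index {bam}",
        # Pile up reads against the reference and call variants.
        f"bcftools mpileup -f {ref} {bam} | bcftools call -mv -Oz -o {vcf}",
    ]


# Hypothetical file names for one trimmed, paired-end sample.
commands = build_commands(
    "apricot_ref.fa", "sample_R1.trim.fq.gz", "sample_R2.trim.fq.gz", "sample"
)
for cmd in commands:
    print(cmd)
```

The sorted, indexed BAM and the VCF can then be loaded in a genome browser to inspect a few regions by eye before scaling up to all samples.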

Thank you very much for all your answers. I'm new to whole-genome analysis. Is there a video or book that teaches this analysis step by step?

There's great material at https://training.galaxyproject.org/ and Google, YouTube, etc. will also help you a lot.
