What Are The Metrics To Determine The Quality Of A Whole Genome Sequence
13.5 years ago
Biomed 5.0k

Hi, I would like to put together a set of metrics for evaluating the general quality of a whole genome sequence, so that before I start analyzing it I can be reasonably confident that the variation I am after is in the haystack. I know there are tools like FastQC that generate reports, but they are less effective if you don't know what to expect. I know there is no single criterion and everyone has their own list, but I think there are some common criteria that most people could agree on.

For example: GC% should be between 40 and 50, or the total number of reads should be >3 million, etc. Thanks

genome next-gen sequencing quality • 3.0k views
13.5 years ago
Pablo ★ 1.9k

On FastQC's page (http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/) you'll find a couple of examples of 'good' and 'bad' quality runs.

You are right that there are no well-defined thresholds for saying when a run has gone 'bad'. I think the answer here is that it depends heavily on what kind of analysis you are planning to do downstream.

Edit: The main criterion I use is that if the quality plot drops below 25 very quickly, then it's time to start trimming (or re-sequencing). The GC criterion doesn't always apply (e.g. some Plasmodium species are very AT-rich, and it certainly doesn't apply to bisulfite sequencing).
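
If you want to check the "quality drops below 25" rule outside of FastQC, here is a rough sketch in Python. It is not how FastQC does it, just one way to get a per-position mean quality. It assumes plain 4-line FASTQ records (no wrapped sequence lines) and Phred+33 encoding. The function names and the default threshold of 25 are only for illustration.

```python
import io


def mean_quality_per_position(fastq_handle, phred_offset=33):
    """Return the mean Phred score at each read position.

    Assumes 4-line FASTQ records and Phred+33 quality encoding.
    """
    sums, counts = [], []
    for i, line in enumerate(fastq_handle):
        if i % 4 != 3:  # the quality string is the 4th line of each record
            continue
        qual = line.rstrip("\n")
        for pos, ch in enumerate(qual):
            if pos == len(sums):  # first time we see a read this long
                sums.append(0)
                counts.append(0)
            sums[pos] += ord(ch) - phred_offset
            counts[pos] += 1
    return [s / c for s, c in zip(sums, counts)]


def first_position_below(means, threshold=25):
    """Return the 0-based index of the first position whose mean quality
    is below the threshold, or None if no position is."""
    for pos, mean in enumerate(means):
        if mean < threshold:
            return pos
    return None


# Toy example with two reads; 'I' is Q40 and '5' is Q20 in Phred+33.
toy = io.StringIO("@r1\nACGT\n+\nIII5\n@r2\nACGT\n+\nII55\n")
means = mean_quality_per_position(toy)
print(means)                        # [40.0, 40.0, 30.0, 20.0]
print(first_position_below(means))  # 3
```

For a real run you would pass an open file handle instead of the toy string. If the returned position is close to the start of the read, that is the "goes below 25 very fast" case; if it is near the end, trimming the tail is usually enough.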
