Hello all,
I am trying to establish some quality controls for any FASTQs my lab receives, using FastQC. Typically, we get 8 FASTQs (paired end) and I run this quality control analysis on the major characteristics. I get one output for each of the 8 FASTQs; can I use just 1 of the 8 to demonstrate the quality? Using all 8 seems excessive, and reports from other bioinformaticians don't seem to include all of them (though I feel like I should use all 8).
Secondly, I am trying to find the main characteristics that show this raw data is high quality; are there guidelines for this? So far, I know to measure the percentage of mapped reads (>92%), whether 80% of the reads are Q30, GC content, and maybe average sequencing depth on target. Are there any other main characteristics to look for?
Thanks
Hello,
I will check out MultiQC; it seems like exactly what I need. And yes, as I start to create pipelines for my company (they did not previously have a bioinformatics department), it seems I will have to start splitting up scripts for different experiments. Thank you for your reply.
I would start with MultiQC in any case. It can also include info from read mappers and many other NGS post-sequencing tools. Presenting this to 'users' will in most cases answer many of their questions, and it can also be used to check data quality in general
(do check the legal info if you want to have this in a corporate setting)
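For the Q30 criterion mentioned in the question, here is a rough Python sketch that checks every FASTQ in a run rather than just one of them. It is not how FastQC or MultiQC compute their numbers. It assumes standard 4-line FASTQ records, Phred+33 quality encoding, and that "80% Q30" is read as the fraction of bases with quality >= 30 (the per-read version would need a different calculation). The function names `open_fastq`, `q30_fraction` and `summarize` are made up for this example.

```python
import gzip
from pathlib import Path


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def q30_fraction(path, threshold=30, offset=33):
    """Return the fraction of bases with Phred quality >= threshold.

    Assumes 4-line records and Phred+33 encoding (both are assumptions).
    """
    total = 0
    passing = 0
    with open_fastq(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 != 3:  # only the 4th line of each record holds qualities
                continue
            quals = line.rstrip("\r\n")
            total += len(quals)
            passing += sum(1 for ch in quals if ord(ch) - offset >= threshold)
    return passing / total if total else 0.0


def summarize(paths, min_fraction=0.80):
    """Map each file to (Q30 fraction, whether it meets min_fraction)."""
    report = {}
    for path in paths:
        fraction = q30_fraction(path)
        report[str(path)] = (fraction, fraction >= min_fraction)
    return report
```

Calling `summarize(sorted(Path(".").glob("*.fastq.gz")))` would then give one line of evidence per file, so all 8 FASTQs can be reported in a small table instead of picking a single representative. The 0.80 cutoff is just the value from the question, not an official guideline.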