6.5 years ago
BioinfGuru
Hi all,
Please take a look at these sequence quality histograms from fastqc.
Sample 1:
Sample 2:
This is WGS data sequenced on an Illumina HiSeq4000. We intend to call SNPs and indels, and possibly structural variants. In the future we may even use the data set for imputation.
I have 4 options. I'm not really experienced enough to make the call myself, so I'd like some informed opinions:
- Perform another size selection step to narrow the spread in the library pool, so the HiSeq4000 can accommodate it without read2 quality dropping as it did in the first run. We have QC’ed the library following a second round of sizing and it does look much better in terms of suitability for the HiSeq4000. However, 10X do not recommend this for fear of losing diversity in the library.
- Run the library again on the HiSeq4000 with adjusted loading to improve overall yield. The likelihood here is that the read2 issue will continue.
- Run the library on the NextSeq500. This is an unknown but it is believed this could accommodate the size of the library better than the HiSeq4000. The data yield would be lower.
- Just use the data as is - the sequencing quality is still quite good - maybe consider trimming, but how much should be trimmed?
Appreciate any input from experienced eyes.
EDIT: I'm expecting around 20X coverage (150 bp read length, paired end, 250M reads per sample (125M per fq), 3 Gb genome)
Thanks, Kenneth.
The quality plots look as expected to me. I would be tempted to go ahead with the data as it is.
Exactly what I was thinking - it could be better, but it really isn't that bad.
Before spending more money/effort on sequencing, you should always go ahead and analyze the data you have in hand. As others have said, this is not looking too shabby.
Yes, I've started the pipeline for the data. I should have the BAMs tomorrow and the VCFs on Friday. Thank you.
How to add images to a Biostars post
Edited ... thank you
Sequencing depth will be another important consideration for the analyses you mentioned (especially SNP calling). If you have a lot of depth, then I think the quality looks fine.
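To put a number on depth, here is a rough back-of-the-envelope sketch using the figures from the EDIT in the question (150 bp reads, 250M reads per sample, 3 Gb genome). It uses the simple estimate coverage = reads x read length / genome size, and ignores duplicates, unmapped reads and trimming, so treat it as an upper bound. The function names are made up for illustration. Whether "250M reads" counts individual reads or read pairs is an assumption, so both cases are shown.

```python
def expected_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Mean depth = total sequenced bases / genome size (no duplicates, no trimming)."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage: int, read_length: int, genome_size: int) -> int:
    """Number of individual reads needed to reach a target mean depth (rounded up)."""
    total_bases = target_coverage * genome_size
    return -(-total_bases // read_length)  # integer ceiling division


READ_LENGTH = 150                # bp, from the question
GENOME_SIZE = 3_000_000_000      # 3 Gb, from the question
READS = 250_000_000              # "250M reads per sample (125M per fq)"

# Interpretation A (assumption): 250M individual reads, i.e. 125M pairs
cov_reads = expected_coverage(READS, READ_LENGTH, GENOME_SIZE)

# Interpretation B (assumption): 250M read pairs, i.e. 500M individual reads
cov_pairs = expected_coverage(2 * READS, READ_LENGTH, GENOME_SIZE)

# Individual reads required for a 20X target
needed_20x = reads_needed(20, READ_LENGTH, GENOME_SIZE)

print(f"250M individual reads: {cov_reads:.1f}X")
print(f"250M read pairs:       {cov_pairs:.1f}X")
print(f"Reads needed for 20X:  {needed_20x:,}")
```

Under interpretation A the estimate comes out below 20X, so it is worth confirming the actual depth from the BAMs before deciding whether to trim aggressively or re-sequence.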