22 months ago
a
I have a group of samples (150 bp paired-end DNA sequencing reads) ranging in sequencing depth from ~10X to ~100X.
I understand that, for the samples to be comparable, their FASTQ files could be downsampled to uniform coverage.
However, I'd like to use all of the data per sample to call variants, rather than downsampling reads.
Is there a tool to downsample VCFs to similar coverage distributions with a desired median coverage?
If it is possible, is this a reasonable way to accomplish the same thing as downsampling reads?
Alternatively, can I filter the VCFs to render these samples comparable?
I'm not sure how you would downsample a VCF file. However, to downsample reads, you can use
seqtk sample
https://github.com/lh3/seqtk
Downsampling dataset with more than 60 million reads
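As a rough sketch of how the read-level downsampling could be set up: the fraction of reads to keep is approximately target depth divided by current depth, and both mates of a pair need the same seed so that pairing is preserved. The Python below only builds the `seqtk sample` command strings; the sample name, file names, depths, and output naming scheme are hypothetical placeholders, and the per-sample depth is assumed to have been estimated already.

```python
import shlex


def downsample_fraction(current_depth, target_depth):
    """Fraction of reads to keep so that depth is roughly target_depth."""
    if current_depth <= 0 or target_depth <= 0:
        raise ValueError("depths must be positive")
    return min(1.0, target_depth / current_depth)


def seqtk_commands(sample, r1, r2, current_depth, target_depth, seed=100):
    """Build `seqtk sample` commands for one paired-end sample.

    The same seed is used for both mates so the same read pairs are kept.
    Returns an empty list when the sample is already at or below the target.
    Output file names are a hypothetical naming scheme.
    """
    frac = downsample_fraction(current_depth, target_depth)
    if frac >= 1.0:
        return []  # nothing to downsample
    cmds = []
    for fastq, tag in ((r1, "R1"), (r2, "R2")):
        out = f"{sample}.{target_depth}x.{tag}.fq"
        cmds.append(
            f"seqtk sample -s{seed} {shlex.quote(fastq)} {frac:.4f} "
            f"> {shlex.quote(out)}"
        )
    return cmds


if __name__ == "__main__":
    # Hypothetical sample at ~100X, downsampled to ~10X.
    for cmd in seqtk_commands("s1", "s1_R1.fq", "s1_R2.fq", 100, 10):
        print(cmd)
```

The depth ratio is only an approximation: it assumes coverage scales linearly with read count, so it is worth re-checking the depth of the downsampled data afterwards.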