6.3 years ago
elhlalisoufiane
Hi everyone, can someone tell me where to find the known indels.vcf and dbsnp.vcf files for the GRCh38 reference genome build? Thanks.
So, from the FTP link you provided, which VCF file should be used with GATK's BaseRecalibrator in order to skip over known polymorphic sites? It looks like 00-All.vcf.gz would be the most thorough, but it is 15 GB. Thanks!
To add to your and Agata's answers (+1), indels can be extracted with
bcftools view -v indels mysnps.vcf.gz
(see bcftools). I would resist the temptation to parse VCF as text with Perl/Python/awk scripts.
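By default that bcftools command prints uncompressed VCF to standard output, so for a known-sites file you will probably want to write it compressed and index it. Here is a rough sketch of how that could look. The function name extract_indels and the output file name are just placeholders I made up, and it assumes bcftools is on your PATH.

```shell
#!/bin/sh
# Hypothetical helper (not part of bcftools): extract_indels INPUT OUTPUT
# Keeps only indel records from INPUT and writes a bgzip-compressed,
# tabix-indexed VCF to OUTPUT.
extract_indels() {
    in_vcf=$1
    out_vcf=$2
    # -v indels : keep only records of type indel
    # -Oz       : write bgzip-compressed VCF
    # -o        : output file instead of standard output
    bcftools view -v indels -Oz -o "$out_vcf" "$in_vcf" &&
    # -t builds a tabix (.tbi) index; -f overwrites an existing index
    bcftools index -f -t "$out_vcf"
}

# Example call (file names are placeholders):
#   extract_indels mysnps.vcf.gz mysnps.indels.vcf.gz
```

Swapping `-v indels` for `-v snps` gives the SNP-only subset the same way.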