4.2 years ago
speycast
Hello,
I'd like to know whether NA12878.NIST.2.18.b37.vcf.gz covers all genes and regions. If not, does anyone know where to get a truth set for the HBA genes? I'm planning to use Picard GenotypeConcordance to compare HBA1 and HBA2 in a sample VCF against the truth set (NA12878).
Thank you
Should be the entire genome based on the name. Did you peek inside the file to see what is there?
Thank you, genomax! Yes, I did. For example, I used bcftools to view the region of interest (HBA):
bcftools view sample.vcf.gz --regions 1:155204712-155211174
it gave me the records for that region in the sample VCF, but when I filter the same region of NA12878.NIST.2.18.b37.vcf.gz, it returns nothing. I'm wondering: if this region is not covered by the truth set, would the genotype concordance analysis still be accurate?
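To rule out an indexing or contig-naming problem (for example `1` versus `chr1`) before concluding that the truth set is empty there, the records can be counted directly from the compressed file. This is only a minimal sketch: the helper names `count_region_records` and `list_contigs` are made up for illustration, and it assumes a gzip/bgzip-compressed, tab-delimited VCF. It matches on the POS column only, so unlike `bcftools view --regions` it will not pick up records that start before the interval and merely overlap it.

```shell
# Hypothetical helper: count VCF records whose POS lies inside a region,
# without relying on a tabix/CSI index.
# Usage: count_region_records FILE.vcf.gz CHROM START END
count_region_records() {
    gzip -dc "$1" | awk -F '\t' -v chrom="$2" -v start="$3" -v end="$4" '
        /^#/ { next }                                   # skip header lines
        $1 == chrom && ($2 + 0) >= (start + 0) && ($2 + 0) <= (end + 0) { n++ }
        END { print n + 0 }                             # prints 0 when nothing matched
    '
}

# Hypothetical helper: list the contig names that actually appear in the
# records, in order of first appearance (e.g. "1" versus "chr1").
list_contigs() {
    gzip -dc "$1" | awk -F '\t' '!/^#/ && !seen[$1]++ { print $1 }'
}
```

For the files in this thread that would be something like `count_region_records NA12878.NIST.2.18.b37.vcf.gz 1 155204712 155211174`, followed by `list_contigs NA12878.NIST.2.18.b37.vcf.gz` to check that the chromosome is named the same way in both VCFs.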