11.5 years ago
CrazyB
I need some help understanding VCF files (yes, I've read the documentation from the 1000 Genomes Project and have a "basic" understanding of the format).
For example, the genotype results include the following SNP:
chr1 860461 G A 98 PASS GT:CQ:DP 1/1:98:4 ./.:98:5 ./.:98:5 0/0:.:.
The genotypes for the 4 individuals are
AA, ?, ?, GG
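To make the mapping from sample columns to genotypes explicit, here is a minimal Python sketch that splits each sample column by the FORMAT keys (GT:CQ:DP) and turns the GT indices into bases. It assumes the line exactly as posted, which leaves out the ID and INFO columns of a full VCF record, so the column positions used here are specific to this snippet; the helper name `decode_genotype` is made up for illustration.

```python
# The record as posted above. NOTE: this is an assumption-laden shortcut:
# the posted line omits the ID and INFO columns of a standard VCF record,
# so the indices below only apply to this abbreviated line.
line = "chr1 860461 G A 98 PASS GT:CQ:DP 1/1:98:4 ./.:98:5 ./.:98:5 0/0:.:."

fields = line.split()
ref, alt = fields[2], fields[3]       # REF = G, ALT = A
format_keys = fields[6].split(":")    # ["GT", "CQ", "DP"]
samples = fields[7:]                  # one column per individual


def decode_genotype(gt, ref, alt):
    """Hypothetical helper: map GT indices (0 = REF, 1 = first ALT, ...) to bases."""
    alleles = [ref] + alt.split(",")
    calls = gt.replace("|", "/").split("/")
    if any(c == "." for c in calls):
        return "?"                    # "./." means no genotype call was made
    return "".join(alleles[int(c)] for c in calls)


decoded = []
for sample in samples:
    values = dict(zip(format_keys, sample.split(":")))
    decoded.append((decode_genotype(values["GT"], ref, alt), values["DP"]))

for i, (genotype, depth) in enumerate(decoded, start=1):
    print(f"individual #{i}: genotype={genotype} DP={depth}")
```

Note that a "." in the DP field (individual #4) means the depth value is missing from the record, which is not the same as a reported depth of 0.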
My questions are
- Why, with a read depth (DP) of 4, is the genotype for individual #1 "readable" and called as AA, whereas for individual #2 the read depth is 5 but the genotype cannot be called and is reported as ./.?
- Why, for individual #4, is nothing readable (no CQ or DP value) and yet a GG genotype is still reported?
- How did this SNP end up being given a "PASS" by the filter? To me, all 4 individuals have poor read support at this position.
Any help? Great many thanks
Which variant caller produced this VCF? It might be helpful to view the output from bam-readcount at this position for all four of your BAM files, to see exactly which reads support which bases and what the quality of those bases is.
Thanks a lot for the feedback. I will follow your lead and ask my co-worker for the info. I have to apologize, though, for not being familiar with the jargon. This VCF came out of a medical center's genomics core facility, and I believe they used the "standard" GATK from the Broad Institute for the sequencing analysis (and to make the calls). Is this what you were asking?