5.3 years ago
LimMo
Hi all,
I'm wondering whether there is a way to check if a given VCF file was derived from a given BAM file.
I know how to generate a VCF file from a BAM file with different pipelines. What I'm asking is whether there is a way to verify the relationship afterwards, especially when I have many VCF and BAM files and want to work out which VCF file belongs to which BAM file.
I appreciate your help in advance.
I would load both files into a genome viewer such as IGV and check, for a couple of variants, whether the REF and ALT nucleotides are supported by the reads in the BAM.
Maybe, but that's not a generic approach. A command-line method would be better.
Yeah, I know. The thing is that one would need to check whether the REF and ALT depth/coverage is identical between the BAM and the VCF, but the DPs in the VCF depend on how the variant caller calculates them and which MAPQ and base-quality thresholds it uses. It is therefore not trivial to extract this information from the BAM without knowing the caller's exact criteria, which are not always easily accessible, so one would need to check the source code and then put this into a custom script. I pinged two experts on Twitter who are very much into this kind of thing (Pierre Lindenbaum, brentp); maybe they have something.
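In the meantime, here is a rough sketch of a looser check that avoids matching DP values exactly: for each VCF, count how many of its SNV ALT alleles are seen in at least a couple of reads in each BAM, and pair the VCF with the BAM that supports the most. Everything here is an assumption for illustration. The function names (`alt_support_fraction`, `best_matching_bam`), the `min_alt_reads` cutoff, and the input layout are made up, and the per-site base counts are assumed to have been extracted beforehand from a pileup of each BAM at the VCF positions (that extraction step is not shown).

```python
from typing import Dict, Sequence, Tuple

Site = Tuple[str, int]                    # (chromosome, 1-based position)
Variant = Tuple[str, int, str, str]       # (chrom, pos, REF, ALT)
BaseCounts = Dict[Site, Dict[str, int]]   # per-site base counts from a pileup


def alt_support_fraction(variants: Sequence[Variant],
                         counts: BaseCounts,
                         min_alt_reads: int = 2) -> float:
    """Fraction of SNVs whose ALT base is seen in >= min_alt_reads reads."""
    checked = 0
    supported = 0
    for chrom, pos, ref, alt in variants:
        if len(ref) != 1 or len(alt) != 1:
            continue  # indels and multi-base records are skipped in this sketch
        checked += 1
        # A site missing from the pileup counts as zero support.
        site_counts = counts.get((chrom, pos), {})
        if site_counts.get(alt, 0) >= min_alt_reads:
            supported += 1
    return supported / checked if checked else 0.0


def best_matching_bam(variants: Sequence[Variant],
                      counts_by_bam: Dict[str, BaseCounts],
                      min_alt_reads: int = 2) -> Tuple[str, Dict[str, float]]:
    """Return the BAM name with the highest ALT support, plus all scores."""
    scores = {
        name: alt_support_fraction(variants, counts, min_alt_reads)
        for name, counts in counts_by_bam.items()
    }
    best = max(scores, key=lambda name: scores[name])
    return best, scores


if __name__ == "__main__":
    # Toy data: three SNVs from one VCF and base counts from two BAMs.
    vcf_variants = [("chr1", 100, "A", "G"),
                    ("chr1", 200, "C", "T"),
                    ("chr2", 50, "G", "A")]
    pileups = {
        "a.bam": {("chr1", 100): {"A": 10, "G": 9},
                  ("chr1", 200): {"C": 1, "T": 20},
                  ("chr2", 50): {"G": 8, "A": 7}},
        "b.bam": {("chr1", 100): {"A": 20},
                  ("chr1", 200): {"C": 18, "T": 1},
                  ("chr2", 50): {"G": 15}},
    }
    print(best_matching_bam(vcf_variants, pileups))
```

Because the cutoff is deliberately low, the caller's MAPQ and base-quality filters should matter much less than they would for an exact DP comparison. The score is only meaningful over many variants, though, and samples that share most of their variants (for example close relatives) may be hard to tell apart this way.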