Individual vs. joint call VCFs
1
0
Entering edit mode
12 months ago
Sd • 0

Is there any way to figure out and be sure if a VCF file is individually called or jointly called? Is there any line in the VCF header to look at for this?

GATK VCF WGS • 1.7k views
ADD COMMENT
0
Entering edit mode

I am no expert on VCF files but wouldn't a VCF file with multiple sample names in the header column tell you that it's a joint call?

ADD REPLY
0
Entering edit mode

if they used bcftools to merge a bunch of gvcfs then it wouldn't be a joint genotyping in the same way GATK performs it, which leverages quality information from many samples to infer artefactual variants. Either way there should be a line in the header.

ADD REPLY
0
Entering edit mode

I have GenotypeGVCFs command lines in the header of each VCFs but I am not sure whether they used bcftools or GATK tools (either CombineGVCFs or GenomicsDBImport).

ADD REPLY
3
Entering edit mode

GenotypeGVCFs would indicate GATK joint genotyping was used. mystery solved.

ADD REPLY
0
Entering edit mode

I'd suggest that if you see variants that fail filters in some samples, due to very low (but nonzero) allele frequency, the data was likely jointly called. Unless it's a gVCF.

ADD REPLY
2
Entering edit mode
12 months ago
raphael.B ▴ 520

You may find a line in the VCF header telling you how it has been produced. If you don't , checking the genotypes might be good start. You should have a lot of no call ( and in theory no homozygous reference call) with missing classic information (like DP) that were produced during the merge.

ADD COMMENT
0
Entering edit mode

This is what I have for GenotypeGVCFs commands in the VCFs header:

 \ ##GATKCommandLine=<ID=GenotypeGVCFs,Version=4-3.3.0-0,Date="Thu Aug 11 00:03:52 GMT 2018",Epoch=1524096052515,CommandLineOptions="analysis_type=GenotypeGVCFs input_file=[] showFullBamList=false read_buffer_size=null phone_home=NO_ET gatk_key=null tag=NA read_filter=[] intervals=null excludeIntervals=null interval_set_rule=UNION interval_merging=ALL interval_padding=0 reference_sequence=/data/reference/Homo_sap/hg38/Sequence/genome_hg38.fa nonDeterministicRandomSeed=false disableDithering=false maxRuntime=-1 maxRuntimeUnits=MINUTES downsampling_type=BY_SAMPLE downsample_to_fraction=null downsample_to_coverage=1000 baq=OFF baqGapOpenPenalty=40.0 refactor_NDN_cigar_string=false fix_misencoded_quality_scores=false allow_potentially_misencoded_quality_scores=false useOriginalQualities=false defaultBaseQualities=-1 performanceLog=null BQSR=null quantize_quals=0 disable_indel_quals=false emit_original_quals=false preserve_qscores_less_than=6 globalQScorePrior=-1.0 validation_strictness=SILENT remove_program_records=false keep_program_records=false sample_rename_mapping_file=null unsafe=ALL disable_auto_index_creation_and_locking_when_reading_rods=false no_cmdline_in_header=false sites_only=false never_trim_vcf_format_field=false bcf=false bam_compression=null simplifyBAM=false disable_bam_indexing=false generate_md5=false num_threads=1 num_cpu_threads_per_data_thread=1 num_io_threads=0 monitorThreadEfficiency=false num_bam_file_handles=null read_group_black_list=null pedigree=[] pedigreeString=[] pedigreeValidationType=STRICT allow_intervals_with_unindexed_bam=false generateShadowBCF=false variant_index_type=DYNAMIC_SEEK variant_index_parameter=-1 logging_level=INFO log_to_file=null help=false version=false variant=[(RodBindingCollection [(RodBinding name=variant source=1001.PE.ra.md.g.vcf.gz)])] out=org.broadinstitute.gatk.engine.io.stubs.VariantContextWriterStub includeNonVariantSites=false annotateNDA=false heterozygosity=0.001 indel_heterozygosity=1.25E-4 standard_min_confidence_threshold_for_calling=30.0 standard_min_confidence_threshold_for_emitting=30.0 max_alternate_alleles=6 input_prior=[] sample_ploidy=2 annotation=[InbreedingCoeff, FisherStrand, QualByDepth, ChromosomeCounts, GenotypeSummaries, StrandOddsRatio] dbsnp=(RodBinding name= source=UNBOUND) filter_reads_with_N_cigar=false filter_mismatching_base_and_quals=false filter_bases_not_stored=false">
ADD REPLY

Login before adding your answer.

Traffic: 2350 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6