Most sensible way to find private SNPs from a multisamples vcf with bcftools
1
0
Entering edit mode
13 months ago
Axzd ▴ 80

Hello, this question is somehow complementary to what I asked yesterday here:

Using bcftools to find unique alt homozygous sites

Now let's say I want to find the SNPs 0/1 unique to the sample D3A350g_bcftools2 (see below)

I know I can use

bcftools view -s D3A350g_bcftools2.bcf -x all_bcftools2_merged.vcf

But there is also the +contrast plugin, and this seems like it could work as well

bcftools +contrast  -a NOVELGT -0 ancestor_bcftools2.bcf,D2C150g_bcftools2.bcf,D2C1_bcftools2.bcf,D2C350g_bcftools2.bcf,D2C3_bcftools2.bcf,D3A3_bcftools2.bcf,D4B450g_bcftools2.bcf,D4B4_bcftools2.bcf,D5C150g_bcftools2.bcf,D5C1_bcftools2.bcf,D5C350g_bcftools2.bcf,D5C3_bcftools2.bcf,H2B450g_bcftools2.bcf,H2B4_bcftools2.bcf,H3A450g_bcftools2.bcf,H3A4_bcftools2.bcf,H3C450g_bcftools2.bcf,H3C4_bcftools2.bcf,H5A250g_bcftools2.bcf,H5A2_bcftools2.bcf,H5A450g_bcftools2.bcf,H5A4_bcftools2.bcf,H5C250g_bcftools2.bcf,H5C2_bcftools2.bcf -1 D3A350g_bcftools2.bcf

I can't wrap my mind around what's the correct approach. And on top of that, there is the -isec option. I realize the developers have implemented these three functions for different purposes, but they are not apparent to me. The +contrast and -x approaches give vastly different outputs. I promise you I have read the doc. I just can't find what would make the most sense for my question: "for each of my samples, I want to find its unique SNPs". I would be grateful for any pointers... honestly, I was writing a custom script, but I thought, "why remake the wheels"? But the wheel is hard to comprehend. Thanks for any help.

Example of the vcf

bcftools view -e 'GT = "./."' all_bcftools2_merged.vcf|head -n 50
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##bcftoolsVersion=1.18+htslib-1.18
##bcftoolsCommand=mpileup -Ou -I --fasta-ref vaga.fa ../ancestor.sorted.bam
##reference=file://vaga.fa
##contig=<ID=Chrom_3,length=20354777>
##contig=<ID=Chrom_1,length=18146847>
##contig=<ID=Chrom_5,length=16930519>
##contig=<ID=Chrom_2,length=16274841>
##contig=<ID=Chrom_4,length=15224634>
##contig=<ID=Chrom_6,length=13893210>
##ALT=<ID=*,Description="Represents allele(s) other than observed.">
##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
##INFO=<ID=IDV,Number=1,Type=Integer,Description="Maximum number of raw reads supporting an indel">
##INFO=<ID=IMF,Number=1,Type=Float,Description="Maximum fraction of raw reads supporting an indel">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias for filtering splice-site artefacts in RNA-seq data (bigger is better)",Version="3">
##INFO=<ID=RPBZ,Number=1,Type=Float,Description="Mann-Whitney U-z test of Read Position Bias (closer to 0 is better)">
##INFO=<ID=MQBZ,Number=1,Type=Float,Description="Mann-Whitney U-z test of Mapping Quality Bias (closer to 0 is better)">
##INFO=<ID=BQBZ,Number=1,Type=Float,Description="Mann-Whitney U-z test of Base Quality Bias (closer to 0 is better)">
##INFO=<ID=MQSBZ,Number=1,Type=Float,Description="Mann-Whitney U-z test of Mapping Quality vs Strand Bias (closer to 0 is better)">
##INFO=<ID=SCBZ,Number=1,Type=Float,Description="Mann-Whitney U-z test of Soft-Clip Length Bias (closer to 0 is better)">
##INFO=<ID=SGB,Number=1,Type=Float,Description="Segregation based metric, http://samtools.github.io/bcftools/rd-SegBias.pdf">
##INFO=<ID=MQ0F,Number=1,Type=Float,Description="Fraction of MQ0 reads (smaller is better)">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes for each ALT allele, in the same order as listed">
##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="Number of high-quality ref-forward , ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=MQ,Number=1,Type=Integer,Description="Average mapping quality">
##bcftools_callVersion=1.18+htslib-1.18
##bcftools_callCommand=call -mv -Ob -o ancestor_bcftools.bcf; Date=Wed Sep 13 10:01:56 2023
##bcftools_viewVersion=1.18+htslib-1.18
##bcftools_viewCommand=view ancestor_bcftools.bcf; Date=Thu Sep 14 18:21:29 2023
##bcftools_viewCommand=view -Ou -o ancestor_bcftools2.bcf; Date=Thu Sep 14 18:21:29 2023
##bcftools_mergeVersion=1.18+htslib-1.18
##bcftools_mergeCommand=merge --threads 24 ancestor_bcftools2.bcf D2C150g_bcftools2.bcf D2C1_bcftools2.bcf D2C350g_bcftools2.bcf D2C3_bcftools2.bcf D3A350g_bcftools2.bcf D3A3_bcftools2.bcf D4B450g_bcftools2.bcf D4B4_bcftools2.bcf D5C150g_bcftools2.bcf D5C1_bcftools2.bcf D5C350g_bcftools2.bcf D5C3_bcftools2.bcf H2B450g_bcftools2.bcf H2B4_bcftools2.bcf H3A450g_bcftools2.bcf H3A4_bcftools2.bcf H3C450g_bcftools2.bcf H3C4_bcftools2.bcf H5A250g_bcftools2.bcf H5A2_bcftools2.bcf H5A450g_bcftools2.bcf H5A4_bcftools2.bcf H5C250g_bcftools2.bcf H5C2_bcftools2.bcf; Date=Mon Sep 25 14:45:31 2023
##bcftools_viewCommand=view -e 'GT = "./."' all_bcftools2_merged.vcf; Date=Thu Oct 12 16:23:22 2023
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  ancestor    D2C150g D2C1    D2C350g D2C3    D3A350g D3A3    D4B450g D4B4    D5C150g D5C1    D5C350g D5C3    H2B450g H2B4    H3A450g H3A4    H3C450g H3C4    H5A250g H5A2    H5A450g H5A4    H5C250g H5C2
Chrom_3 7253    .   T   A   221.3   .   VDB=0.535543;SGB=-0.693147;RPBZ=-1.50431;MQBZ=1.34343;MQSBZ=-1.02125;BQBZ=0.323777;SCBZ=0.975289;MQ0F=0.00398406;MQ=43;DP=4623;DP4=1443,1221,494,349;AN=50;AC=25    GT:PL   0/1:136,0,255   0/1:97,0,255    0/1:175,0,255   0/1:240,0,255   0/1:255,0,255   0/1:125,0,255   0/1:255,0,255   0/1:208,0,255   0/1:230,0,255   0/1:255,0,255   0/1:255,0,255   0/1:218,0,255   0/1:255,0,255   0/1:177,0,255   0/1:255,0,202   0/1:255,0,255   0/1:241,0,255   0/1:255,0,255   0/1:249,0,255   0/1:136,0,255   0/1:220,0,255   0/1:219,0,255   0/1:255,0,255   0/1:166,0,255   0/1:172,0,255
Chrom_3 7906    .   G   A   222.406 .   VDB=1.00557e-07;SGB=-0.693147;RPBZ=-1.81995;MQBZ=-4.11869;MQSBZ=0.0750985;BQBZ=-2.55896;SCBZ=4.94681;MQ0F=0;MQ=58;DP=2581;DP4=522,513,514,371;AN=50;AC=25   GT:PL   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:67,0,255    0/1:93,0,255    0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:243,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
Chrom_3 7911    .   C   T   222.411 .   VDB=2.66194e-07;SGB=-0.693147;RPBZ=-0.981365;MQBZ=-4.00625;MQSBZ=0.0765345;BQBZ=-3.22205;SCBZ=5.27961;MQ0F=0;MQ=58;DP=2537;DP4=501,513,511,380;AN=50;AC=25  GT:PL   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:45,0,255    0/1:40,0,255    0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:243,0,239   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
Chrom_3 7912    .   G   A   222.405 .   VDB=5.09897e-07;SGB=-0.693147;RPBZ=-0.98308;MQBZ=-4.06079;MQSBZ=0.0765345;BQBZ=-3.32505;SCBZ=4.87485;MQ0F=0;MQ=58;DP=2539;DP4=503,515,512,378;AN=50;AC=25   GT:PL   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:59,0,255    0/1:90,0,255    0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:245,0,236   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
Chrom_3 7949    .   G   C   222.406 .   VDB=0.201825;SGB=-0.693147;RPBZ=1.81576;MQBZ=-4.12627;MQSBZ=-0.328279;BQBZ=0.966014;SCBZ=2.96348;MQ0F=0.00518135;MQ=57;DP=2899;DP4=507,536,593,488;AN=50;AC=25  GT:PL   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:206,0,255   0/1:197,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,253   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
Chrom_3 7972    .   A   C   222.406 .   VDB=0.0949276;SGB=-0.693147;RPBZ=-2.4423;MQBZ=-0.00764329;MQSBZ=0.425233;BQBZ=-2.16904;SCBZ=3.54574;MQ0F=0.0060241;MQ=59;DP=2334;DP4=510,545,346,326;AN=50;AC=25    GT:PL   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:76,0,255    0/1:118,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:230,0,245   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
Chrom_3 7974    .   A   T   222.409 .   VDB=0.075728;SGB=-0.693147;RPBZ=-2.13329;MQBZ=-0.527608;MQSBZ=-0.123898;BQBZ=-1.0772;SCBZ=3.61746;MQ0F=0;MQ=59;DP=2331;DP4=511,542,346,323;AN=50;AC=25  GT:PL   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:76,0,255    0/1:120,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:223,0,245   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
Chrom_3 7980    .   C   A   222.404 .   VDB=0.135845;SGB=-0.693147;RPBZ=-1.43568;MQBZ=-0.50627;MQSBZ=-0.15972;BQBZ=-0.961533;SCBZ=3.49738;MQ0F=0;MQ=59;DP=2335;DP4=509,541,356,318;AN=50;AC=25  GT:PL   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:86,0,255    0/1:100,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:235,0,228   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
Chrom_3 7983    .   C   T   222.403 .   VDB=0.106311;SGB=-0.693147;RPBZ=-1.30284;MQBZ=-0.492116;MQSBZ=-0.161765;BQBZ=0.152925;SCBZ=3.94198;MQ0F=0;MQ=59;DP=2341;DP4=511,541,357,322;AN=50;AC=25 GT:PL   0/1:255,0,255   0/1:255,0,255   1/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:86,0,255    0/1:111,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:233,0,236   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
Chrom_3 8004    .   T   G   222.406 .   VDB=0.557767;SGB=-0.693147;RPBZ=0.23356;MQBZ=-1.95341;MQSBZ=-1.52365;BQBZ=-0.360925;SCBZ=4.40809;MQ0F=0;MQ=59;DP=2386;DP4=522,519,356,344;AN=50;AC=25   GT:PL   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:98,0,255    0/1:108,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,204   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
Chrom_3 8010    .   T   C   222.406 .   VDB=0.621464;SGB=-0.693147;RPBZ=0.764096;MQBZ=-0.452836;MQSBZ=-1.52365;BQBZ=-0.468479;SCBZ=4.32313;MQ0F=0;MQ=59;DP=2399;DP4=529,513,360,349;AN=50;AC=25 GT:PL   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:98,0,255    0/1:118,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,204   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255   0/1:255,0,255
bcftools SNP vcf • 904 views
ADD COMMENT
0
Entering edit mode
13 months ago
Axzd ▴ 80

Okay I made a test with the contrast plugin like this; the command is annoyingly long, but

bcftools view -e 'GT="./."' all_merged.vcf|bcftools +contrast -a NOVELGT -0 ancestor.sorted,D2C150g.sorted,D2C1.sorted,D2C350g.sorted,D2C3.sorted,D3A350g.sorted,D3A3.sorted,D4B450g.sorted,D4B4.sorted,D5C150g.sorted,D5C1.sorted,D5C350g.sorted,D5C3.sorted,H2B450g.sorted,H2B4.sorted,H3A450g.sorted,H3A4.sorted,H3C450g.sorted,H3C4.sorted,H5A250g.sorted,H5A2.sorted,H5A450g.sorted,H5A4.sorted,H5C250g.sorted -1 H5C2.sorted|bcftools view -i 'NOVELGT!="."' > test_contrast.vcf

Then I realised I could easily do this:

  bcftools query -f '%CHROM\t%POS\t[ %GT]\n' test_contrast.vcf                           
    Chrom_3 1649771  1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_3 9255797  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_3 14137793     0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_3 14661574     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_3 14661575     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_3 14661576     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_3 14661577     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_3 14661578     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_3 18290711     0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_3 18489859     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_3 18489860     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_3 18489861     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_1 68425    0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/2
    Chrom_1 337562   1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_1 337564   1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_1 337566   1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_1 337687   1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_1 337688   1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_1 830354   0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 1710279  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 1710296  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 1710379  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 1710393  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 1710440  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 1710673  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 1711384  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 1711400  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 1711603  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_1 8016726  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_5 3057300  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/2
    Chrom_5 3829125  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_5 3829135  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_2 203947   1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_2 203960   1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_2 3657407  0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_2 16145952     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_2 16145954     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_2 16145964     1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 1/1 0/1
    Chrom_6 10276340     0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_6 12343348     0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 1/1
    Chrom_6 13573971     0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/2
    Chrom_6 13574190     0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/2

If I can get a table for each sample, I can analyse and compare all that as Pandas data frames. So, I suspect I have found the way to go.

I will leave this here, hoping it can help someone or that someone has something to say.

ADD COMMENT
1
Entering edit mode

You're on to the right approach. This GT table should help you with your problem from yesterday as well. I often extract GT, AD, DP in a table of this sort because it's a lot easier to work with this 2D dataset than VCF's 3D nature.

ADD REPLY
0
Entering edit mode

Thank you so much. I really appreciate the help I get here. The idea of AD and DP is great! I will do that.

ADD REPLY

Login before adding your answer.

Traffic: 1798 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6