Hi everyone,
My lab is developing tools to identify genomic regions under selection in pooled data. Obviously we want to support many input formats. What format is your pooled data in?
I've used FreeBayes to call variants in pooled data (20 pools of 10 individuals each) in the past. The input data are in BAM format (each pool labelled with an RG tag) and the FreeBayes calls are in VCF. FreeBayes uses a custom genotype string to describe the alleles in each pool. Each pool has 10 diploid individuals and thus 20 chromosomes, so the GT string has 20 entries, each reflecting the presence (1) or absence (0) of the alternate allele on one of those chromosomes. Is this what you are after?
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality, the Phred-scaled marginal (or unconditional) probability of the called genotype">
##FORMAT=<ID=GL,Number=1,Type=Float,Description="Genotype Likelihood, log-scaled likeilhood of the data given the called genotype">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=RA,Number=1,Type=Integer,Description="Reference allele observations">
##FORMAT=<ID=AA,Number=1,Type=Integer,Description="Alternate allele observations">
##FORMAT=<ID=SR,Number=1,Type=Integer,Description="Number of reference observations by strand, delimited by |: [forward]|[reverse]">
##FORMAT=<ID=SA,Number=1,Type=Integer,Description="Number of alternate observations by strand, delimited by |: [forward]|[reverse]">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT pool1-63 pool2-43 pool2-63 pool3-63 pool4-63 pool5-63 pool6-63 pool7-63
chr1 113735151 . T G 552.93 . NS=8;DP=281;AC=8;AN=160;AF=0.05;RA=258;AA=18;SRF=257;SRR=1;SAF=18;SAR=0;SRB=0.99612;SAB=1;SRP=554.6;SAP=42.097;ABR=228;ABA=18;AB=0.912;ABP=400.79;RUN=5;MQM=59.444;BPL=848;BPR=268;RPL=16;RPR=2;RPP=26.655;LRB=0.51971;BVAR;SNP;TV GT:GL:DP:RA:AA:SR:SA 1/1/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0:-9.3854:64:57:6:57|0:6|0 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0:0:10:10:0:10|0:0|0 1/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0:-1.0058:15:14:1:14|0:1|0 1/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0:-1.6621:41:38:3:38|0:3|0 1/1/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0:-9.0539:28:24:3:24|0:3|0 1/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0:-1.5989:44:41:3:41|0:3|0 1/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0:-15.859:58:54:2:53|1:2|0 0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0:0:20:20:0:20|0:0|0
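In case it helps with supporting this format, here is a minimal Python sketch of how one pool column from the line above could be summarised. It assumes the FORMAT keys shown in the header (GT, RA, AA) and a "/"-separated GT string; the function name `parse_pool_sample` and the returned dictionary keys are made up for illustration and are not part of FreeBayes or any library.

```python
def parse_pool_sample(format_field, sample_field):
    """Summarise one pool column of a FreeBayes pooled VCF line.

    format_field: the FORMAT column, e.g. "GT:GL:DP:RA:AA:SR:SA"
    sample_field: the matching per-pool column
    """
    # Pair each FORMAT key with its value for this pool.
    values = dict(zip(format_field.split(":"), sample_field.split(":")))

    # One GT entry per chromosome in the pool; "." marks a missing call.
    alleles = values["GT"].split("/")
    called = [a for a in alleles if a != "."]
    alt_chromosomes = sum(1 for a in called if a != "0")

    # RA / AA are the reference / alternate read observation counts.
    ref_reads = int(values["RA"])
    alt_reads = int(values["AA"])
    total_reads = ref_reads + alt_reads

    return {
        "ploidy": len(alleles),
        "alt_chromosomes": alt_chromosomes,
        # Frequency implied by the called genotype string.
        "genotype_alt_freq": alt_chromosomes / len(called) if called else None,
        # Frequency implied directly by the read counts.
        "read_alt_freq": alt_reads / total_reads if total_reads else None,
    }


fmt = "GT:GL:DP:RA:AA:SR:SA"
pool1 = "1/1/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0/0:-9.3854:64:57:6:57|0:6|0"
summary = parse_pool_sample(fmt, pool1)
print(summary)
```

The sketch reports both a genotype-based and a read-based alternate allele frequency, since the two can differ in pooled data and a selection scan may want either one.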
Would you be willing to share a line and the header? This is what I am after.
Updated post with a snip of the header and the first variant line.