11.8 years ago
Caddymob
Does anyone know what format the NHLBI uses to report genotype counts for chromosome X variants? If we look at the example rs6525447
we get strange stuff...
In the VCF format we see EA_GTC=31,406,170,1991,1700
Here's the whole entry with the VCF header for completeness:
##fileformat=VCFv4.0
##INFO=<ID=DBSNP,Number=.,Type=String,Description="dbSNP version which established the rs_id">
##INFO=<ID=EA_AC,Number=.,Type=String,Description="European American Allele Count in the order of AltAlleles,RefAllele. For INDELs, A1, A2, or An refers to the N-th alternate allele while R refers to the reference allele.">
##INFO=<ID=AA_AC,Number=.,Type=String,Description="African American Allele Count in the order of AltAlleles,RefAllele. For INDELs, A1, A2, or An refers to the N-th alternate allele while R refers to the reference allele.">
##INFO=<ID=TAC,Number=.,Type=String,Description="Total Allele Count in the order of AltAlleles,RefAllele For INDELs, A1, A2, or An refers to the N-th alternate allele while R refers to the reference allele.">
##INFO=<ID=MAF,Number=.,Type=String,Description="Minor Allele Frequency in percent in the order of EA,AA,All">
##INFO=<ID=GTS,Number=.,Type=String,Description="Observed Genotypes. For INDELs, A1, A2, or An refers to the N-th alternate allele while R refers to the reference allele.">
##INFO=<ID=EA_GTC,Number=.,Type=String,Description="European American Genotype Counts in the order of listed GTS">
##INFO=<ID=AA_GTC,Number=.,Type=String,Description="African American Genotype Counts in the order of listed GTS">
##INFO=<ID=GTC,Number=.,Type=String,Description="Total Genotype Counts in the order of listed GTS">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Average Sample Read Depth">
##INFO=<ID=FG,Number=.,Type=String,Description="functionGVS">
##INFO=<ID=GM,Number=.,Type=String,Description="accession">
##INFO=<ID=AA,Number=1,Type=String,Description="chimpAllele">
##INFO=<ID=AAC,Number=.,Type=String,Description="aminoAcidChange">
##INFO=<ID=PP,Number=.,Type=String,Description="proteinPosition">
##INFO=<ID=CDP,Number=.,Type=String,Description="cDNAPosition">
##INFO=<ID=PH,Number=.,Type=String,Description="polyPhen">
##INFO=<ID=CP,Number=1,Type=Float,Description="scorePhastCons">
##INFO=<ID=CG,Number=1,Type=Float,Description="consScoreGERP">
##INFO=<ID=GL,Number=.,Type=String,Description="geneList">
##INFO=<ID=GS,Number=.,Type=String,Description="granthamScore">
##INFO=<ID=CA,Number=.,Type=String,Description="clinicalAssociation">
##INFO=<ID=EXOME_CHIP,Number=.,Type=String,Description="Whether a SNP is on the Illumina HumanExome Chip">
##FILTER=<ID=INDEL5,Description="Nearby 1000 Genomes Pilot Indels within 5bp">
##FILTER=<ID=SVM,Description="Failed SVM-based filter at threshold 0.3. (detailed at http://evs.gs.washington.edu/EVS/HelpSNPSummary.jsp#FilterStatus)">
##INFO=<ID=GWAS_PUBMED,Number=.,Type=String,Description="PubMed records for GWAS hits">
##QueryTarget=X:1-155270560
#CHROM POS ID REF ALT QUAL FILTER INFO
X 70146475 rs6525447 G C . PASS DBSNP=dbSNP_116;EA_AC=638,6088;AA_AC=2165,1669;TAC=2803,7757;MAF=9.4856,43.5316,26.5436;GTS=CC,CG,C,GG,G;EA_GTC=31,406,170,1991,1700;AA_GTC=533,772,327,327,243;GTC=564,1178,497,2318,1943;DP=40;GL=SLC7A3;CP=1.0;CG=4.2;AA=C;CA=.;EXOME_CHIP=yes;GWAS_PUBMED=.;GM=NM_001048164.2,NM_032803.5;FG=missense,missense;AAC=VAL/LEU,VAL/LEU;PP=508/620,508/620;CDP=1522,1522;GS=32,32;PH=benign,benign
Know how to decode? The VCF header doesn't tell me anything and I don't want to assume anything...
Thanks!
Might want to explain precisely, for the uninitiated, what the "strange stuff" is.
Oh, well, I'm supposing it's male/female. For a variant on a non-X chromosome, you get something like EA_GTC=6,344,2816, i.e. three genotype counts. So why five on chrX SNPs?
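
A quick way to test the male/female guess is to check whether the genotype counts add back up to the reported allele counts when each single-letter genotype (C, G) contributes one allele copy and each two-letter genotype contributes two. Below is a minimal Python sketch of that check. The helper names `parse_info` and `allele_counts` are made up for illustration, the INFO string is trimmed to the relevant fields of the rs6525447 record, and the "one letter = one allele copy" rule is an assumption of mine, not something the VCF header states.

```python
# Relevant INFO fields copied from the rs6525447 record (REF=G, ALT=C).
INFO = ("GTS=CC,CG,C,GG,G;"
        "EA_AC=638,6088;EA_GTC=31,406,170,1991,1700;"
        "AA_AC=2165,1669;AA_GTC=533,772,327,327,243;"
        "TAC=2803,7757;GTC=564,1178,497,2318,1943")


def parse_info(info):
    """Split a VCF INFO string into {key: [comma-separated values]}."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        fields[key] = value.split(",")
    return fields


def allele_counts(genotypes, counts, alleles):
    """Sum allele copies, assuming each letter in a genotype is one copy.

    Under this assumption "CC" adds two C alleles per individual,
    "CG" adds one C and one G, and a single-letter genotype such as
    "C" adds one C (as a hemizygous call would).
    """
    totals = {allele: 0 for allele in alleles}
    for genotype, n in zip(genotypes, counts):
        for allele in genotype:
            totals[allele] += int(n)
    return [totals[allele] for allele in alleles]


fields = parse_info(INFO)
alleles = ["C", "G"]  # header says *_AC is ordered AltAlleles,RefAllele

for gtc_key, ac_key in [("EA_GTC", "EA_AC"), ("AA_GTC", "AA_AC"), ("GTC", "TAC")]:
    derived = allele_counts(fields["GTS"], fields[gtc_key], alleles)
    reported = [int(x) for x in fields[ac_key]]
    print(gtc_key, dict(zip(fields["GTS"], fields[gtc_key])))
    print("  derived", derived, "reported", reported, "match:", derived == reported)
```

For this record the derived counts match EA_AC, AA_AC and TAC, which is consistent with the five GTS entries being three diploid genotypes (CC, CG, GG) plus two hemizygous ones (C, G), presumably from males. It doesn't prove that is what EVS intends, so checking their documentation is still worthwhile.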