6.1 years ago
rshoobs
Hi all! I recently performed some imputation using the Sanger Imputation Service. While parsing the output, I noticed that some of the SNP IDs are filled with a period '.' rather than an actual ID.
For reference, the output looks somewhat like this:
##header info
##header info
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SAMPLE_ID1 ...
10 235413 rs2448371 G A . PASS RefPanelAF=0.665456;AN=154;AC=95;INFO=0.995195 GT:AD
10 235386 . G A . PASS RefPanelAF=0.000302663;AN=154;AC=0;INFO=1 GT:ADS:DS:GP ...
10 235434 rs549315358 T A . PASS RefPanelAF=0.000403551;AN=154;AC=0;INFO=1 GT:AD ...
10 235475 rs559476434 T C . PASS RefPanelAF=0.000100888;AN=154;AC=0;INFO=1 GT:AD ...
10 235494 rs528633330 T A . PASS RefPanelAF=0.000100888;AN=154;AC=0;INFO=1 GT:AD ...
10 235511 . T C . PASS RefPanelAF=0.000100888;AN=154;AC=0;INFO=1 GT:ADS:DS:GP ...
...
I just want to highlight that some of the IDs are unfilled. To fix this, I was planning to assign any empty ID a new one based on CHR:POS. However, I am wondering why Sanger left these IDs empty and whether they signify any problems with the data/outputs. I am planning to run MatrixEQTL on this data downstream, if that is relevant.
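For concreteness, here is a minimal sketch of the CHR:POS fix I have in mind. It assumes the real file is tab-delimited (as VCF normally is) and that a missing ID is exactly '.'. The function name fill_missing_ids and the include_alleles option are just my own illustration, not anything provided by the imputation service; the option appends REF and ALT in case two variants share a position and CHR:POS alone would not be unique.

```python
def fill_missing_ids(lines, include_alleles=False):
    """Yield VCF lines, replacing a missing ID ('.') with CHROM:POS.

    Hypothetical helper: assumes tab-delimited VCF records.
    Header lines (starting with '#') are passed through unchanged.
    """
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        body = line.rstrip("\n")
        ending = line[len(body):]          # keep the original line ending
        fields = body.split("\t")
        # Columns: CHROM, POS, ID, REF, ALT, ...
        if len(fields) >= 5 and fields[2] == ".":
            chrom, pos, _, ref, alt = fields[:5]
            new_id = f"{chrom}:{pos}"
            if include_alleles:
                # Optional: disambiguate variants that share a position
                new_id += f":{ref}:{alt}"
            fields[2] = new_id
        yield "\t".join(fields) + ending


# Small made-up example in the same shape as the excerpt above
example = [
    "##header info\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "10\t235413\trs2448371\tG\tA\t.\tPASS\tINFO=0.995195\n",
    "10\t235386\t.\tG\tA\t.\tPASS\tINFO=1\n",
]
fixed = list(fill_missing_ids(example))
fixed_with_alleles = list(fill_missing_ids(example, include_alleles=True))
```

Records that already carry an rs ID are left untouched, and only the ID column changes; the '.' in QUAL is not affected because only the third column is checked.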