I would like to perform a splicing QTL (sQTL) analysis. In my VCF files, some reference positions have more than one allele. Should I keep them, or remove the rows containing those SNPs? For example, positions 187 and 194 contain more than one allele, so do I need to remove these rows?
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 108 139
1 73 . C A . PASS . GT 0 0
1 83 . T C,A . PASS . GT 1 1
1 187 . TG T . PASS . GT 1 1
1 188 . G T . PASS . GT 0 0
1 189 . T C,G . PASS . GT 0 0
1 190 . G A . PASS . GT 0 0
1 194 . ATT A . PASS . GT 1 1
1 209 . C T . PASS . GT 0 0
I don't see anywhere that you have multiple REF alleles. There are multi-allelic sites (with multiple ALT alleles), sure, but no multiple REF alleles. Maybe you're looking at the wrong column header? Here's your data formatted for eyeballing:
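To make the distinction concrete, here is a minimal Python sketch (not the formatting or command from this thread) that lines up the first five columns of the records above and labels each one. The `classify` and `classify_records` helpers and the label names are hypothetical, the sample columns are dropped for brevity, and symbolic ALT values such as `*` or `<DEL>` are not handled.

```python
# Illustrative only: records copied from the question, sample columns omitted.
RECORDS = """\
1 73 . C A
1 83 . T C,A
1 187 . TG T
1 188 . G T
1 189 . T C,G
1 190 . G A
1 194 . ATT A
1 209 . C T
"""


def classify(ref, alt):
    """Label a record from its REF and ALT fields (hypothetical helper)."""
    alts = alt.split(",")
    labels = []
    # More than one comma-separated ALT allele means a multi-allelic site.
    if len(alts) > 1:
        labels.append("multi-allelic")
    if len(ref) == 1 and all(len(a) == 1 for a in alts):
        labels.append("SNP")
    elif all(len(a) < len(ref) for a in alts):
        labels.append("deletion")
    elif all(len(a) > len(ref) for a in alts):
        labels.append("insertion")
    else:
        labels.append("other")
    return " ".join(labels)


def classify_records(text):
    """Return (chrom, pos, ref, alt, label) for each non-header line."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.split()[:5]
        rows.append((chrom, int(pos), ref, alt, classify(ref, alt)))
    return rows


if __name__ == "__main__":
    for chrom, pos, ref, alt, label in classify_records(RECORDS):
        print(f"{chrom:<6}{pos:<6}{ref:<5}{alt:<6}{label}")
```

With these records, positions 83 and 189 come out as multi-allelic (two ALT alleles each), while 187 and 194 have a single ALT and a REF longer than one base, which is how a deletion is written in VCF rather than a sign of multiple reference alleles.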
Indeed, this is new to me. I have checked the original file again and it does contain multiple reference alleles. I downloaded the VCF file from here.
Can you please paste a few sample lines? Use this line of code to get the sample records: