I have a VCF file with the traditional header format:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA12878
When reading the output, is there a simple way to visually scan the data and see which sites have multiple alternate alleles? I'm also looking for a simple Linux tool to extract the sites with multiple alternate alleles from the VCF file. I found this one as a possibility (from "how to remove multiallelic from VCF"):
awk '/#/{print;next}{if($5 !~ /,/ && length($5)==1 && length($4)==1){print}}' file.vcf
But I get a syntax error that I cannot figure out:
awk: cmd. line:1: /#/{print;next}{if($5 !~ /,/ && length($5)==1 && length($4)==1){print}}SRR1611183.gatk.vcf
awk: cmd. line:1: ^ syntax error
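A note on the error, hedged because the exact command that was typed is not shown: the message prints the file name glued onto the end of the awk program text, which suggests the file name ended up inside the quoted program (for example, the closing single quote was placed after the file name, or the space between the program and the file name was lost). The one-liner as posted above parses fine on its own. Also note that it does the opposite of what is asked here: it drops multiallelic sites and keeps only biallelic single-base records. A minimal sketch that keeps the header and only the records whose ALT column (field 5) contains a comma might look like the following. The function name multiallelic_sites and the file name are placeholders, not part of any existing tool, and the sketch assumes an uncompressed, tab-delimited VCF.

```shell
# Hypothetical helper: print header lines plus records with more than one ALT allele.
# Assumes an uncompressed, tab-delimited VCF where column 5 is ALT.
multiallelic_sites() {
    awk -F'\t' '
        /^#/     { print; next }   # pass header lines through unchanged
        $5 ~ /,/ { print }         # a comma in ALT means multiple alternate alleles
    ' "$1"
}

# Usage (file name is a placeholder):
# multiallelic_sites file.vcf > multiallelic.vcf
```

Anchoring the header test with ^# rather than matching # anywhere avoids treating a data line that merely contains a # as a header. For a quick visual scan, piping the output through `cut -f1,2,4,5` shows just CHROM, POS, REF and ALT.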
The link is dead. Can you repost it? I'd like to know how to do this using regular bash commands.
I have fixed it. Can you try again, James?