Hello everyone,
I have an ANNOVAR result file, a.hg19_multianno.txt, and I want to keep only the following variant types. These values are found in column 24. My awk code below is not working.
awk -F "\t" '{
if (($24=="disruptive_inframe_deletion" || $24=="disruptive_inframe_insertion" || $24=="exon_loss_variant" || $24=="frameshift_variant" || $24=="frameshift_variant+start_lost" || $24=="frameshift_variant+stop_gained" || $24=="frameshift_variant+stop_lost" || $24=="inframe_deletion" || $24=="inframe_insertion" || $24=="initiator_codon_variant" || $24=="missense_variant" || $24=="splice_acceptor_variant" || $24=="splice_donor_variant" || $24=="splice_region_variant" || $24=="start_lost" || $24=="start_lost+inframe_deletion" || $24=="stop_gained" || $24=="stop_gained+disruptive_inframe_deletion" || $24=="stop_gained+disruptive_inframe_insertion" || $24=="stop_gained+inframe_insertion" || $24=="stop_lost" || $24=="stop_lost+disruptive_inframe_deletion" || $24=="stop_lost+inframe_deletion" || $24=="stop_retained_variant" || $24=="TF_binding_site_variant"))
print
}'
Is there anyone that can help me?
Thanks!
Can you paste a line from the text file containing one of these variants?
It cannot work
Paste a line of your input file where you're having problems, please.
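One quick way to produce such a line, and to check which column a value really sits in, is to print a single line with every field numbered. This is only a sketch: `show_fields` is a hypothetical helper name, and it assumes the file is tab-separated.

```shell
# show_fields FILE [LINE]: print every tab-separated field of one line,
# prefixed by its column number (LINE defaults to 1).
# Hypothetical helper; assumes a tab-delimited file.
show_fields() {
  awk -F '\t' -v n="${2:-1}" '
    NR == n {
      for (i = 1; i <= NF; i++) printf "%d\t%s\n", i, $i
      exit
    }
  ' "$1"
}
```

For example, `show_fields a.hg19_multianno.txt 2` would list the fields of the second line, which makes it easy to see whether the variant type is really in column 24.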
Please edit your question and add an example input line, correctly formatted, so that it is easy to read. It also helps if you explain the problem. For example:
That way people can help you faster and are more likely to suggest the correct solution, rather than something else because of a misunderstanding.
So, in your example, do you expect to have that line in the output because there is a match? Which one is the match?
This line is only the first line of my VCF. I know that other lines include the variants I am interested in.
Are you sure about $24?
I have solved my problem using the following command from Jorge Amigo; michael.ante's command also works.
I am a beginner with awk, so I have many problems :)
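For readers following along, the long chain of `||` comparisons from the question can be written more compactly with a lookup table. This is a minimal sketch and not necessarily either of the commands mentioned above. `filter_effects` is a hypothetical name, and it assumes a tab-separated file in which column 24 holds exactly one of the listed terms.

```shell
# filter_effects FILE: print rows whose column 24 exactly matches one of the
# variant types listed in the question.
# Hypothetical helper; assumes tab-separated input with the term in column 24.
filter_effects() {
  awk -F '\t' '
    BEGIN {
      # Build the list in pieces, then load it into an associative array.
      t = "disruptive_inframe_deletion disruptive_inframe_insertion exon_loss_variant"
      t = t " frameshift_variant frameshift_variant+start_lost frameshift_variant+stop_gained"
      t = t " frameshift_variant+stop_lost inframe_deletion inframe_insertion"
      t = t " initiator_codon_variant missense_variant splice_acceptor_variant"
      t = t " splice_donor_variant splice_region_variant start_lost start_lost+inframe_deletion"
      t = t " stop_gained stop_gained+disruptive_inframe_deletion"
      t = t " stop_gained+disruptive_inframe_insertion stop_gained+inframe_insertion"
      t = t " stop_lost stop_lost+disruptive_inframe_deletion stop_lost+inframe_deletion"
      t = t " stop_retained_variant TF_binding_site_variant"
      n = split(t, terms, " ")
      for (i = 1; i <= n; i++) keep[terms[i]] = 1
    }
    ($24 in keep)   # a bare pattern prints every matching line
  ' "$1"
}
```

Usage would look like `filter_effects a.hg19_multianno.txt > filtered.txt`. If nothing is printed, the value is probably in a different column or carries extra text, so an exact match on `$24` fails.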
My next question is about keeping variants whose ExAC value is less than or equal to 0.02, and also keeping unknown values ("."). I have written the following code:
It does not work. How can I fix this?
When you have a new question, it is better to open a new post. If you want to keep asking about the same thing, either edit your original question or comment on it (I have moved this new question to the comment section). I see you have already done so in "awk code for Exac MAF values", so it would be wise to edit or delete this comment.
PS: the answer is
awk '$6<0.02' a.txt
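A slightly more explicit variant of that one-liner, as a sketch under stated assumptions: the file is tab-separated, the ExAC frequency is in column 6 (taken from the one-liner above, so adjust it for your file), and the first line is a header that should be kept. `filter_maf` is a hypothetical name. It uses `<=` to match "less than or equal to 0.02" and tests for "." explicitly instead of relying on how awk compares strings.

```shell
# filter_maf FILE [COL]: keep the header line, rows where COL is ".",
# and rows where COL is a number <= 0.02. COL defaults to 6.
# Hypothetical helper; assumes tab-separated input with a header on line 1.
filter_maf() {
  awk -F '\t' -v col="${2:-6}" '
    NR == 1 ||                               # header line (assumed present)
    $col == "." ||                           # unknown frequency
    ($col ~ /^[0-9]/ && $col + 0 <= 0.02)    # numeric value at or below 0.02
  ' "$1"
}
```

For example, `filter_maf a.txt > a.filtered.txt`, or `filter_maf a.txt 12` if the frequency were in column 12.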