My goal was to find the variants related to a specific gene in a VCF file, using bash in a terminal. The wonderful BioStars community helped me work out how to do that in a previous discussion.
For example, to detect the variants of the ARIH1 gene in the inputFile.vcf.gz file, I can run the following commands:
ARIH1_genomic_coordinates="15:72474326-72602985"
bgzip -c inputFile.vcf > inputFile.vcf.gz
tabix -p vcf inputFile.vcf.gz
bcftools view inputFile.vcf.gz "$ARIH1_genomic_coordinates"
Now I have a new task: among the selected output variants, I have to keep only the non-synonymous ones. How can I do that in a shell terminal?
I thought that one of the VCF fields would contain that information, but I cannot find it.
The fields are:
CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SQT_743_100
Thanks!
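A VCF straight from a variant caller usually does not carry that information; it has to be added by an annotation tool. Search this site for snpEff and SnpSift, VEP ...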
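As a rough illustration of what filtering looks like once the file has been annotated, here is a minimal sketch. It assumes snpEff-style output, where each record's INFO column gains an `ANN=` entry containing effect terms such as `missense_variant`. The helper name `filter_nonsynonymous`, the `demo_vcf` records, the `GENOME_DB` placeholder, and the list of effect terms counted as non-synonymous are all assumptions made for the example, not something taken from the question.

```shell
# Assumed annotation step (not run here; GENOME_DB is a placeholder for the
# snpEff database matching your reference build):
#   java -jar snpEff.jar GENOME_DB selected.vcf > selected.ann.vcf

# Hypothetical helper: reads an annotated VCF on stdin, keeps header lines and
# records whose INFO column (8th field) has an ANN entry with a protein-changing
# effect term. The term list is an assumption; extend it as needed.
filter_nonsynonymous() {
  awk -F'\t' '
    /^#/ { print; next }
    $8 ~ /ANN=/ && $8 ~ /missense_variant|stop_gained|stop_lost|start_lost/ { print }
  '
}

# Made-up two-record VCF, only to show the filter in action.
demo_vcf() {
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf '15\t72474400\t.\tA\tG\t50\tPASS\tANN=G|synonymous_variant|LOW|ARIH1\n'
  printf '15\t72474500\t.\tC\tT\t50\tPASS\tANN=T|missense_variant|MODERATE|ARIH1\n'
}

demo_vcf | filter_nonsynonymous
```

On a real file the same helper would be used as `filter_nonsynonymous < selected.ann.vcf`. SnpSift's own `filter` command is the more robust choice for this, since it parses the ANN field properly instead of pattern-matching the whole INFO column.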