Annotation of VCF with SnpEff
2.3 years ago
Peerzada • 0

Hello Everyone,

I have a multi-sample VCF file covering a single gene for 1000 individuals, and I annotated it with SnpEff. I want to know which amino acid and codon changes these variants cause in each individual, and the position of each amino acid change in the resulting protein. The SnpEff summary output gives me the overall codon and amino acid changes, but not the position of each change or which individuals carry it. In short, I want annotation results that show, for each individual, the position of the amino acid and codon change caused by each variant. Could you suggest an approach or tool for this?

Thank you

vcf 1000 genome annotation • 1.6k views
2.3 years ago

The VCF output of SnpEff already contains everything you want; see https://pcingola.github.io/SnpEff/se_inputoutput/#ann-field-vcf-output-files. Among the ANN sub-fields described there:

Allele (or ALT): In case of multiple ALT fields, this helps to identify which ALT we are referring to.

Protein_position / Protein_len: Position and number of AA (one based, including START, but not STOP).

You just need to loop over each genotype to check whether a sample carries the ALT allele.
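As a rough illustration of reading the ANN field, here is a minimal Python sketch that splits the `ANN=` entry of an INFO string into named sub-fields. The field order follows the SnpEff ANN documentation linked above; the helper name `parse_ann` and the example INFO value (gene ID, transcript ID, positions, and changes) are made up for illustration and are not taken from a real file.

```python
# Sub-field names of one ANN entry, in the order used by SnpEff.
ANN_FIELDS = [
    "Allele", "Annotation", "Annotation_Impact", "Gene_Name", "Gene_ID",
    "Feature_Type", "Feature_ID", "Transcript_BioType", "Rank",
    "HGVS.c", "HGVS.p", "cDNA.pos/cDNA.length", "CDS.pos/CDS.length",
    "AA.pos/AA.length", "Distance", "ERRORS/WARNINGS/INFO",
]


def parse_ann(info):
    """Return one dict per annotation found in the ANN= entry of an INFO string.

    Hypothetical helper: annotations are separated by ',' and their
    sub-fields by '|'. Returns an empty list if there is no ANN entry.
    """
    for entry in info.split(";"):
        if entry.startswith("ANN="):
            return [dict(zip(ANN_FIELDS, ann.split("|")))
                    for ann in entry[4:].split(",")]
    return []


# Made-up INFO value, only to show the shape of the data.
example_info = (
    "AC=2;ANN=A|missense_variant|MODERATE|AQP1|GENE_ID|transcript|"
    "TRANSCRIPT_ID|protein_coding|1/4|c.134C>A|p.Ala45Asp|"
    "200/2800|134/810|45/269||"
)

annotations = parse_ann(example_info)
first = annotations[0]
print(first["Allele"], first["HGVS.c"], first["HGVS.p"], first["AA.pos/AA.length"])
```

`HGVS.c` holds the coding DNA change, `HGVS.p` the amino acid change, and `AA.pos/AA.length` the protein position and length. A variant usually has several annotations (one per transcript), so in practice you would choose which of the returned dicts you care about.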


As I am new to all this, can you tell me how to proceed? I have the SnpEff output file (vcf.ann) for 1000 individuals, and I want to get the protein variation caused by the variants. I understood your answer, but I am not able to extract the information in the INFO column of the annotated file. Basically, I have variants of the AQP1 gene in 1000 individuals, and I have to find the variation in the protein that will be formed in each individual.


Please define:

"I have to find the variation in the protein that will be formed in each individual."

What do you need for your output?


I need the protein sequence with the amino acids changed by the variants, so that I can see where in the protein sequence the variation occurs.


That's still not clear. You already have the position of the new amino acid. What more do you need to "see" where in the protein sequence the variation is?


Actually, there are also samples that do not contain the variant and are homozygous for the reference allele. I only need the samples with variation and the position of the variation. How can I filter that from this data?
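One possible way to do this kind of filtering, sketched in plain Python under some assumptions: the VCF is uncompressed and tab-separated, every record has a `GT` entry in FORMAT, and only the first ANN entry per ALT allele is reported (a simplification, since SnpEff usually emits one annotation per transcript). The function name `carriers` and the three-sample example data are invented for illustration.

```python
def carriers(vcf_lines):
    """Yield (sample, pos, alt, hgvs_c, hgvs_p, aa_pos) for each sample
    that carries at least one copy of an annotated ALT allele.

    Hypothetical helper; samples that are homozygous reference or have a
    missing genotype are skipped.
    """
    samples = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue
        cols = line.split("\t")
        if line.startswith("#CHROM"):
            samples = cols[9:]  # sample names follow the FORMAT column
            continue
        alts = cols[4].split(",")
        # Map each ALT allele to the sub-fields of its first ANN entry.
        ann_by_alt = {}
        for entry in cols[7].split(";"):
            if entry.startswith("ANN="):
                for ann in entry[4:].split(","):
                    fields = ann.split("|")
                    ann_by_alt.setdefault(fields[0], fields)
        gt_index = cols[8].split(":").index("GT")
        for sample, call in zip(samples, cols[9:]):
            genotype = call.split(":")[gt_index]
            alleles = genotype.replace("|", "/").split("/")
            # Keep only non-reference, non-missing allele indices.
            alt_indices = {a for a in alleles if a not in ("0", ".")}
            for idx in sorted(alt_indices, key=int):
                alt = alts[int(idx) - 1]
                fields = ann_by_alt.get(alt)
                if fields is None or len(fields) < 14:
                    continue
                # fields[9] = HGVS.c, fields[10] = HGVS.p, fields[13] = AA.pos/AA.length
                yield sample, cols[1], alt, fields[9], fields[10], fields[13]


# Made-up three-sample VCF, only to show the expected input shape.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "7\t1000\t.\tC\tA\t.\tPASS\t"
    "ANN=A|missense_variant|MODERATE|AQP1|GENE_ID|transcript|TRANSCRIPT_ID|"
    "protein_coding|1/4|c.134C>A|p.Ala45Asp|200/2800|134/810|45/269||"
    "\tGT\t0|0\t0|1\t1|1",
]

rows = list(carriers(example))
for row in rows:
    print("\t".join(row))
```

In the example, S1 is homozygous reference and is dropped, while S2 and S3 are reported with the protein-level change and its position. For a real file you would pass an open file handle instead of the `example` list.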


I got the warnings "WARNING_TRANSCRIPT_NO_START_CODON" and "INFO_REALIGN_3_PRIME", which is why my SnpEff results are not very clear. What is the probable cause of this?
