Question

Get Flanking Amino Acid Sequence

1

6.0 years ago

windsur ▴ 20

Dear all,

I've just performed exome sequencing and obtained the VCF file. To continue with my experiment, I need to extract the wild-type (wt) and mutant (mut) flanking amino acid sequences for my dataset, because I need to synthesize these peptides for immunotherapy research. In my VCF file I have a column like this:

AAChange.refGene

A2M:NM_000014:exon30:c.C3797A:p.A1266E
ABCC12:NM_033226:exon12:c.G1738T:p.G580C
ABL1:NM_005157:exon11:c.C2972T:p.A991V,ABL1:NM_007313:exon11:c.C3029T:p.A1010V

And the desired output is like this:

Wt Epitope                  Mut Epitope
TVVALHALSKYGAATFTRTGKAAQV   TVVALHALSKYGEATFTRTGKAAQV
DHQRYQHTVRVCGLQKDLSNLPYGD   DHQRYQHTVRVCCLQKDLSNLPYGD
APVPSTLPSASSALAGDQPSSTAFI   APVPSTLPSASSVLAGDQPSSTAFI

If a variant has more than one transcript, I only need the first one. I know how to obtain the flanking regions of nucleotides, but I have not found anything like a refGene.txt for amino acids. I used hg19 as the reference genome.
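As a starting point, the first entry of each AAChange.refGene value can be split into its parts with a regular expression. This is only a minimal sketch: the function name `parse_first_aa_change` and the returned dictionary keys are made up for illustration, and the pattern assumes simple single-residue substitutions written exactly like the examples above (`p.A1266E`); frameshifts, stop gains and other notations are not handled and return `None`.

```python
import re

# Assumed layout of one entry: GENE:TRANSCRIPT:EXON:c.<cDNA change>:p.<ref><pos><alt>
# e.g. "A2M:NM_000014:exon30:c.C3797A:p.A1266E"
AA_CHANGE = re.compile(
    r"^(?P<gene>[^:]+):(?P<transcript>[^:]+):(?P<exon>[^:]+):"
    r"c\.[^:]+:p\.(?P<ref>[A-Z])(?P<pos>\d+)(?P<alt>[A-Z])$"
)


def parse_first_aa_change(field):
    """Parse the first comma-separated entry of an AAChange.refGene value.

    Returns a dict with gene, transcript, ref, pos (1-based int) and alt,
    or None if the entry is not a simple single-residue substitution.
    """
    first = field.split(",")[0].strip()  # keep only the first transcript
    match = AA_CHANGE.match(first)
    if match is None:
        return None
    return {
        "gene": match.group("gene"),
        "transcript": match.group("transcript"),
        "ref": match.group("ref"),
        "pos": int(match.group("pos")),
        "alt": match.group("alt"),
    }


if __name__ == "__main__":
    value = ("ABL1:NM_005157:exon11:c.C2972T:p.A991V,"
             "ABL1:NM_007313:exon11:c.C3029T:p.A1010V")
    print(parse_first_aa_change(value))
```

The transcript accession and 1-based position returned here are what would be needed to look up the protein sequence and cut out the flanking window.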

Any help is welcome!

SNP DNA-seq sequence R python • 1.3k views

0

Thank you, Chris! Unfortunately I do not have much time to learn how to use pVACtools, because I would need to convert my VCF file to another format... I think there is a faster way to do what I need: since we have the amino acid position (e.g. G580C), a script similar to bedtools could extract the flanking positions. If anyone can help, I will be very happy :)
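The idea of cutting a window around the amino acid position could look roughly like the sketch below. It assumes the full protein sequence for the matching transcript has already been loaded as a string (for example from a protein FASTA file; obtaining it is not shown), and that 12 residues on each side are wanted, which matches the 25-residue epitopes in the question. The function name `flanking_peptides` and its parameters are hypothetical, and the toy protein in the usage part is made up, not a real sequence.

```python
def flanking_peptides(protein, ref, pos, alt, flank=12):
    """Return (wt, mut) peptides centred on a single-residue substitution.

    protein: full protein sequence as a string
    ref, alt: one-letter reference and mutant amino acids
    pos: 1-based position of the substitution (e.g. 580 for G580C)
    flank: residues to keep on each side (12 gives a 25-mer)
    """
    idx = pos - 1  # convert to 0-based index
    if idx < 0 or idx >= len(protein):
        raise ValueError(f"position {pos} is outside the protein")
    if protein[idx] != ref:
        # Guards against using the wrong transcript or transcript version.
        raise ValueError(
            f"expected {ref} at position {pos}, found {protein[idx]}"
        )
    # Clip the window at the protein ends, so peptides near the
    # N- or C-terminus come out shorter than 2 * flank + 1.
    start = max(0, idx - flank)
    end = min(len(protein), idx + flank + 1)
    wt = protein[start:end]
    mut = protein[start:idx] + alt + protein[idx + 1:end]
    return wt, mut


if __name__ == "__main__":
    # Toy protein (NOT a real sequence): padding around a 25-residue window.
    toy = "MMMMM" + "TVVALHALSKYGAATFTRTGKAAQV" + "KKKKK"
    wt, mut = flanking_peptides(toy, "A", 18, "E")
    print(wt, mut)
```

The check on the reference residue is worth keeping: if the sequence file and the annotation disagree on the transcript, the position will usually point at a different amino acid and the function fails loudly instead of returning a wrong peptide.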

Reply by windsur ▴ 20 · 6.0 years ago

score 1 · Answer 1 · 2018-12-03

1

6.0 years ago

Chris Miller 22k

I highly suggest that you check out the pVACtools suite, which uses some VEP plugins and custom parsing to extract exactly this information and format it nicely before doing binding affinity predictions.
