Question

Convert chromosome headings in VCF file from Trinity format to chr:pos format


7.3 years ago

TrentGenomics ▴ 30

Hello,

I have called variants using a pipeline consisting of samtools mpileup, bcftools call and bcftools filter to obtain a VCF file containing SNPs and short INDELS.

I would like to annotate the SNPs and INDELs in my VCF file to predict their functional effects. From my understanding, most annotation programs require that the chromosome names in the VCF records match those in the annotation file or database.

I'm working with SNPs and INDELs called from a de novo transcriptome assembled by Trinity, so the variant calls in my VCF file look like this:

TRINITY_DN165715_c0_g1_i1   349 .   A   G   91  PASS    DP=11;VDB=0.746774;SGB=-0.676189;MQSB=0.0297172;MQ0F=0;AC=2;AN=2;DP4=0,0,6,5;MQ=17  GT:PL   1/1:121,33,0

Is there a script that I can use to reformat my variant headers to a more generic format used by variant annotation programs?

Any info would be greatly appreciated.

Cheers,

Mike

snp • 2.0k views

ADD COMMENT • link updated 7.3 years ago by Pierre Lindenbaum 164k • written 7.3 years ago by TrentGenomics ▴ 30


Do you know how to convert from your custom transcript locations to genome locations? What organism are you working with?

ADD REPLY • link 7.3 years ago by Sean Davis 27k


Thanks for getting back to me, Sean. At the moment, no, I don't know how to convert from my custom transcript locations to genome locations. Would that be the first step in getting my variant headers reformatted? I'm working with flying squirrels; calling SNPs between two species.

Mike

ADD REPLY • link 7.3 years ago by TrentGenomics ▴ 30


Hi Sean,

Are you familiar with a reliable program that is able to convert transcript locations to genome locations?

Mike

ADD REPLY • link 7.3 years ago by TrentGenomics ▴ 30

score 0 · Answer 1 · 2017-07-25


7.3 years ago

Pierre Lindenbaum 164k

"Is there a script that I can use to reformat my variant headers to a more generic format used by variant annotation programs?"

I've written something like this: http://lindenb.github.io/jvarkit/ConvertVcfChromosomes.html
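For a quick rename without any extra tooling, the same idea can be sketched in a few lines of Python. This is only a minimal sketch, not the jvarkit implementation: it assumes you already have a two-column, tab-separated mapping of old name to new name, and the function names and the `contig_1` target name are hypothetical. It rewrites the CHROM column and any matching `##contig` header IDs, and leaves unmapped names untouched. Renaming only helps if the new names match what your annotation database actually uses.

```python
import re

# Matches the ID field of a VCF header line such as
# ##contig=<ID=TRINITY_DN165715_c0_g1_i1,length=500>
CONTIG_ID = re.compile(r"^(##contig=<.*?\bID=)([^,>]+)")


def load_mapping(lines):
    """Parse 'old<TAB>new' lines into a dict (hypothetical mapping format).

    Blank lines and lines starting with '#' are skipped; a line without
    a tab raises ValueError.
    """
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        old, new = line.split("\t")[:2]
        mapping[old] = new
    return mapping


def rename_chromosomes(vcf_lines, mapping):
    """Yield VCF lines with CHROM and ##contig IDs renamed via mapping."""
    for line in vcf_lines:
        if line.startswith("##contig="):
            m = CONTIG_ID.match(line)
            if m and m.group(2) in mapping:
                line = m.group(1) + mapping[m.group(2)] + line[m.end():]
        elif not line.startswith("#"):
            # Data record: CHROM is everything before the first tab.
            chrom, sep, rest = line.partition("\t")
            line = mapping.get(chrom, chrom) + sep + rest
        yield line


# Small demo using the record from the question (INFO shortened);
# 'contig_1' is a made-up target name.
demo_mapping = load_mapping(["TRINITY_DN165715_c0_g1_i1\tcontig_1\n"])
demo_record = "TRINITY_DN165715_c0_g1_i1\t349\t.\tA\tG\t91\tPASS\tDP=11\tGT:PL\t1/1:121,33,0\n"
for out_line in rename_chromosomes([demo_record], demo_mapping):
    print(out_line, end="")
```

To run it on real files, pass open file handles to `load_mapping` and `rename_chromosomes` and write the yielded lines to the output VCF.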

ADD COMMENT • link 7.3 years ago by Pierre Lindenbaum 164k


Thank you, Pierre, I will look into your program.

ADD REPLY • link 7.3 years ago by TrentGenomics ▴ 30


Could you please give me more direction on how to structure the 'vcfrenamechr' command, given that my variant headers are currently in Trinity contig format? I would like to be able to annotate my variants using snpEff.

Thanks again,

Mike

ADD REPLY • link 7.3 years ago by TrentGenomics ▴ 30