Variant Annotation Tools For Viral Genomes
10.8 years ago

Hi everyone!

I was wondering if there are any variant annotation tools for viral genomes (similar to what SeattleSeq is for the human genome) that are freely available and easy to use. Can the VCF format be applied to viral genomes too?

variant vcftools • 4.6k views
10.8 years ago

I think snpEff can work with any genome, and I think it even has some pre-compiled viral references:

http://snpeff.sourceforge.net/

That said, I have tried to use general amino acid substitution patterns to prioritize among a handful of non-synonymous viral variants and I found that the least likely substitution was not actually the variant causing a phenotype of interest (as determined by comparing a wider variety of strain sequences). So, I'm not sure if snpEff is really going to be the absolute best strategy.

If you are interested in HSV, I think these are valuable resources to get an idea about natural variation. Otherwise, I'm afraid you'll need to find comparable resources among your own niche of virus research. Either way, you'll have to do some extra work beyond just providing a .vcf file and getting annotation information.

http://genomics-pubs.princeton.edu/viro-genome/

http://szparalab.psu.edu/hsv-diversity/
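On the VCF part of the question: the format is not tied to human data. The CHROM column only has to name a sequence in whatever reference you aligned against, so a viral reference works the same way. Below is a minimal sketch that reads a tiny in-memory VCF. The contig name `viral_ref`, its length, and the two records are made up for illustration, and the parser covers only the fixed columns, not the full specification.

```python
# Hypothetical example data: "viral_ref" is a placeholder contig name and
# the two variant records are invented for illustration only.
VCF_TEXT = """\
##fileformat=VCFv4.2
##contig=<ID=viral_ref,length=9000>
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO
viral_ref\t120\t.\tA\tG\t50\tPASS\tDP=200
viral_ref\t455\t.\tC\tT,A\t99\tPASS\tDP=310
"""


def parse_vcf(text):
    """Parse the first eight tab-separated VCF columns into dictionaries."""
    records = []
    for line in text.splitlines():
        # Skip blank lines, '##' meta lines and the '#CHROM' header line.
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt, _qual, flt, _info = line.split("\t")[:8]
        records.append(
            {
                "chrom": chrom,
                "pos": int(pos),        # VCF positions are 1-based
                "ref": ref,
                "alts": alt.split(","),  # several ALT alleles are comma-separated
                "filter": flt,
            }
        )
    return records


records = parse_vcf(VCF_TEXT)
for rec in records:
    print(rec["chrom"], rec["pos"], rec["ref"], ">", ",".join(rec["alts"]))
```

For real files a dedicated VCF library is a safer choice than hand-rolled parsing; the point here is only that nothing in the record layout assumes a human genome.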


Hi there! Actually I was planning on working with HIV. But thanks for your help... I'll try working SnpEff out. Also, where do I get the HIV-1 reference genome?


You can look for sequences in NCBI Nucleotide. However, I urge you not to use a single reference genome for interpreting your results. Otherwise, you run the risk that I described above: you may pick the mutant that is predicted to be the most deleterious, but there is a good chance the prediction is not accurate. Comparing variants across multiple strains with or without your phenotype of interest is a much better way of narrowing down your options.
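To make the multi-strain comparison concrete, here is a rough sketch, assuming you already have the strain sequences aligned to equal length and a yes/no phenotype label per strain. The function name `segregating_sites`, the strain names, and the sequences are all hypothetical. It reports alignment columns where strains with the phenotype share no allele with strains lacking it.

```python
def segregating_sites(aligned, has_phenotype):
    """Find alignment columns whose alleles separate the two phenotype groups.

    aligned:       dict of strain name -> aligned sequence (equal lengths)
    has_phenotype: dict of strain name -> bool
    Returns a list of (1-based position, alleles_with, alleles_without).
    """
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(seq) != length for seq in aligned.values()):
        raise ValueError("all sequences must be aligned to the same length")

    hits = []
    for i in range(length):
        with_p = {aligned[n][i] for n in names if has_phenotype[n]}
        without_p = {aligned[n][i] for n in names if not has_phenotype[n]}
        # Keep the column only if both groups exist and share no allele.
        if with_p and without_p and with_p.isdisjoint(without_p):
            hits.append((i + 1, sorted(with_p), sorted(without_p)))
    return hits


# Toy data, invented for illustration.
aligned = {
    "strainA": "ATGGCA",
    "strainB": "ATGGCA",
    "strainC": "ATGACA",
    "strainD": "ATGACG",
}
has_phenotype = {"strainA": True, "strainB": True, "strainC": False, "strainD": False}

hits = segregating_sites(aligned, has_phenotype)
print(hits)
```

In this toy set only column 4 separates the groups; column 6 varies, but one strain without the phenotype shares the allele of the strains that have it. With few strains a perfectly segregating site can still arise by chance, so this narrows candidates rather than proving causation.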


Okay, thanks for the suggestion. What if I'm only interested in the HIV-1 reverse transcriptase (RT) nucleotide sequence? I want a reference for it, and then to use that reference to apply variant annotation techniques to my sample set. What would you suggest? Also, where do I get a reference for just this protein? (GenBank doesn't really seem to have one, or maybe I'm searching for it the wrong way.)


NCBI contains all types of nucleotide sequences (not just genomic - also transcript, EST, cDNA, etc.). You can search "HIV-1 reverse transcriptase" to get a lot of hits, but I think you want something more specific.

You can use the same strategy for NCBI Protein (you can switch between databases using the pull-down tab). However, I think SnpEff will only work with nucleotide sequences (and I'm not sure if it is really well designed for a single gene, either way).
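If you do end up working with a single coding sequence, the core of the annotation step is small enough to sketch by hand. The example below assumes an in-frame coding sequence with no introns or frameshifts and uses the standard genetic code; `annotate_snv` and the nine-base toy sequence are hypothetical and are not how SnpEff itself works.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def annotate_snv(cds, pos, alt):
    """Classify a single-base substitution in an in-frame coding sequence.

    cds: coding nucleotide sequence starting at the first codon position
    pos: 1-based position of the substitution within cds
    alt: the alternate base
    """
    cds, alt = cds.upper(), alt.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    if not 1 <= pos <= usable:
        raise ValueError("position falls outside the complete codons")

    idx = pos - 1
    offset = idx % 3                  # position of the base inside its codon
    start = idx - offset
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]

    if ref_aa == alt_aa:
        effect = "synonymous"
    elif alt_aa == "*":
        effect = "nonsense"
    else:
        effect = "missense"
    return {"codon": start // 3 + 1, "ref_aa": ref_aa, "alt_aa": alt_aa, "effect": effect}


toy_cds = "ATGAAAGTT"  # made-up sequence encoding M, K, V
print(annotate_snv(toy_cds, 6, "G"))  # AAA -> AAG
print(annotate_snv(toy_cds, 4, "G"))  # AAA -> GAA
```

This only labels the amino acid change. As noted earlier in the thread, such a label on its own is a weak guide to which variant actually drives a phenotype.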

Perhaps something like base-by-base is better for finding variants among NCBI nucleotide entries (as opposed to Illumina short-read data, for example):

http://athena.bioc.uvic.ca/virology-ca-tools/base-by-base/
