Question

Assign strand information to each SNP

2

9.6 years ago

tonja.r ▴ 600

I have a number of SNPs taken from a GWAS paper in which the authors list previously known and novel SNPs (index SNPs) with position, gene, MAF, risk allele and normal allele, and nothing more.

I am analyzing linkage disequilibrium and want to find a set of credible SNPs using the index SNPs mentioned in the paper. After I identify the set (which can also include index SNPs), I want to annotate them using VEP, but I do not have strand information for the index SNPs. Is there a command-line approach I can use to assign strand information to each index SNP based on position, MAF and alleles?

(I would rather not use ANNOVAR, as I also want to do a custom annotation with a BED file, and ANNOVAR's output is very limited and is missing the peak information.)

SNP • 3.0k views

ADD COMMENT • link updated 2.5 years ago by Ram 44k • written 9.6 years ago by tonja.r ▴ 600

Ram · Answer 1 · 2015-03-13

0

9.6 years ago

Vivek ★ 2.7k

If you have the BAM file used to call the SNPs, the read-backed phasing module in GATK is one way to go. You'll likely have to write additional scripts to generate haplotypes from the VCF produced by the module:

https://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_phasing_ReadBackedPhasing.php

ADD COMMENT • link updated 2.5 years ago by Ram 44k • written 9.6 years ago by Vivek ★ 2.7k

0

I guess I was not clear enough. The authors of the paper I took the SNPs from provided only the following information: SNP, gene, position, MAF, risk allele and normal allele. That is all I have.

ADD REPLY • link 9.6 years ago by tonja.r ▴ 600

0

I can't think of a way to get accurate strand information without the reads. You could probably make an educated guess to separate homozygous from heterozygous mutations using MAF, but that's about it.

ADD REPLY • link 9.6 years ago by Vivek ★ 2.7k

0

I assume that ANNOVAR works out from its database whether the SNP is on the sense or antisense strand, since it assigns the right amino acid changes for the corresponding variation. VEP, on the other hand, needs strand information, so I get different amino acid changes if I give the same SNP different strand values. But one of the VEP results will be consistent with the ANNOVAR result.

ADD REPLY • link 9.6 years ago by tonja.r ▴ 600

3

If I'm understanding correctly, this could be as simple as checking the reference base at the position and seeing whether your alleles are given with respect to the positive strand. You can look up the reference base with samtools; if your base is reverse complemented relative to the reference, assign the opposite strand. I think this is what ANNOVAR does.

samtools faidx reference.fa 'chr:position-position'
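
Once you have the reference base from the command above, the comparison itself could look roughly like the sketch below. The function name `infer_strand` and its return labels are made up for illustration, and it assumes biallelic SNPs with plain A/C/G/T alleles. Note that A/T and C/G SNPs are their own reverse complement, so the reference base alone cannot tell you the strand for those; the sketch reports them as ambiguous rather than guessing.

```python
# Hypothetical helper, not part of samtools, VEP or ANNOVAR.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def infer_strand(ref_base, allele_a, allele_b):
    """Guess the strand on which a biallelic SNP's alleles are reported.

    ref_base is the forward-strand reference base at the SNP position.
    Returns "+", "-", "ambiguous" (A/T or C/G SNP) or "unknown".
    """
    # samtools faidx may print lowercase bases for soft-masked regions.
    ref = ref_base.upper()
    alleles = {allele_a.upper(), allele_b.upper()}
    flipped = {COMPLEMENT[base] for base in alleles}

    # Palindromic SNPs look identical on both strands.
    if alleles == flipped:
        return "ambiguous"
    if ref in alleles:
        return "+"
    if ref in flipped:
        return "-"
    # Reference base is N or otherwise matches neither orientation.
    return "unknown"
```

For example, with reference base A, `infer_strand("A", "A", "G")` gives "+", while `infer_strand("A", "T", "C")` gives "-" because the alleles only match the reference after complementing.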

ADD REPLY • link updated 2.5 years ago by Ram 44k • written 9.6 years ago by Vivek ★ 2.7k