How to assign alleles between genotype builds? Understanding strand changes between builds
8.1 years ago

Hello,

I have extracted genome-wide association results from a consortium for a list of SNPs I'm interested in. The results are reported on the positive strand of build 36 (NCBI36/hg18). I have no problem updating rs numbers and positions using a combination of biomaRt and liftOver. However, I am unsure how to assign alleles. I know that biomaRt allele information is all on the positive strand, so this is easy for unambiguous SNPs (e.g. C/T, A/G). For ambiguous SNPs (i.e. C/G and A/T, where the two alleles are complements of each other) I can assign alleles by matching allele frequencies, but this becomes difficult for SNPs with minor allele frequencies close to 0.5. It is harder still because biomaRt gives the overall MAF across all populations, while the consortium MAFs are for Europeans only.
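To make the ambiguous/unambiguous distinction concrete, here is a minimal Python sketch. The function names `complement` and `is_strand_ambiguous` are made up for illustration and do not come from biomaRt or any other package. A SNP is strand-ambiguous when its two alleles are each other's complement, so flipping the strand yields the same allele pair.

```python
# Watson-Crick complements for the four bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(allele: str) -> str:
    """Return the complementary base (input may be upper or lower case)."""
    return COMPLEMENT[allele.upper()]


def is_strand_ambiguous(allele1: str, allele2: str) -> bool:
    """True for A/T and C/G SNPs, whose allele pair looks the same on both strands."""
    return complement(allele1) == allele2.upper()


if __name__ == "__main__":
    for a1, a2 in [("C", "T"), ("A", "G"), ("C", "G"), ("A", "T")]:
        label = "ambiguous" if is_strand_ambiguous(a1, a2) else "unambiguous"
        print(f"{a1}/{a2}: {label}")
```

Only the SNPs flagged as ambiguous need the allele-frequency check or explicit strand tracking; the rest can be resolved from the alleles alone.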

The ideal solution would be to know whether the strand has changed between builds for my list of SNPs (e.g. build 36 to 37 and/or build 37 to 38). Can such information be found? Am I missing an obvious solution to this problem?

SNP biomart Assembly orientation strand • 2.5k views

Basically, what I'm looking for is a method to track SNP orientation changes between builds. It seems like something that should be available, such as an orientation key relative to the previous build.

When converting coordinates, if the input SNP BED file has strand information (6th column), the output should also carry strand information, so any strand change can be detected. In short, using a 6-column BED file of input SNPs when lifting between assemblies should solve this problem. I use CrossMap; I made a dummy BED file and got the following output:

Chr1 1 2 a 10 + -> Chr1 1 2 a 10 +

Chr2 20 21 b 1000 + -> Chr2 20 21 b 1000 +

Chr3 22 23 c 1000 + -> Chr3 22 23 c 1000 +

I will install this and give it a try. Just to be clear, in my case I have the position but none of the other BED column information (i.e. column 4 'name', column 5 'score', column 6 'strand'). Your dummy example suggests that I can fill these columns with dummy values and still get the correct information for the destination build in the CrossMap output?

Yes, I would say so. As you mentioned, the reported SNPs are on the plus strand, so that information can go in the 6th column, with the 4th and 5th columns filled with dummy data. I think the web version of liftOver can also do this if given a 6-column BED file. I have not done this for human data, but I was successful in assigning SNPs across builds with CrossMap for a plant species. A good indication would be an entry in the conversion output where the strand is shown as negative.
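As a rough sketch of that setup, the Python snippet below builds 6-column BED lines from a chromosome and a 1-based SNP position. The function name `snp_to_bed6` and the example SNPs are hypothetical. It assumes the standard BED convention of a 0-based start and an exclusive end, so a SNP at 1-based position `pos` becomes `pos - 1` to `pos`. Putting the SNP identifier in the name column instead of a pure dummy value makes it easier to match records after conversion.

```python
def snp_to_bed6(chrom: str, pos: int, name: str = ".", score: int = 0,
                strand: str = "+") -> str:
    """Format one SNP as a tab-separated 6-column BED line.

    pos is the 1-based SNP position; BED start is 0-based, end is exclusive.
    score is a dummy value; strand defaults to '+' as in the source data.
    """
    if pos < 1:
        raise ValueError("pos must be a 1-based position (>= 1)")
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    return "\t".join([chrom, str(pos - 1), str(pos), name, str(score), strand])


if __name__ == "__main__":
    # Hypothetical SNPs, for illustration only.
    snps = [("chr1", 1000, "snp_a"), ("chr2", 25000, "snp_b")]
    for chrom, pos, name in snps:
        print(snp_to_bed6(chrom, pos, name))
```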

Thanks. I tried the web liftOver, and the 4th, 5th and 6th columns seem to have no effect; the output has 3 columns only. I will work with CrossMap instead. It seems to have many advantages anyway, but it will take me some time to get it up and running. Thanks for all your help, microfuge.

Sweet this worked just fine!

8.1 years ago

Following the comments by @microfuge, I set up a BED file for my SNPs with dummy values in columns 4 and 5 and the strand (+) in column 6, then used CrossMap to convert between builds as needed. CrossMap was able to track orientation changes between builds for the vast majority of my SNPs.
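For anyone repeating this, a small Python sketch of the follow-up step is shown here. It assumes the conversion output has the "input -> output" layout of the dummy example earlier in the thread, with six whitespace-separated fields on each side. The function names and the second example line (with a '-' strand) are hypothetical. A SNP whose strand differs between input and output has its alleles complemented so they are again on the plus strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def parse_mapping_line(line: str) -> dict:
    """Parse one 'src -> dst' line with six BED fields on each side."""
    left, right = line.split("->")
    src, dst = left.split(), right.split()
    if len(src) < 6 or len(dst) < 6:
        raise ValueError("expected six BED columns on both sides of '->'")
    return {
        "name": src[3],
        "dst_chrom": dst[0],
        "dst_end": int(dst[2]),           # 1-based SNP position in the new build
        "flipped": src[5] != dst[5],      # strand changed between builds
    }


def alleles_on_plus(allele1: str, allele2: str, flipped: bool) -> tuple:
    """Complement both alleles if the SNP changed strand; otherwise keep them."""
    if flipped:
        return COMPLEMENT[allele1.upper()], COMPLEMENT[allele2.upper()]
    return allele1.upper(), allele2.upper()


if __name__ == "__main__":
    lines = [
        "Chr1 1 2 a 10 + -> Chr1 1 2 a 10 +",       # from the thread
        "Chr2 20 21 b 1000 + -> Chr2 30 31 b 1000 -",  # hypothetical flip
    ]
    for line in lines:
        rec = parse_mapping_line(line)
        print(rec["name"], rec["flipped"], alleles_on_plus("C", "G", rec["flipped"]))
```

Note that for C/G and A/T SNPs the complemented pair contains the same two letters, so it is the order (which allele is the effect allele) that changes, which is why the flip has to be tracked explicitly.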
