Mapping SNP rsIds/ positions to genes coding for proteins
1
0
Entering edit mode
14 months ago
Harsh ▴ 10

I have started a new project and am a beginner in bioinformatics. I have a summary statistics file from a GWAS study. The file has information like the Chromosome number, position of the SNP. The rsids are not included. Now I want to map these SNPs to Genes that code for proteins. With the condition: SNP should 50kb upstream or downstream of the gene boundary, or inside the gene boundary.

Thank You

SNP rsid • 997 views
ADD COMMENT
1
Entering edit mode
14 months ago

Here's one way to do it:

To get SNPs, assuming you are working with hg38 human GWAS data:

wget -qO- https://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606_b151_GRCh38p7/VCF/All_20180418.vcf.gz \
    | gunzip -c \
    | vcf2bed --sort-tmpdir=/tmp --max-mem=2G - \
    | awk -v FS="\t" -v OFS="\t" '{ print "chr"$0 }' \
    > hg38.dbSNP155.bed

Take the difference of padded and unpadded gene regions to get the genomic space covering upstream and downstream of the gene boundaries, but not including the gene (if I am understanding your problem's condition correctly):

bedops --difference <(bedops --range 50000 --everything genes.bed | bedops --merge -) genes.bed > regions.bed

Then map the SNPs to the regions:

bedmap --echo --echo-map-id --delim '\t' regions.bed SNPs.bed > answer.bed

If your condition is that the SNP can be within the gene or 50kb upstream or downstream of the gene boundary, then you can skip a lot of the complication and just map directly:

bedmap --echo --echo-map-id --delim '\t' <(bedops --range 50000 --everything genes.bed) SNPs.bed > answer.bed

If you want to remove the 50kb of padding in the answer:

bedmap --echo --echo-map-id --delim '\t' <(bedops --range 50000 --everything genes.bed) SNPs.bed | bedops --range -50000 - > answer.bed
ADD COMMENT
0
Entering edit mode

Actually the condition is, SNP should be within the gene boundary or 50kb upstream or 50kb downstream of the gene boundary.

ADD REPLY
1
Entering edit mode

See edited answer.

ADD REPLY
0
Entering edit mode

thank you!

ADD REPLY

Login before adding your answer.

Traffic: 1665 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6