Obtain rsIDs for GWAS variants (9M)
10 months ago
spolo • 0

Hi everyone,

I have GWAS summary statistics for about 9M genetic variants and I want to obtain the rsID that corresponds to each of them. I have already downloaded the VCF file of the whole dbSNP database, but I do not know how to continue. I have read the previous answers, but they did not solve my problem, so I would be thankful for any help. I am working with this type of data:

CHR SNP BP A1
1 chr1:732994:G:A 732994 A
1 chr1:758443:G:C 758443 C
1 chr1:794707:T:C 794707 C
dbSNP GWAS rsID • 449 views
10 months ago
bk11 ★ 3.0k

You can build a lookup file from the chromosome, position, and rsID columns of the dbSNP VCF, then merge it with your GWAS summary statistics file on a shared chr:pos key.

head dSNP_database.txt
SNP RSID
chr10:10000000 rs1223870629
chr10:100000000 rs925217917
chr10:100000003 rs537453558
chr10:100000004 rs1382475200
chr10:100000005 rs530933119
chr10:10000001 rs369318156
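
One way the lookup file above might be produced from the dbSNP VCF is sketched below. The function name `vcf_to_lookup` and the input file name in the usage line are hypothetical, and the sketch assumes the VCF's CHROM column uses "1" or "chr1" style names; a VCF with RefSeq-style contig names would need them renamed first.

```shell
# Hypothetical helper: turn a dbSNP VCF stream (stdin) into a "SNP RSID" table.
# Assumption: CHROM looks like "1" or "chr1"; other naming schemes must be
# converted before this step.
vcf_to_lookup() {
  awk 'BEGIN { print "SNP", "RSID" }          # header expected by the merge step
       /^#/  { next }                         # skip VCF meta and header lines
       { chrom = $1
         if (chrom !~ /^chr/) chrom = "chr" chrom   # normalise to "chr" prefix
         print chrom ":" $2, $3 }'            # key = chr:pos, value = ID column
}

# Usage (input file name is a placeholder):
# zcat dbsnp.vcf.gz | vcf_to_lookup > dSNP_database.txt
```

Note that a chr:pos key ignores alleles, so a position with several dbSNP records yields several lookup rows, and the later merge will repeat the matching sumstats row for each.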

head your_gwas_sumstats.txt
CHR SNP BP A1
1 chr1:732994:G:A 732994 A
1 chr1:758443:G:C 758443 C
1 chr1:794707:T:C 794707 C

awk -F '[: ]' 'NR == 1 {print "CHR SNP A2 A1 BP A1"; next} {print $1, $2":"$3, $4, $5, $6, $7}' your_gwas_sumstats.txt > your_gwas_sumstats_to_merge.txt
head your_gwas_sumstats_to_merge.txt
CHR SNP A2 A1 BP A1
1 chr1:732994 G A 732994 A
1 chr1:758443 G C 758443 C
1 chr1:794707 T C 794707 C
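
Since the full dbSNP lookup is likely much larger than the 9M GWAS variants, it may help to shrink it to the positions present in the sumstats before loading anything into R. The following is a minimal sketch of that idea; the function name `filter_lookup` and the output file name are hypothetical, and it assumes both files are whitespace-delimited with a header line.

```shell
# Hypothetical helper: keep only lookup rows whose SNP key occurs in the sumstats.
# $1 = reformatted sumstats (chr:pos key in column 2)
# $2 = lookup table        (chr:pos key in column 1)
filter_lookup() {
  awk 'NR == FNR { if (FNR > 1) keep[$2] = 1; next }   # 1st file: collect keys
       FNR == 1 || ($1 in keep)' "$1" "$2"             # 2nd file: header + matches
}

# Usage (output file name is a placeholder):
# filter_lookup your_gwas_sumstats_to_merge.txt dSNP_database.txt > dSNP_database.filtered.txt
```

Only the sumstats keys are held in memory, while the large lookup is streamed line by line.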

You can then merge the two files on SNP in R. With all.x=TRUE every sumstats row is kept and variants without a dbSNP match get NA for RSID; drop all.x=TRUE to keep only the SNPs common to both files:

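dbsnp <- read.table("dSNP_database.txt", header=TRUE, stringsAsFactors=FALSE)
# The duplicated A1 header is read as A1 and A1.1
sumstats <- read.table("your_gwas_sumstats_to_merge.txt", header=TRUE, stringsAsFactors=FALSE)
# Left join: keep every sumstats row; RSID is NA where dbSNP has no match
merged_data <- merge(sumstats, dbsnp, by="SNP", all.x=TRUE)
write.table(merged_data, file="your_gwas_sumstats_with_rsid.txt", sep="\t", quote=FALSE, row.names=FALSE)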