How can I convert positions to SNP IDs?
6.6 years ago
Jenny Han ▴ 10

I have chromosome numbers and SNP positions (about a million of them).

How can I convert this information to SNP IDs?

SNP translation gwas • 16k views
6.6 years ago

Download the SNPs and convert them to a sorted BED file. Then convert your positions to sorted BED as well, and use a BEDOPS bedmap operation to map SNP IDs to the positions they overlap.

For example, here is a way to download dbSNP v150 for hg19 and convert it to BED with BEDOPS vcf2bed:

$ wget -qO- ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606_b150_GRCh37p13/VCF/All_20170710.vcf.gz | gunzip -c - | vcf2bed --sort-tmpdir=${PWD} --max-mem=2G - > hg19.dbSNP150.bed

You'd modify this for your reference genome, if you're not working with hg19.

Then convert your positions to a sorted BED file, using awk and BEDOPS sort-bed:

$ awk -vOFS="\t" '{ print "chr"$1, ($2 - 1), $2; }' positions.txt | sort-bed - > positions.bed

This assumes that the chromosome names in positions.txt are bare numbers (i.e., Ensembl format, and not UCSC format), so the command adds a chr prefix. The goal is for the chromosome names in positions.bed to match the chromosome names in hg19.dbSNP150.bed exactly, so check which style the dbSNP file actually uses (for example, with head hg19.dbSNP150.bed) and add or drop the prefix in the awk command to suit.
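As a minimal sketch of that check, the snippet below looks at the first line of the reference BED and applies the same naming style to the positions. The file names ending in .sample are hypothetical stand-ins for positions.txt and hg19.dbSNP150.bed, and the two-record input is only there to make the example self-contained.

```shell
# Hypothetical miniature stand-ins for positions.txt and the dbSNP BED file
printf '1\t17571\n1\t17594\n' > positions.sample.txt
printf '1\t17570\t17571\trs557947346\n' > ref.sample.bed

# Detect whether the reference BED uses a "chr" prefix on its first record
if head -n 1 ref.sample.bed | grep -q '^chr'; then prefix="chr"; else prefix=""; fi

# Strip any prefix already present in the input, then apply the reference style
# and convert each 1-based position to a 0-based, half-open BED interval
awk -v OFS='\t' -v p="$prefix" '{ sub(/^chr/, "", $1); print p $1, $2 - 1, $2 }' \
  positions.sample.txt > positions.sample.bed
```

On real data you would pipe the awk output through sort-bed, as in the command above, before handing it to bedmap.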

Finally, you can map positions to SNP IDs:

$ bedmap --echo --echo-map-id --delim '\t' positions.bed hg19.dbSNP150.bed > answer.bed

The file answer.bed will have the positions in the first three columns, and the SNP rs-ID in the fourth and last column.
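When several SNP records overlap one position, bedmap joins their IDs in that fourth column, and a position with no overlapping SNP gets an empty fourth column. The sketch below expands such a file to one row per ID, assuming bedmap's default multi-value delimiter of a semicolon. The input file and the IDs rsA and rsB are made-up placeholders, not real dbSNP records.

```shell
# Hypothetical answer file: one position with two IDs, one position with none
printf 'chr1\t99\t100\trsA;rsB\nchr1\t199\t200\t\n' > answer.sample.bed

# Emit one row per rs-ID; mark unmatched positions with NA
awk -F'\t' -v OFS='\t' '{
  if ($4 == "") { print $1, $2, $3, "NA"; next }
  n = split($4, ids, ";")
  for (i = 1; i <= n; i++) print $1, $2, $3, ids[i]
}' answer.sample.bed > answer.long.sample.bed
```

The long format makes it easy to count how many positions are ambiguous or unmatched before using the IDs downstream.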


Dear Alex Reynolds,

Thanks for such an efficient method! I still have one concern, though. This method matches only on chr:start:end, so multiple rsIDs can be assigned to the same variant. A more accurate match would also use the ref:alt information. Is there a way to take ref:alt into account as well?

6.6 years ago

In addition to describing your data, you may want to post example data to get better suggestions.

$ bedtools intersect  -a test.txt -b dbsnp_mini.vcf -wa -wb

example records:

$ cat test.txt 
chrom   from    to
1   17571   17571
1   17594   17594

output:

1   17571   17571   1   17571   rs557947346 C   T   .   .   RS=557947346;RSPOS=17571;dbSNPBuildID=142;SSR=0;SAO=0;VP=0x0500000a0005000000000100;WGT=1;VC=SNV;INT;R5;ASP
1   17594   17594   1   17594   rs377698370 C   T   .   .   RS=377698370;RSPOS=17594;dbSNPBuildID=138;SSR=0;SAO=0;VP=0x0500000a0005000002000100;WGT=1;VC=SNV;INT;R5;ASP;OTHERKG
1   17614   17614   1   17614   rs201057270 G   A   .   .   RS=201057270;RSPOS=17614;dbSNPBuildID=137;SSR=0;SAO=0;VP=0x050000020005000002000100;WGT=1;VC=SNV;R5;ASP;OTHERKG
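To reduce output like this to just the position, rs-ID and alleles, the relevant tab-separated columns can be cut out. The sketch below assumes the layout shown above (three columns from test.txt followed by the VCF columns) and uses a hypothetical sample file holding the first record, with its INFO field abbreviated.

```shell
# Hypothetical one-line sample of the intersect output, INFO field abbreviated
printf '1\t17571\t17571\t1\t17571\trs557947346\tC\tT\t.\t.\tVC=SNV\n' > intersect.sample.txt

# Keep chromosome, position, rs-ID, REF and ALT (columns 1, 2, 6, 7, 8)
cut -f1,2,6-8 intersect.sample.txt > snp_ids.sample.txt
```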
6.6 years ago
Emily 24k

VEP


Sorry @Emily_Ensembl, I was trying to annotate somatic copy number variation in VCF format with VEP, but I got this error.

Could you please help me with that?

http://grch37.ensembl.org/Multi/Tools/VEP/Ticket?tl=DqLvWXeQg18fDsnn


What does the error message say?


Thank you

Error:

-------------------- EXCEPTION --------------------
MSG: 
ERROR: Forked process(es) died: read-through of cross-process communication detected

STACK Bio::EnsEMBL::VEP::Runner::_forked_buffer_to_output /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:554
STACK Bio::EnsEMBL::VEP::Runner::next_output_line /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:361
STACK Bio::EnsEMBL::VEP::Runner::run /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-vep/modules/Bio/EnsEMBL/VEP/Runner.pm:202
STACK EnsEMBL::Web::RunnableDB::VEP::run /nfs/public/release/ensweb/latest/live/grch37/www_95/public-plugins/tools_hive/modules/EnsEMBL/Web/RunnableDB/VEP.pm:87
STACK (eval) /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-hive//modules/Bio/EnsEMBL/Hive/Process.pm:140
STACK Bio::EnsEMBL::Hive::Process::life_cycle /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-hive//modules/Bio/EnsEMBL/Hive/Process.pm:127
STACK (eval) /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-hive//modules/Bio/EnsEMBL/Hive/Worker.pm:681
STACK Bio::EnsEMBL::Hive::Worker::run_one_batch /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-hive//modules/Bio/EnsEMBL/Hive/Worker.pm:652
STACK Bio::EnsEMBL::Hive::Worker::run /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-hive//modules/Bio/EnsEMBL/Hive/Worker.pm:500
STACK main::main /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-hive//scripts/runWorker.pl:141
STACK toplevel /nfs/public/release/ensweb/latest/live/grch37/www_95/ensembl-hive//scripts/runWorker.pl:22
Date (localtime)    = Thu Mar 28 15:54:32 2019
Ensembl API version = 95

What does it say above that?


It says my input file is invalid?


What does it instruct you to do?


If the input file is not valid, I am not sure what the solution is, because this .vcf file comes directly from the ASCAT somatic copy number caller.
