How to get a unique target gene for each SNP from different chromosomes/genes in a list at once?
7.4 years ago

Hello, folks!

I have a huge list of SNPs from different chromosomes/genes (human data set), and I want to find the target gene for each of them in one go, if possible. It is important to get only one gene ID per SNP (the top-rated one).


I have tried the Ensembl BioMart tool with the following parameters:

Data set: Human Short Variants (SNPs and indels excluding flagged variants) (GRCh38.p10)

Filters:

  • Chromosome/scaffold: Chromosomes that I know to have those SNPs
  • Filter by Variant name (e.g. rs123, CM000001) [Max 500 advised]: SNP list (rs*)

Attributes:

  • Chromosome/scaffold name
  • Variant name
  • Gene stable ID

However, I got an even bigger list, with repeated SNPs and Ensembl IDs for the target genes, when I expected one row per SNP with the usual gene symbol. For example, I would expect "TP53" for "rs1042522", "LAMB3" for "rs80356682", and so on.

Any help/tip would be greatly appreciated!
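
For anyone hitting the same duplicated rows, here is a minimal sketch of collapsing such an export to one row per SNP. It assumes the results were saved as a tab-separated file with a header and the three attributes in the order listed above (chromosome, variant name, gene stable ID). The function name `one_gene_per_snp` and the file names are made up for illustration. Note that "first row seen" is an arbitrary choice, not a ranking; if you have a way to rank genes, sort the file by it first.

```shell
# Sketch only: keep the first gene row seen for each variant.
# Assumed columns (tab-separated): 1 = chromosome, 2 = variant name, 3 = gene stable ID.
one_gene_per_snp() {
    awk -F'\t' '
        NR == 1   { print; next }   # pass the header row through unchanged
        $3 == ""  { next }          # skip rows where no gene was reported
        !seen[$2]++                 # print a row only the first time its variant appears
    '
}

# Hypothetical usage:
#   one_gene_per_snp < mart_export.txt > one_gene_per_snp.tsv
```

This only removes the repetition. The remaining gene column still holds Ensembl gene IDs, so getting symbols such as TP53 needs a separate ID-to-symbol lookup.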


bedtools intersect?

7.4 years ago

Try ANNOVAR (table_annovar.pl): http://annovar.openbioinformatics.org/en/latest/user-guide/startup/

7.4 years ago

You could use BEDOPS bedmap to map positions from vcf2bed to gene names from gff2bed:

$ bedmap --echo --echo-map-id-uniq <(vcf2bed < snps.vcf) <(gff2bed < genes.gff) > answer.bed

This gives you SNP positions and their metadata, along with a list of overlapping gene IDs.
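
If you need exactly one gene per SNP, the output above can be post-processed. The following is a rough sketch, assuming bedmap's default `|` delimiter between the SNP and the mapped IDs, `;` between multiple IDs, and the variant ID (e.g. the rsID) in column 4 of the vcf2bed output. The function name `first_overlapping_gene` is made up for illustration, and taking the first listed ID is an arbitrary pick rather than a "top rated" one.

```shell
# Sketch only: reduce bedmap output to "variant ID <TAB> one gene ID".
first_overlapping_gene() {
    awk -F'|' '
        {
            split($1, bed, "\t")          # left of the first "|": the SNP in BED form
            n = split($NF, genes, ";")    # right of the last "|": overlapping IDs, if any
            if (n > 0 && genes[1] != "")
                print bed[4] "\t" genes[1]   # first listed ID (arbitrary choice)
            else
                print bed[4] "\tNA"          # SNP overlaps no annotated feature
        }
    '
}

# Hypothetical usage:
#   first_overlapping_gene < answer.bed > snp_to_gene.tsv
```

The IDs reported are whatever the annotation file carries in its ID field, and a GFF usually contains transcripts and exons as well as genes, so you may want to filter genes.gff down to gene records before mapping.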
