Hello, folks!
I have a large list of SNPs from different chromosomes/genes (human data set), and I want to find the target gene for each of them in one go (if possible). It is important to get only one gene ID per SNP (the top-ranked one).
I have tried the Ensembl BioMart tool with the following parameters:
Data set: Human Short Variants (SNPs and indels excluding flagged variants) (GRCh38.p10)
Filters:
- Chromosome/scaffold: Chromosomes that I know to have those SNPs
- Filter by Variant name (e.g. rs123, CM000001) [Max 500 advised]: SNP list (rs*)
Attributes:
- Chromosome/scaffold name
- Variant name
- Gene stable ID
However, the result was an even bigger list, with repeated SNPs and Ensembl gene IDs for the target genes, whereas I expected a single, standard gene symbol per SNP. For example, I would expect "TP53" for "rs1042522", "LAMB3" for "rs80356682", and so on.
Any help/tip would be greatly appreciated!
Have you tried bedtools intersect?
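If you would rather stay with the BioMart export, the repeated rows can be collapsed afterwards. Below is a minimal Python sketch, assuming you add a gene symbol and a consequence attribute to the export and save it as TSV. The column headers ("Variant name", "Gene name", "Variant consequence"), the `SEVERITY` ranking, and the function name `one_gene_per_snp` are my assumptions, not something BioMart guarantees, so adjust them to match your file. The rows in `EXAMPLE` are toy data: the extra `GENE_X` row and the consequence values are invented for illustration.

```python
import csv
import io

# Hypothetical ranking (lower = more severe). This is an assumption for the
# sketch; replace it with whatever "top rated" should mean for your analysis.
SEVERITY = {
    "stop_gained": 0,
    "missense_variant": 1,
    "synonymous_variant": 2,
    "intron_variant": 3,
    "upstream_gene_variant": 4,
    "downstream_gene_variant": 5,
}


def one_gene_per_snp(tsv_text):
    """Return {rsID: gene symbol}, keeping the most severe row per SNP."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    best = {}  # rsID -> (rank, gene symbol)
    for row in reader:
        snp = row["Variant name"]
        gene = row["Gene name"]
        if not gene:
            continue  # skip rows with no gene annotated
        # Unknown consequence terms are ranked after all known ones.
        rank = SEVERITY.get(row["Variant consequence"], len(SEVERITY))
        # Strict "<" means the first row wins when two rows tie.
        if snp not in best or rank < best[snp][0]:
            best[snp] = (rank, gene)
    return {snp: gene for snp, (_rank, gene) in best.items()}


# Toy input: GENE_X and the consequence values are made up for illustration.
EXAMPLE = (
    "Variant name\tGene name\tVariant consequence\n"
    "rs1042522\tGENE_X\tupstream_gene_variant\n"
    "rs1042522\tTP53\tmissense_variant\n"
    "rs1042522\tTP53\tintron_variant\n"
    "rs80356682\tLAMB3\tstop_gained\n"
    "rs80356682\t\tintron_variant\n"
)

if __name__ == "__main__":
    for snp, gene in one_gene_per_snp(EXAMPLE).items():
        print(snp, gene)
```

Ranking by consequence is only one possible definition of "top rated"; if you just need any single gene per SNP, dropping the `SEVERITY` lookup and keeping the first row with a gene symbol would work as well.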