8.5 years ago
avari
Hello,
I want to find the start and stop positions of a particular intron for a human gene and then extract the SNPs in my dataset that fall within this window using Plink. Where would be a good place to find this information?
Thanks!
I would do the following:
Run the bedtools intersect function on these two BED files to find the SNPs falling in intronic regions.
Thanks for the instructions, that seems useful but a little complicated! I'm afraid I don't have a bioinformatics background, so I am unfamiliar with some of these tools. I take it there is no quick and dirty way of searching for this in some database? I'm just looking for the start/stop base pair positions.
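For a single window, the overlap step reduces to a range check on base-pair position. The sketch below is only an illustration of that idea, not the workflow proposed above: it assumes the SNPs are listed in a Plink .bim file (columns: chromosome, variant ID, genetic distance, base-pair position, allele 1, allele 2), and the function name, variant IDs and coordinates are all made up. Plink's own range filters (--chr together with --from-bp and --to-bp) may well be the simpler route; check the documentation for your version.

```python
def snps_in_window(bim_lines, chrom, start_bp, end_bp):
    """Return IDs of variants on `chrom` with start_bp <= position <= end_bp.

    `bim_lines` is an iterable of lines in Plink .bim layout:
    chromosome, variant ID, genetic distance, bp position, allele 1, allele 2.
    """
    hits = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        snp_chrom, snp_id, _cm, bp = fields[:4]
        if snp_chrom == str(chrom) and start_bp <= int(bp) <= end_bp:
            hits.append(snp_id)
    return hits


# Made-up records for illustration only, not real variants.
example_bim = [
    "15\tsnp_a\t0\t1000\tA\tG",
    "15\tsnp_b\t0\t2233\tC\tT",
    "15\tsnp_c\t0\t5000\tG\tA",
    "16\tsnp_d\t0\t2233\tT\tC",
]

print(snps_in_window(example_bim, 15, 2000, 3000))
```

With a real dataset, the lines would come from opening the .bim file, and the window would be the intron's start and stop positions on the same genome build as the genotype data.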
My suggestion would be to get familiar with these tools and the UNIX environment. I can't say it's impossible to do these tasks without learning them (though I lean towards "not possible"), but it would be much easier if you were familiar with them.
Also take a look at this link: "How do I get the position of a SNP when it is located in an intron?" It may help, but I am not sure whether it's a perfect solution.
If you have the SNP coordinates and the alleles, e.g. 15 2233 2233 A/G (ref/alt), you can use the Variant Effect Predictor to find out where your SNPs map (intronic regions or not) and what their effects are on proteins, transcripts, regulatory elements, etc. If your species is not in Ensembl, you could build a cache from a GTF or GFF file.
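As a rough sketch of preparing such input, the snippet below formats single-nucleotide variants as whitespace-separated lines of chromosome, start, end, ref/alt alleles and strand. The trailing strand column follows my understanding of the VEP default input format and is an assumption to verify against the VEP documentation; the function name and the example variants are hypothetical.

```python
def to_vep_line(chrom, pos, ref, alt, strand="+"):
    """Format one SNV as: chromosome start end ref/alt strand.

    For a single-nucleotide variant, start and end are the same position.
    """
    return f"{chrom} {pos} {pos} {ref}/{alt} {strand}"


# Made-up variants for illustration only.
variants = [(15, 2233, "A", "G"), (15, 2301, "C", "T")]
vep_lines = [to_vep_line(*v) for v in variants]
print("\n".join(vep_lines))
```

Note that the two allele columns in a Plink .bim file are not guaranteed to be in ref/alt order, so the alleles may need checking against the reference genome before use.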
Thanks, that might work actually as I can extract all the SNPs for the gene in Plink first and then feed them in. Will give it a try. I am specifically interested in the third intron, so hopefully the intron number is specified.
You can do this without programming using the UCSC Genome Browser. Navigate to the region of interest (your intron). Then go to Tools > Table Browser. Under "region", select "position" to limit the query to that region. Under "group" and "track", select the appropriate SNP track.
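If the intron's coordinates are not yet known, they can be worked out from the exon coordinates of the transcript, for example the exonStarts and exonEnds columns of a UCSC gene table. The sketch below assumes UCSC-style coordinates (0-based starts, end-exclusive ends, exons sorted by genomic position) and numbers introns in transcript order, so the numbering is reversed on the minus strand. The function name and the exon coordinates are invented for illustration.

```python
def introns_from_exons(exon_starts, exon_ends, strand):
    """Return (intron_number, start, end) tuples in transcript order.

    Input exons use 0-based starts and end-exclusive ends, sorted by
    genomic position. Output coordinates are 1-based and inclusive.
    """
    introns = []
    for i in range(len(exon_starts) - 1):
        # An intron spans the gap between one exon's end and the next exon's start.
        introns.append((exon_ends[i] + 1, exon_starts[i + 1]))
    if strand == "-":
        introns.reverse()  # on the minus strand, intron 1 has the highest coordinates
    return [(n, s, e) for n, (s, e) in enumerate(introns, start=1)]


# Made-up exon coordinates for a four-exon transcript.
exon_starts = [100, 300, 600, 900]
exon_ends = [200, 400, 700, 1000]

print(introns_from_exons(exon_starts, exon_ends, "+")[2])
print(introns_from_exons(exon_starts, exon_ends, "-")[2])
```

The resulting start and end can be typed into the browser's position box or used as the window when filtering SNPs, provided the genome build matches the one used for the SNP positions.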
Thanks, that sounds useful too!