7.8 years ago
mms140130
I have the following data set of SNP IDs:
CHROM POS ID
chr7 78599583 rs987435
chr15 33395779 rs345783
chr1 189807684 rs955894
chr20 33907909 rs6088791
chr12 75664046 rs11180435
chr1 218890658 rs17571465
chr4 127630276 rs17011450
chr6 90919465 rs6919430
and a gene reference file
genename name chrom strand txstart txend
CDK1 NM_001786 chr10 + 62208217 62224616
CALB2 NM_001740 chr16 + 69950116 69981843
STK38 NM_007271 chr6 - 36569637 36623271
YWHAE NM_006761 chr17 - 1194583 1250306
SYT1 NM_005639 chr12 + 77782579 78369919
ARHGAP22 NM_001347736 chr10 - 49452323 49534316
PRMT2 NM_001535 chr21 + 46879934 46909464
CELSR3 NM_001407 chr3 - 48648899 48675352
I'm trying to match the genes to the SNP locations, i.e. include the SNPs that have
position >= txstart and position <= txend.
For example, I want a data set with the following columns:
genename SNPID chrom position txstart txend
So how can I use GRanges for this? Do you have code you can share?
The text in blue (GRanges) is a hyperlink; if you click on it, voila! It takes you to the magic code repository!
If you follow the link and look at the PDF documents ('GenomicRanges HOWTOs') halfway down the page, all will be revealed! This does require some knowledge of R, but a little time spent now is, in my honest opinion, well worth it.
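If it helps to see the logic before reading the HOWTOs, here is a minimal sketch of the overlap test in plain Python. It is not GRanges and not the code from the linked documents, just the same idea written out by hand: group the genes by chromosome, then keep every SNP whose position lies between txstart and txend, inclusive on both ends as in the question. The function names `read_table` and `snps_in_genes` are made up for illustration, and the tables are the excerpts from the post. In GenomicRanges the corresponding step is done with `findOverlaps()` on two GRanges objects.

```python
# Tables copied from the question (whitespace-separated, header on first line).
SNPS = """CHROM POS ID
chr7 78599583 rs987435
chr15 33395779 rs345783
chr1 189807684 rs955894
chr20 33907909 rs6088791
chr12 75664046 rs11180435
chr1 218890658 rs17571465
chr4 127630276 rs17011450
chr6 90919465 rs6919430
"""

GENES = """genename name chrom strand txstart txend
CDK1 NM_001786 chr10 + 62208217 62224616
CALB2 NM_001740 chr16 + 69950116 69981843
STK38 NM_007271 chr6 - 36569637 36623271
YWHAE NM_006761 chr17 - 1194583 1250306
SYT1 NM_005639 chr12 + 77782579 78369919
ARHGAP22 NM_001347736 chr10 - 49452323 49534316
PRMT2 NM_001535 chr21 + 46879934 46909464
CELSR3 NM_001407 chr3 - 48648899 48675352
"""


def read_table(text):
    """Parse a whitespace-separated table into a list of dicts keyed by header."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def snps_in_genes(snps, genes):
    """Return (genename, SNPID, chrom, position, txstart, txend) for every
    SNP with txstart <= position <= txend on the same chromosome."""
    # Index genes by chromosome so each SNP is only compared with
    # genes that could possibly contain it.
    genes_by_chrom = {}
    for gene in genes:
        genes_by_chrom.setdefault(gene["chrom"], []).append(gene)

    rows = []
    for snp in snps:
        pos = int(snp["POS"])
        for gene in genes_by_chrom.get(snp["CHROM"], []):
            start, end = int(gene["txstart"]), int(gene["txend"])
            if start <= pos <= end:
                rows.append((gene["genename"], snp["ID"], snp["CHROM"],
                             pos, start, end))
    return rows


result = snps_in_genes(read_table(SNPS), read_table(GENES))
print("genename SNPID chrom position txstart txend")
for row in result:
    print(*row)
```

Note that with only the eight SNPs and eight genes shown in the post, none of the SNPs falls inside a listed gene, so the script prints just the header line; on the full files the same loop would give one row per SNP/gene match.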