Hello everyone. I have a list of regions mapped to GRCh38 and I want to find the names of the genes that map to them.
First I tried the annotatePeak function from the ChIPseeker package, which returns the Entrez ID of the nearest gene as well as an annotation (i.e. Promoter, Distal, Exon, ...). Then I converted the gene IDs to gene names using the getSYMBOL function from the annotate package.
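library(plyranges)  # as_granges
library(ChIPseeker)  # annotatePeak
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(annotate)  # getSYMBOL
library(org.Hs.eg.db)
G.ranges <- as_granges(ranges, seqnames = seqnames, start = start, end = end)
txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene
annotated <- annotatePeak(G.ranges, TxDb = txdb, level = "gene", addFlankGeneInfo = TRUE)
a1 <- as.data.frame(annotated@anno)
a1$symbol <- getSYMBOL(annotated@anno$geneId, data = "org.Hs.eg.db")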
I also tried this approach which uses UCSC refGene.
The problem is that for some regions these two methods give me different gene names.
For example, for chr2:112541661-112542162 the first approach returns POLR1B, whereas the second method using UCSC refGene returns LOC105373562.
I was wondering whether there is a problem with my code using the annotatePeak function?
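
As a cross-check, the example region could also be overlapped directly with the gene ranges from the same TxDb, to see which genes it actually falls inside rather than which gene is nearest. This is only a minimal sketch: the object names region, gene_ranges and hits are made up for illustration, and it assumes the GenomicRanges, TxDb.Hsapiens.UCSC.hg38.knownGene and org.Hs.eg.db packages are installed.

```r
library(GenomicRanges)
library(GenomicFeatures)
library(TxDb.Hsapiens.UCSC.hg38.knownGene)
library(org.Hs.eg.db)

txdb <- TxDb.Hsapiens.UCSC.hg38.knownGene

# The example region from the question (hypothetical object name)
region <- GRanges("chr2", IRanges(start = 112541661, end = 112542162))

# Gene-level ranges from the same TxDb that annotatePeak was given
gene_ranges <- genes(txdb)

# Genes overlapping the region, regardless of strand
hits <- subsetByOverlaps(gene_ranges, region, ignore.strand = TRUE)

# Translate Entrez IDs to gene symbols when anything overlaps
if (length(hits) > 0) {
  hits$symbol <- mapIds(org.Hs.eg.db,
                        keys = hits$gene_id,
                        column = "SYMBOL",
                        keytype = "ENTREZID")
}
hits
```

If hits comes back empty, the region does not overlap any gene in the knownGene track, and the name reported by annotatePeak would be the nearest gene rather than an overlapping one.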