best annotation approach for peaks
18 months ago
Chironex ▴ 50

Hi, I have a question for you all. I have some peaks that I would like to annotate. My goal is to extend the annotation window to 10,000 bp upstream of the gene start and 10,000 bp downstream of the end of the gene body, so I don't want to simply set -10,000/+10,000 around the TSS. ChIPseeker does not seem to let me choose whether the window is defined relative to the TSS or to the gene body start/end:

annotatePeak {ChIPseeker}   

Usage
annotatePeak(
  peak,
  tssRegion = c(-3000, 3000),
  TxDb = NULL,
  level = "transcript",
  assignGenomicAnnotation = TRUE,
  genomicAnnotationPriority = c("Promoter", "5UTR", "3UTR", "Exon", "Intron",
    "Downstream", "Intergenic"),
  annoDb = NULL,
  addFlankGeneInfo = FALSE,
  flankDistance = 5000,
  sameStrand = FALSE,
  ignoreOverlap = FALSE,
  ignoreUpstream = FALSE,
  ignoreDownstream = FALSE,
  overlap = "TSS",
  verbose = TRUE
)

On the other hand, maybe I could use ChIPpeakAnno, which has this option:

annoPeaks {ChIPpeakAnno}    

Description
Annotate peaks by annoGR object in the given range.

Usage
annoPeaks(
  peaks,
  annoData,
  bindingType = c("nearestBiDirectionalPromoters", "startSite", "endSite", "fullRange"),
  bindingRegion = c(-5000, 5000),
  ignore.peak.strand = TRUE,
  select = c("all", "bestOne"),
  ...
)

by setting:

bindingType = "fullRange",
bindingRegion = c(-10000, 10000)

I've tried both:

1) ChIPseeker:

chipseeker <- annotatePeak(as(DARS_up_gr, "GRanges"),
                           tssRegion = c(-10000, 10000),
                           TxDb = TxDB.mm10.ensembl, level = "gene",
                           annoDb = "org.Mm.eg.db",
                           overlap = "all")

2) ChIPpeakAnno:

chippeakanno <- ChIPpeakAnno::annoPeaks(DARS_up_gr, anno_txdb,
                                        bindingType = "fullRange",
                                        bindingRegion = c(-10000, 10000),
                                        select = "all")

What do you suggest? What would be the best way to do this?

chipseeker R chippeakanno • 2.5k views
18 months ago
rfran010 ★ 1.3k

What do you mean exactly? Do you want to assign a peak to a gene if it lies within 10,000 bp of that gene? That seems problematic, since a single peak could then be assigned to many genes. ChIPseeker annotates each peak with the closest gene. You could take these annotations and then keep only the closest genes that lie within 10 kb. Defining the TSS region as -/+ 10 kb could be misleading.
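A minimal sketch of the "closest gene within 10 kb" filter, written with GenomicRanges directly rather than with ChIPseeker's output. The coordinates and gene IDs are made-up toy data, and the distance is measured to the gene body rather than to the TSS, which is an assumption about what is wanted here:

```r
library(GenomicRanges)

# Toy data: hypothetical coordinates and IDs, for illustration only
genes <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(20000, 60000), end = c(30000, 70000)),
  strand   = c("+", "-"),
  gene_id  = c("geneA", "geneB")
)
peaks <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(12000, 45000, 75000), width = 500)
)

# Nearest gene body for each peak (a distance of 0 means the peak overlaps the gene)
hits <- distanceToNearest(peaks, genes, ignore.strand = TRUE)

# Keep only peak-gene pairs separated by at most 10 kb
hits <- hits[mcols(hits)$distance <= 10000]

# Attach the gene ID and distance to the retained peaks
annotated <- peaks[queryHits(hits)]
annotated$gene_id  <- genes$gene_id[subjectHits(hits)]
annotated$distance <- mcols(hits)$distance
annotated
```

With real data, `genes` could come from `genes(txdb)` on your TxDb object. Peaks whose nearest gene is further than 10 kb away are simply dropped from `annotated`.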

I also think that in ChIPseeker you could set flankDistance to 10000 and addFlankGeneInfo to TRUE, which will list the genes within 10 kb of the peak.
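As a cross-check that does not depend on ChIPseeker, listing every gene within 10 kb of a peak can be sketched with `findOverlaps()` and its `maxgap` argument, which behaves like extending each gene body by 10 kb on both sides. The data below are hypothetical toy ranges, and this is not necessarily how ChIPseeker computes its flanking genes internally:

```r
library(GenomicRanges)

# Toy data: hypothetical coordinates and IDs, for illustration only
genes <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(20000, 25000, 60000),
                     end   = c(30000, 40000, 70000)),
  strand   = c("+", "-", "-"),
  gene_id  = c("geneA", "geneB", "geneC")
)
peaks <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(12000, 50000), width = 500)
)

# A peak "hits" a gene if it overlaps the gene body or lies within 10 kb of it.
# The window is symmetric, so strand is ignored.
hits <- findOverlaps(peaks, genes, maxgap = 10000, ignore.strand = TRUE)

# One entry per peak (named by peak index), holding all genes within 10 kb
genes_per_peak <- split(genes$gene_id[subjectHits(hits)], queryHits(hits))
genes_per_peak
```

Here the second peak sits between two genes and is assigned to both, which illustrates why a peak can end up with several genes under this definition.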

I've only used ChIPseeker's annotatePeak, not ChIPpeakAnno.


Yes, I think yours is the best approach, and it is the one I adopted for this kind of analysis. Thank you.


Hi, I don't understand why the number of annotated regions is the same whether I set the TSS region to -1000/+1000, -3000/+3000 or -10,000/+10,000. Moreover, a lot of genes have only a gene ID; I tried converting them but could not get the gene names. Any suggestions? P.S. I tried to add the flank gene info, but I get a smaller name of genes annotated... I think there is some error.


Do you mean that the number of regions annotated as "Promoter" is the same regardless of the TSS region width? To get gene names, you can add the annoDb argument to annotatePeak: annoDb='org.Mm.eg.db' for mouse (Mm) or annoDb='org.Hs.eg.db' for human (Hs). It's not clear what you mean by "...I get a smaller name of genes annotated".
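If some genes still come back with only an ID, one possible cause (an assumption, since the thread does not show the IDs) is that they are Ensembl IDs with a version suffix, or genes that have no symbol in the org package. A rough sketch of a manual conversion with `AnnotationDbi::mapIds()`; the two IDs and the `strip_version` helper are illustrative only:

```r
# Hypothetical helper: drop a trailing version suffix such as ".5" from Ensembl IDs
strip_version <- function(ids) sub("\\.[0-9]+$", "", ids)

# Example IDs, one with and one without a version suffix
ids <- c("ENSMUSG00000051951.5", "ENSMUSG00000025902")
clean_ids <- strip_version(ids)

# Look up symbols only if the annotation packages are installed
if (requireNamespace("org.Mm.eg.db", quietly = TRUE) &&
    requireNamespace("AnnotationDbi", quietly = TRUE)) {
  symbols <- AnnotationDbi::mapIds(
    org.Mm.eg.db::org.Mm.eg.db,
    keys      = clean_ids,
    column    = "SYMBOL",
    keytype   = "ENSEMBL",   # assumes the TxDb uses Ensembl gene IDs
    multiVals = "first"
  )
  print(symbols)  # NA marks IDs that have no symbol in the org package
}
```

IDs that still map to NA after this are usually genes without an assigned symbol, so they cannot be converted by any keytype.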
