3.6 years ago
Lila M
Hi there,
Maybe this is a silly question. I have a set of peaks called with MACS (n = 12808) and I want to split them into two groups: peaks at transcribed genes and peaks at non-transcribed genes. My first approach was to annotate the peaks with ChIPseeker, which gave me a total of n = 12576 annotated peaks, so I would consider those to be peaks at transcribed genes. To identify the peaks at non-transcribed genes, my idea is to use bedtools intersect -v against the annotated set, and the remaining 232 peaks would be considered as not transcribed. Is this a good approach? Or do you guys have a better idea?
Thanks!
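For readers following along, the set-difference step described above can be sketched in plain Python. This is only an illustration of what bedtools intersect -v does conceptually (report intervals in A that overlap nothing in B), assuming BED-style 0-based, half-open coordinates. The function name and the toy coordinates are hypothetical and not taken from the original data.

```python
from collections import defaultdict


def peaks_without_overlap(peaks, annotated):
    """Return peaks that overlap no interval in `annotated`.

    Conceptually similar to `bedtools intersect -v -a peaks -b annotated`.
    Intervals are (chrom, start, end) tuples, 0-based and half-open.
    """
    # Group the annotated intervals by chromosome for quick lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in annotated:
        by_chrom[chrom].append((start, end))

    unassigned = []
    for chrom, start, end in peaks:
        # Two half-open intervals overlap if each starts before the other ends.
        hit = any(start < b_end and b_start < end
                  for b_start, b_end in by_chrom.get(chrom, []))
        if not hit:
            unassigned.append((chrom, start, end))
    return unassigned


# Toy example (made-up coordinates): one peak has no annotated counterpart.
all_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
annotated_peaks = [("chr1", 100, 200), ("chr2", 50, 150)]
remaining = peaks_without_overlap(all_peaks, annotated_peaks)
print(remaining)
```

Note that this only tells you which peaks were left without an annotation; it says nothing about whether the nearby gene is actually transcribed.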
What type of data are we talking about? What was the experiment? There are literally no details in this question right now. Is the assay measuring transcription, or is this a DNA-based genomic assay such as ChIP-seq?
Right, it is a ChIP-seq analysis.
What makes you think genes with a peak are transcribed while those without are not? Unless your IP was for Pol2, this is an overly simplistic assumption that I very much doubt is true. Especially since ChIPseeker just assigns to the closest TSS, so some of these peaks will likely not even impact the gene they are assigned to.
Indeed, my IP was for Pol2.
In that case, wouldn't you consider all peaks to be associated with transcription? Even if your 232 peaks are not associated with genes, they may still mark transcription of unannotated genes, eRNAs, etc.
That is my concern... this is why I asked here :)
Shouldn't you assess transcription with an actual RNA-based assay such as CAGE or RNA-seq rather than guessing based on PolII?
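A minimal sketch of that suggestion, assuming you already have (a) a peak-to-gene assignment, for example from the ChIPseeker annotation, and (b) per-gene expression values from RNA-seq. Everything named here is hypothetical: the function, the dictionaries, the gene names, and especially the 1 TPM cutoff, which is an arbitrary illustration and not a recommended threshold. Peaks without an assigned gene are kept in their own group rather than being called "not transcribed", since they may mark unannotated transcripts or eRNAs.

```python
def split_peaks_by_expression(peak_to_gene, gene_tpm, min_tpm=1.0):
    """Split peaks into three groups using RNA-based expression values.

    peak_to_gene: dict mapping peak ID -> assigned gene ID (or None).
    gene_tpm:     dict mapping gene ID -> expression value (e.g. TPM).
    min_tpm:      assumed cutoff for calling a gene "transcribed".
    """
    transcribed, not_transcribed, unassigned = [], [], []
    for peak, gene in peak_to_gene.items():
        if gene is None:
            # No gene assigned: cannot be judged from gene expression.
            unassigned.append(peak)
        elif gene_tpm.get(gene, 0.0) >= min_tpm:
            transcribed.append(peak)
        else:
            # Gene missing from the expression table is treated as 0 here.
            not_transcribed.append(peak)
    return transcribed, not_transcribed, unassigned


# Toy example with made-up peak IDs, gene names, and expression values.
peak_to_gene = {"peak_1": "geneA", "peak_2": "geneB", "peak_3": None}
gene_tpm = {"geneA": 25.3, "geneB": 0.2}
groups = split_peaks_by_expression(peak_to_gene, gene_tpm)
print(groups)
```

The grouping is then driven by measured RNA rather than by the presence of a peak, and the cutoff can be adjusted to whatever fits your data.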