DMR Analysis for RRBS: what annotation file to I use?
1
0
Entering edit mode
20 months ago
embueno • 0

Hi,

I’m interested in running a DMR analysis on RRBS data. I was wondering what type of annotation file I need for the program (i.e methylkit or others) to identify promoters vs introns etc. Would a gtf file be enough?

I work with a non model organism (Leptinotarsa decemlienata) so I’m having some trouble following package tutorials that annotate their results using other R based packages. I’m just getting started on this so I would appreciate any input I can get.

Thanks!

Methylation RRBS Annotation • 1.2k views
ADD COMMENT
0
Entering edit mode
20 months ago
Papyrus ★ 3.0k

If you have a custom GTF for your model species, you could try importing it into R with GenomicFeatures::makeTxDbFromGFF or rtracklayer::import to create a TxDb object. You could then use this object to annotate your regions of interest, for example using the ChIPseeker::annotatePeak function from the well-known package ChIPseeker.

Another package which has options for custom annotations is annotatr.

ADD COMMENT
0
Entering edit mode

once you have the txdb object , you can follow this to get the promoters and others

### Define intronic, exonic and intergenic regions
```{r}
library(AnnotationHub)
library(dplyr) ## for %>%
ah = AnnotationHub()
possibleDates(ah)
AnnotationHub::query(ah, c("gtf", "Homo_sapiens", "GRCh37"))
GRCh37.gtf<- ah[['AH10684']]
## subset the gtf files for only protein_coding genes and lincRNAs
# GRCh37.gtf<- GRCh37.gtf[GRCh37.gtf$gene_biotype %in% c("protein_coding", "lincRNA")]
table(GRCh37.gtf$gene_biotype)
## make a txdb
library(GenomicFeatures)
GRCh37.txdb <- makeTxDbFromGRanges(GRCh37.gtf)
## exons
exonsBy(GRCh37.txdb, "gene") %>% unlist() %>% reduce()
exons(GRCh37.txdb) %>% reduce()
## introns
intronsByTranscript(GRCh37.txdb) %>% unlist() %>% reduce()
### get intergenic regions
chrom_grngs <- as(seqinfo(GRCh37.txdb), "GRanges")
collapsed_tx <- reduce(transcripts(GRCh37.txdb))
## or use unstrand()
strand(collapsed_tx) <- "*"
intergenic <- GenomicRanges::setdiff(chrom_grngs, collapsed_tx)
fiveUTRsByTranscript(GRCh37.txdb)
threeUTRsByTranscript(GRCh37.txdb)
```
### CpG islands, shores,
```{r}
library(AnnotationHub)
ah = AnnotationHub()
possibleDates(ah)
head(unique(ah$rdataclass))
cgi<- ah[["AH5086"]]
## follow http://chitka-kalyan.blogspot.com/2014/03/cpg-island-shelves.html
###############################################################
# Extract CpG island shores
###############################################################
# extract the shore defined by 2000 bp upstream of cpg islands
shore1<- trim(flank(cgi, width = 2000, start= TRUE))
# extract the shore defined by 2000 bp downstream of cpg islands
shore2<- trim(flank(cgi, width = 2000, start = FALSE))
# perform intersection and combine the shores where they overlap
shore1_2<- reduce(c(shore1,shore2))
# extract the features (ranges) that are present in shores only and not in CpG islands (ie., shores not overlapping islands)
cpgi_shores<- setdiff(shore1_2, cgi)
cpgi_shores$name<- paste("shore", 1: length(cpgi_shores), sep ="_")
###############################################################
# Extract CpG island shelves
###############################################################
# extract the shore defined by 4000 bp upstream of cpg islands
shelves1<- trim(flank(cgi, 4000))
# extract the shore defined by 2000 bp downstream of cpg islands
shelves2<- trim(flank(cgi, 4000, FALSE))
# perform intersection and combine the shelves where they overlap
shelves1_2<- reduce(c(shelves1,shelves2))
# create a set of ranges consisting CpG Islands, Shores
island_shores<- c(cgi, cpgi_shores)
# extract the features (ranges) that are present in shelves only and not in cpg_islands or shores(ie., shelves not overlapping islands or shores)
cpgi_shelves<- setdiff(shelves1_2, island_shores)
cpgi_shelves$name= paste("shelf", 1: length(cpgi_shelves), sep ="_")
#### use annotatr bioc Package!!
library(annotatr)
## built-in short cut for cpg island, shores and shelves
annots<- 'hg19_cpgs'
cpg.annotations<- build_annotations(genome = 'hg19', annotations = annots)
table(annotations$type)
```

ADD REPLY

Login before adding your answer.

Traffic: 1472 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6