How can I find the promoter regions that have H3K27me3 in two different groups?
5.6 years ago
munaj86 ▴ 30

Hi..

I'm new to ChIP-seq analysis. I have performed all the steps from QC to peak calling with MACS2, and I now have MACS2 peak files (narrowPeak and broadPeak) for two histone marks, H3K27me3 and H3K4me3. I want to find the promoters (the TSS plus and minus 1000 bp) that carry these marks in two different groups. Can anybody help me with that? What tools can I use to do this?

Thanks,,

ChIP-Seq • 2.1k views
5.6 years ago
Prakash ★ 2.2k

There are several ways you could do this. One way is to use biomaRt to extract the TSS positions for your organism of interest, use GenomicRanges to build a window of plus/minus 1000 bp around each TSS, and then intersect those windows with your histone-mark peak files using bedtools intersect.

You can use the R code below for this purpose.

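library(biomaRt)
library(GenomicRanges)
# List the available datasets, then connect to the one for your organism
mart <- useMart("ensembl")
listDatasets(mart)
ensembl <- useMart("ensembl", dataset = "mmusculus_gene_ensembl")
listAttributes(ensembl)
# Fetch transcript coordinates with strand: the TSS is transcript_start on
# the + strand and transcript_end on the - strand
tss <- getBM(attributes = c("chromosome_name", "transcript_start", "transcript_end",
                            "strand", "external_gene_name"),
             mart = ensembl)
tss <- tss[tss$chromosome_name != "MT", ]
head(tss)
# Ensembl names chromosomes "1", "2", ...; add "chr" only if your peak files use that style
tss$chromosome_name <- paste0("chr", tss$chromosome_name)
gr <- GRanges(seqnames = tss$chromosome_name,
              ranges = IRanges(tss$transcript_start, end = tss$transcript_end),
              strand = ifelse(tss$strand == 1, "+", "-"),
              gene_name = tss$external_gene_name)
# Strand-aware window of 1000 bp on each side of the TSS
promoterRanges <- promoters(gr, upstream = 1000, downstream = 1000)
head(promoterRanges)
# Write the BED columns explicitly (BED starts are 0-based; integers avoid 1e+05-style output)
bed <- data.frame(chrom = as.character(seqnames(promoterRanges)),
                  start = pmax(start(promoterRanges) - 1L, 0L),
                  end = end(promoterRanges),
                  name = promoterRanges$gene_name,
                  score = 0L, strand = as.character(strand(promoterRanges)))
write.table(bed, file = "tss.bed", quote = FALSE, sep = "\t", row.names = FALSE, col.names = FALSE)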
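
If you would rather stay in R than switch to bedtools for the intersect step, something like the following might work. It is only a sketch on toy data: the gene names, coordinates and the two peak sets (`peaks_group1`, `peaks_group2`) are made up for illustration, and in practice you would load your own MACS2 peaks and promoter windows into GRanges objects first.

```r
library(GenomicRanges)

# Toy promoter windows (hypothetical genes); replace with your TSS +/- 1000 bp ranges
promoters_gr <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(1000, 5000, 2000), width = 2000),
  strand   = c("+", "-", "+"),
  gene     = c("geneA", "geneB", "geneC")
)

# Toy H3K27me3 peaks for two hypothetical groups (peaks are unstranded,
# so they can overlap promoters on either strand)
peaks_group1 <- GRanges("chr1", IRanges(start = c(1500, 5500), end = c(1800, 6000)))
peaks_group2 <- GRanges(c("chr1", "chr2"),
                        IRanges(start = c(1200, 2500), end = c(1400, 3000)))

# TRUE where a promoter overlaps at least one peak in that group
in_group1 <- overlapsAny(promoters_gr, peaks_group1)
in_group2 <- overlapsAny(promoters_gr, peaks_group2)

# One row per promoter, one logical column per group
marked <- data.frame(gene = promoters_gr$gene, group1 = in_group1, group2 = in_group2)

shared      <- marked$gene[marked$group1 & marked$group2]
only_group1 <- marked$gene[marked$group1 & !marked$group2]
only_group2 <- marked$gene[!marked$group1 & marked$group2]
print(marked)
```

The same table can be built once per histone mark, so you end up with a per-promoter summary of which mark is present in which group.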

Hi Prakash, can I use the genome annotation file from GENCODE rather than Ensembl via biomaRt to get the TSS regions?

GENCODE annotation is Ensembl annotation.

Hi, I agree that the GENCODE and Ensembl annotations are essentially the same, but GENCODE has annotation from both HAVANA and Ensembl. Since munaj86 asked about accessing GENCODE from biomaRt, I replied in that context.

Thanks

What is my username? What does that tell you?

The Havana annotators sit at the other end of the corridor to me. The Ensembl automatic annotators sit halfway down the same corridor. Every month, money from the Ensembl grant goes into my bank account. I know what I'm talking about.

munaj86: Prakash does not know what they are talking about. Do not listen to them.

The annotation presented in Ensembl is GENCODE. GENCODE is a brand name used to describe the annotation from Ensembl for human and mouse. This consists of the merged data from the Ensembl automatic pipeline and the Havana manual annotation. There is nowhere in the world that you can get hold of the automatic annotation alone or the manual annotation alone. You can only get hold of the two of them merged together and you can call it either the Ensembl annotation or the GENCODE annotation – it is the same.

Thanks, Emily, for the clarification! I have edited my comment.

Yes, you can use that as well.

I tried looking at the mart functions and listing the available human annotations, and only Ensembl is displayed. How can I use the GENCODE annotation with the useMart() function? I already have a GTF file of the human genome from GENCODE; I just don't know how to use it with useMart(). Any suggestions?

As Emily clarified, GENCODE and Ensembl are the same annotation, so you can use the Ensembl dataset in biomaRt to get the chromosome locations in BED format and resize them to plus/minus 1000 bp.
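
If you already have the GTF on disk, you may not need biomaRt at all: a GTF can be read straight into a GRanges object with rtracklayer. The sketch below is only an illustration. It writes a tiny made-up GTF (hypothetical IDs and coordinates) to a temporary file so that it runs on its own; you would point `import()` at your real GENCODE file instead.

```r
library(rtracklayer)
library(GenomicRanges)

# Tiny made-up GTF so the example is self-contained (all IDs are hypothetical)
gtf_lines <- c(
  'chr1\tHAVANA\ttranscript\t5000\t9000\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "geneA";',
  'chr1\tHAVANA\texon\t5000\t5500\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; gene_name "geneA";',
  'chr2\tENSEMBL\ttranscript\t20000\t30000\t.\t-\t.\tgene_id "G2"; transcript_id "T2"; gene_name "geneB";'
)
gtf_file <- tempfile(fileext = ".gtf")
writeLines(gtf_lines, gtf_file)

# Read the annotation; the GTF attributes become metadata columns
annotation <- import(gtf_file)

# Keep one range per transcript, then take a strand-aware window around each TSS
transcripts <- annotation[annotation$type == "transcript"]
promoter_windows <- promoters(transcripts, upstream = 1000, downstream = 1000)
print(promoter_windows)
```

`promoters()` uses the strand column from the GTF, so the window is centred on the transcript end for minus-strand transcripts. If you need a BED file for bedtools, rtracklayer's `export()` can write the resulting ranges out.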

This is wrong. GENCODE == Ensembl.

Thanks a lot for the code, @Prakash! Really helpful!!!
