I have a set of genes that have a protein of interest bound at the TSS. I would like to be able to separate these genes into two classes: genes with a unidirectional promoter, and genes with a bidirectional promoter.
I have access to paired-end GRO-seq data, but no RNA-seq data. Is there a way to do this?
Technically, do you want to get the 5' reads that go in opposite directions but overlap each other (or lie within a certain distance, let's say 400 bp)? Like enhancer RNAs, which are transcribed bidirectionally?
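To make that criterion concrete, here is a minimal sketch in plain Python. It assumes you already have lists of 5' end positions per strand for a single chromosome; the function name, the input format and the "divergent"/"convergent" labels are hypothetical, and the 400 bp default is just the example distance mentioned above.

```python
from bisect import bisect_left, bisect_right

def opposite_strand_pairs(plus_5p, minus_5p, max_dist=400):
    """Pair plus- and minus-strand 5' ends that lie within max_dist bp.

    Both inputs are 5' end positions on ONE chromosome (assumed input format).
    A pair is labelled 'divergent' when the minus-strand 5' end sits at or
    upstream of the plus-strand 5' end (transcripts point away from each
    other), otherwise 'convergent'.
    """
    minus_sorted = sorted(minus_5p)
    pairs = []
    for p in sorted(plus_5p):
        # Slice out the minus-strand 5' ends inside [p - max_dist, p + max_dist].
        lo = bisect_left(minus_sorted, p - max_dist)
        hi = bisect_right(minus_sorted, p + max_dist)
        for m in minus_sorted[lo:hi]:
            orientation = "divergent" if m <= p else "convergent"
            pairs.append((p, m, orientation))
    return pairs

pairs = opposite_strand_pairs([1000, 5000], [800, 1100, 9000])
```

Only the divergent pairs are candidates for bidirectional initiation; whether the 5' positions are trustworthy depends on how the library was built.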
Yes, this is what I'd like to do. I had previously thought I could simply use bedtools to look for overlapping regions of the GRO-seq negative- and positive-strand coverage bedgraphs within 1 kb of annotated TSSs, but this did not work.
Do you have a separate file for 5' reads? When you say paired-end data, do you know which reads originate from the 5' end of a transcript?
I do not have a separate file for these. What I currently have is a bedtools genomecov bedgraph that contains the entire coverage (it is not limited to 5' positions with the -5 option), which I generated using:
I'm very new to GRO-seq, and the data wasn't generated by my lab, so getting information about its generation has been difficult at best.
If you have paired-end data, you somehow need to separate out the reads that originate from the 5' end. Otherwise you will not be able to identify bidirectional transcripts precisely. Anyway, if you would like to check which regions on the forward strand are close to regions on the reverse strand, you could use closestBed (bedtools closest).
But this won't be exclusive to bidirectional transcripts.
In fact, it is not meaningful at all because, in general, paired-end reads map in fr or rf orientation, so you will definitely end up with many regions that are close to each other on the forward and reverse strands.
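With that caveat in mind, the proximity check itself can be sketched in plain Python. This is not closestBed, only a brute-force stand-in for the same idea. It assumes BED-style half-open (chrom, start, end) tuples per strand, and its gap definition (0 when intervals overlap or touch) is my own choice and may differ from the distance convention bedtools reports.

```python
def closest_opposite(fwd, rev):
    """For each forward-strand interval, find the nearest reverse-strand
    interval on the same chromosome and the gap between them in bp.

    Intervals are (chrom, start, end) tuples, half-open (assumed format).
    Returns a list of (fwd_interval, (rev_interval, gap)) entries, with
    None in place of the match when the chromosome has no reverse interval.
    """
    results = []
    for chrom, start, end in fwd:
        best = None
        for r_chrom, r_start, r_end in rev:
            if r_chrom != chrom:
                continue
            # Gap is 0 when the two intervals overlap or are book-ended.
            gap = max(0, max(start, r_start) - min(end, r_end))
            if best is None or gap < best[1]:
                best = ((r_chrom, r_start, r_end), gap)
        results.append(((chrom, start, end), best))
    return results

fwd = [("chr1", 100, 200), ("chr2", 50, 60)]
rev = [("chr1", 150, 300), ("chr1", 500, 600)]
hits = closest_opposite(fwd, rev)
```

A small gap here only says that signal exists on both strands nearby; as noted, paired-end mates alone can produce that.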
Ask the people who generated the data whether they can tell you how to separate the reads that originate from 5' ends. Then I can tell you how to get bidirectional transcripts.
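Once the library layout is known, picking out the 5' ends could look something like the sketch below. It ASSUMES that mate 1 starts at the 5' end of the nascent RNA and maps to the same strand as the transcript; that is protocol-dependent and may well be the other way round for this dataset, so confirm it with the data producers first. The function name is hypothetical; the flag bits are the standard SAM ones (0x4 unmapped, 0x10 reverse strand, 0x40 first in pair).

```python
UNMAPPED = 0x4   # SAM flag bit: read is unmapped
REVERSE = 0x10   # SAM flag bit: read aligned to the reverse strand
READ1 = 0x40     # SAM flag bit: first mate in the pair

def five_prime_of_mate1(flag, pos, aligned_len):
    """Return (five_prime_pos, strand) for a mapped mate-1 alignment, else None.

    flag        -- SAM FLAG field
    pos         -- SAM POS field (1-based leftmost aligned position)
    aligned_len -- number of reference bases the alignment spans

    ASSUMPTION: mate 1 begins at the transcript's 5' end on the same strand.
    """
    if (flag & UNMAPPED) or not (flag & READ1):
        return None
    if flag & REVERSE:
        # On the reverse strand the 5' end is the rightmost aligned base.
        return (pos + aligned_len - 1, "-")
    return (pos, "+")

forward_mate1 = five_prime_of_mate1(99, 1000, 50)
reverse_mate1 = five_prime_of_mate1(83, 1000, 50)
```

If it turns out that mate 2 carries the 5' end, or that the library is reverse-stranded, the flag test and the strand labels need to be swapped accordingly.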