Hey everyone,
I am currently working on transcription factor binding in loops for my internship. I have just encountered a problem.
Let me summarize:
On one side of the loop, there is a TF binding the DNA, and on the other side there is a gene. So the idea is simple: generate a mutation in the TF binding site and see whether the gene on the other side is still transcribed.
Before applying any mutations, I have to find 5 candidates. The problem is that I have about 1,000 candidates (for me, a candidate is a loop with a TF on one side and a gene on the other). So my idea is to select the most highly expressed genes among my candidates. In the end, the goal is to check whether the TF is really involved in the transcription of the gene. (I don't know if this is the cleanest way to proceed, but I couldn't find another criterion.)
My question is simple: is there a function in R, or a file, that provides this kind of gene expression information? My genes are stored in a GRanges object that looks like this:
ENSG00000132196 chr1 [162790702, 162812817] + | ENSG00000132196
If you have another way to select among my genes, let me know.
Thank you very much,
Baptiste
I couldn't decipher your question correctly. Do you want a list of genes with expression values to cross-compare with your own list, so that you can pick the top-ranked genes out of your list? What are the sides you are talking about?
Hey Sukhdeep, thank you for your reply.
So I have about 1,000 genes in a GRanges object. For each gene, I want to know (if possible) its expression value. Then I will select the top 5 most highly expressed genes.
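The selection described here comes down to joining the candidate gene IDs against an expression table and sorting. Below is a minimal sketch in Python (the same logic carries over to R), assuming a tab-separated table with hypothetical columns `gene_id` and `tpm` measured in a single cell type. Apart from ENSG00000132196, every ID and every expression value is made up for illustration, and `top_expressed` is a hypothetical helper, not a function from any package.

```python
import csv
import io

# Hypothetical expression table for ONE cell type (values are invented).
EXPRESSION_TSV = """gene_id\ttpm
ENSG00000132196\t12.5
ENSG_EXAMPLE_A\t87.0
ENSG_EXAMPLE_B\t0.0
ENSG_EXAMPLE_C\t40.2
"""


def top_expressed(candidate_ids, expression_tsv, n=5):
    """Return the n candidate genes with the highest expression value."""
    reader = csv.DictReader(io.StringIO(expression_tsv), delimiter="\t")
    expression = {row["gene_id"]: float(row["tpm"]) for row in reader}
    # Candidates missing from the table are skipped, not treated as zero.
    scored = [(g, expression[g]) for g in candidate_ids if g in expression]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# Hypothetical candidate list; the last ID has no measurement on purpose.
candidates = ["ENSG00000132196", "ENSG_EXAMPLE_A", "ENSG_EXAMPLE_C", "ENSG_NOT_MEASURED"]
top = top_expressed(candidates, EXPRESSION_TSV, n=2)
print(top)
```

Genes without a measurement are dropped instead of being ranked last, so a missing value is never mistaken for "not expressed".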
Expression is specific to a cell type or tissue. A gene may be highly expressed in one tissue at one time, while not expressed in another tissue or at another time. You will need expression data that is specific to your cell type or tissue.
By "loop", do you mean a "chromatin loop", as in a 3C assay where you test whether a distal regulatory element/region regulates gene transcription? And do you want to find out how to correlate the presence of a TF with gene expression?
These are chromatin loops from Hi-C data. Then, for each loop, I check whether there is a TF binding event on one side and a gene on the other. It is a kind of correlation: if we introduce a mutation in the TF binding site (so the TF can't bind the DNA), will I observe a decline in gene expression?
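The candidate definition used in this thread (a TF binding site in one loop anchor, a gene in the other) can be written as a simple interval-overlap check. The following is a rough sketch in Python, assuming closed `(chrom, start, end)` coordinates; the loop anchors and TF sites are invented, only the gene coordinates come from the example above, and `overlaps` and `find_candidates` are hypothetical helper names.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def find_candidates(loops, tf_sites, genes):
    """Return (loop index, gene ID) pairs with a TF site in one anchor
    and the gene in the opposite anchor."""
    found = []
    for i, (anchor1, anchor2) in enumerate(loops):
        # Try both orientations: TF in anchor1/gene in anchor2, and the reverse.
        for tf_anchor, gene_anchor in ((anchor1, anchor2), (anchor2, anchor1)):
            if not any(overlaps(tf_anchor, site) for site in tf_sites):
                continue
            for gene_id, region in genes.items():
                pair = (i, gene_id)
                if overlaps(gene_anchor, region) and pair not in found:
                    found.append(pair)
    return found


# Invented loop anchors and TF sites, for illustration only.
loops = [
    (("chr1", 162700000, 162710000), ("chr1", 162790000, 162800000)),
    (("chr2", 1000, 2000), ("chr2", 50000, 60000)),
]
tf_sites = [("chr1", 162705000, 162705020), ("chr2", 1500, 1520)]
# Gene coordinates taken from the GRanges example in the question.
genes = {"ENSG00000132196": ("chr1", 162790702, 162812817)}

pairs = find_candidates(loops, tf_sites, genes)
print(pairs)
```

The nested loops are fine for about 1,000 candidates; for genome-wide data an indexed overlap query would be the more usual choice.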
Hey thank you very much for your reply.
I have just remembered that it is really important to know the cell type for this.
But I am not sure if I have to do this. I will see.
Thanks again,
Baptiste