17 months ago
Chris
Hi all,
I have ATAC-seq and bulk RNA-seq data from a disease sample and a control sample, and I am trying to integrate them. After running differential expression and differential accessibility analyses, I took the overlap between up-regulated genes and genes in more open chromatin regions, which narrowed the list down to about 100 genes. Could you suggest what I could do next to find the transcription factors that cause the disease? Thank you so much!
I've used Lisa2 in the past; it tries to find which TFs regulate your set of genes of interest. It uses public Cistrome data (ChIP-seq, ATAC-seq, H3K27ac, etc.), but there is also an option to input your gene list together with custom ATAC-seq peaks from your own study system to complement the predictions (see here).
Thank you so much for your reply! I am not sure yet whether the 100 genes I narrowed down from bulk RNA-seq and ATAC-seq have any connection to each other, but I can still feed them to Lisa2, is that correct? I see that DiffBind usually returns a lot of genes (10-20k) that differ in chromatin accessibility between the 2 conditions. Lisa will return a long ranked list of transcription factors (TFs). So you would try to validate each TF from top to bottom, is that correct?
Yes, you can either feed only the filtered gene list, or feed the filtered gene list plus the ATAC-seq peaks (I think all of the peaks, check the tutorials). You may want to try segregating up- and down-regulated genes. The output is a list of TFs which potentially regulate your gene list, and includes p-values which you can use to explore the data.
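To illustrate the up/down split, here is a minimal Python sketch. It assumes your differential expression results are already loaded as a list of records; the field names (`gene`, `log2fc`, `padj`) and the cutoffs are placeholders I made up, so adapt them to whatever your DE tool actually outputs. Each resulting list could then be submitted as a separate gene set.

```python
def split_by_direction(de_rows, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Split differential-expression records into up- and down-regulated gene lists.

    de_rows: iterable of dicts with hypothetical keys 'gene', 'log2fc', 'padj'.
    Records with a missing or non-significant adjusted p-value are skipped.
    """
    up, down = [], []
    for row in de_rows:
        padj = row["padj"]
        if padj is None or padj >= padj_cutoff:
            continue
        if row["log2fc"] >= lfc_cutoff:
            up.append(row["gene"])
        elif row["log2fc"] <= -lfc_cutoff:
            down.append(row["gene"])
    return sorted(up), sorted(down)


# Toy records for illustration only (not real results).
example_rows = [
    {"gene": "GENE1", "log2fc": 2.3, "padj": 0.001},
    {"gene": "GENE2", "log2fc": -1.8, "padj": 0.01},
    {"gene": "GENE3", "log2fc": 3.0, "padj": 0.20},   # not significant
    {"gene": "GENE4", "log2fc": 0.2, "padj": 0.001},  # effect too small
    {"gene": "GENE5", "log2fc": 1.5, "padj": None},   # missing p-value
]

up_genes, down_genes = split_by_direction(example_rows)
```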
Thank you so much :) If I keep TFs with p-value < 0.05, there are about 200 TFs. May I ask what you would do next with this result? I think 1-5 candidates is a good number to validate in the wet lab, but 200 is too many.
Yes, well, as with any enrichment method (such as gene ontology analysis), the results are mostly descriptive and you would need to look more in depth: for example, how many known target genes of the TF show up as differentially expressed, how many differential peaks there are, whether the TF itself is also differentially expressed, etc. Though I don't know what you mean by wet lab validation.
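As a rough sketch of how those extra lines of evidence could shrink a 200-TF list, the Python below first applies a Benjamini-Hochberg correction to the enrichment p-values, then ranks the surviving TFs by whether the TF itself is differentially expressed and by how many of its known targets are differentially expressed. Everything here is an assumption for illustration: `tf_pvalues`, `de_genes` and `known_targets` are hypothetical inputs you would have to build yourself (the target sets would come from some curated TF-target resource), and the ranking order is just one reasonable choice, not a standard method.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def shortlist_tfs(tf_pvalues, de_genes, known_targets, fdr_cutoff=0.05, top_n=5):
    """Rank TFs passing the FDR cutoff by supporting evidence.

    tf_pvalues: dict TF name -> enrichment p-value (hypothetical input).
    de_genes: set of differentially expressed gene names.
    known_targets: dict TF name -> set of known target genes (hypothetical).
    """
    names = list(tf_pvalues)
    fdr = benjamini_hochberg([tf_pvalues[t] for t in names])
    rows = []
    for tf, q in zip(names, fdr):
        if q >= fdr_cutoff:
            continue
        rows.append({
            "tf": tf,
            "fdr": q,
            "tf_is_de": tf in de_genes,
            "de_targets": len(known_targets.get(tf, set()) & de_genes),
        })
    # DE TFs first, then more DE targets, then smaller FDR.
    rows.sort(key=lambda r: (not r["tf_is_de"], -r["de_targets"], r["fdr"]))
    return rows[:top_n]


# Toy inputs for illustration only (not real results).
tf_pvalues = {"TF_A": 0.001, "TF_B": 0.002, "TF_C": 0.04, "TF_D": 0.5}
de_genes = {"TF_B", "G1", "G2", "G3"}
known_targets = {"TF_A": {"G1", "G2"}, "TF_B": {"G3"}}

candidates = shortlist_tfs(tf_pvalues, de_genes, known_targets)
```

In this toy case TF_C drops out after correction even though its raw p-value is below 0.05, which is one reason a raw p < 0.05 filter tends to leave too many candidates.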
Wet lab validation means that if we think a mutation in TF ABC may be the cause of the phenotypic difference between conditions, we can introduce that mutation by genome editing and observe the phenotype, or use another wet lab method to validate our hypothesis.