18 months ago
Chris
Hi all, I would like to integrate the genes from bulk RNA-seq with the genes from ATAC-seq, as this paper did, but I don't know the code to do this. For bulk RNA-seq, I got around 300 up- and down-regulated genes between 2 conditions, but ATAC-seq gave around 10k genes with fold change > 2.5.
Do you have any suggestions? Thank you so much.
https://www.frontiersin.org/articles/10.3389/fnut.2021.742672/full
Hi Chris - what p-value cutoff are you using when looking at your ATAC data? It looks like they used the following: different peaks were filtered with P < 0.00001 and |log2(fold change)| >= 1.
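To make that cutoff concrete, here is a minimal Python sketch of the filter, assuming the differential peaks have already been exported to a table. The records and field names (`peak`, `gene`, `log2fc`, `pvalue`) are hypothetical and are not the actual DiffBind column names.

```python
# Hypothetical differential-peak records; field names are illustrative only
# and do not correspond to real DiffBind output columns.
peaks = [
    {"peak": "peak_1", "gene": "GeneA", "log2fc": 2.8, "pvalue": 1e-7},
    {"peak": "peak_2", "gene": "GeneB", "log2fc": -1.4, "pvalue": 3e-6},
    {"peak": "peak_3", "gene": "GeneC", "log2fc": 0.6, "pvalue": 1e-9},
    {"peak": "peak_4", "gene": "GeneD", "log2fc": 3.1, "pvalue": 0.02},
]


def filter_peaks(records, max_p=1e-5, min_abs_log2fc=1.0):
    """Keep peaks with P < max_p and |log2(fold change)| >= min_abs_log2fc."""
    return [
        r for r in records
        if r["pvalue"] < max_p and abs(r["log2fc"]) >= min_abs_log2fc
    ]


significant = filter_peaks(peaks)
print([r["peak"] for r in significant])  # ['peak_1', 'peak_2']
```

Using the absolute value keeps both more-open and less-open peaks; a one-sided filter such as log2 fold > 2.5 would keep only one direction.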
Hi Sasha, if I use log2 fold change > 2.5 and p < 0.00001, my object from DiffBind has around 20k observations.
Hmm - maybe there is some confusion between how they phrased it and how we are interpreting it. It may be that they also had 20K differential peaks, but many of those peaks fall near the same genes. What is that number when you look at it from a gene point of view?
With the filtering values above, there are around 8k unique genes in my data. My question is about the code to get the overlapping genes as they did.
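If both analyses already give gene lists, one simple way to get the overlap is a set intersection on gene identifiers. The sketch below is a rough illustration, not the paper's method; the gene names and variable names are hypothetical, and it assumes both lists use the same identifier type (for example, both gene symbols or both Ensembl IDs).

```python
# Hypothetical inputs: differentially expressed genes from RNA-seq, and the
# gene assigned to each differential ATAC-seq peak (one entry per peak, so
# the same gene can appear several times).
rna_up = {"GeneA", "GeneE"}
rna_down = {"GeneB"}
atac_peak_genes = ["GeneA", "GeneA", "GeneB", "GeneC"]


def overlap_genes(rna_genes, atac_genes):
    """Return the sorted genes present in both collections."""
    return sorted(set(rna_genes) & set(atac_genes))


# Collapse the per-peak annotations to unique genes, then intersect.
atac_genes = set(atac_peak_genes)
print(len(atac_genes))                              # 3
print(overlap_genes(rna_up | rna_down, atac_genes))  # ['GeneA', 'GeneB']
print(overlap_genes(rna_up, atac_genes))             # ['GeneA']
```

Intersecting the up- and down-regulated sets separately makes it easier to check later whether chromatin accessibility and expression change in the same direction.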
One thing that you could potentially try is bedtools intersect to find the overlapping regions between the ATAC-seq peaks and the RNA-seq gene annotations. Here's an example command:
You can read more about it here: https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html
DISCLAIMER: I'm using my chatbot here (https://tinybio.cloud) to help generate this answer. This answer has not been tested and may be incorrect. You can download it from the website. If this answer does not work - please let me know in this thread!
Thanks Sasha for the suggestion. Interesting chatbot. I don't know why I didn't get notified about your reply. Maybe -b rna.bam, because the RNA-seq output doesn't include a GTF file.
I have 4 BAM files for 2 conditions, with 2 replicates for each condition. Do you think I should merge them into one BAM file to run the bedtools command?
Hi Chris - if you could post more information describing the ATAC and RNA separately that would be great.
But, given what I understand, maybe try merging the GFF files and the BED files first and then doing an intersect. Other people may have better ideas.
It seems I don't have a GFF file in either the RNA-seq or the ATAC-seq analysis. Would you please tell me what information about the ATAC-seq and RNA-seq data you want me to add?
To do the intersect using bedtools intersect on ATAC and RNA data, you would need BED files representing the genomic regions of interest for both data types. For ATAC-seq data, you would typically have a BED file containing the genomic coordinates of open chromatin regions, while for RNA-seq data, you might have a BED file containing the genomic coordinates of expressed genes or exons.
See if that works for ya.
Here's a very basic example:
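The following is an illustrative Python sketch of the overlap logic rather than a bedtools command. It assumes BED-style coordinates (0-based start, end not included), and the peak and gene intervals are made up for the example. For real data, bedtools handles this far more efficiently.

```python
# Hypothetical BED-like intervals: (chrom, start, end) for ATAC-seq peaks and
# (chrom, start, end, name) for genes. BED coordinates are 0-based, half-open.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
genes = [
    ("chr1", 150, 400, "GeneA"),
    ("chr2", 200, 300, "GeneB"),
    ("chr1", 600, 700, "GeneC"),
]


def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def genes_overlapping_peaks(peak_list, gene_list):
    """Return the sorted names of genes overlapped by at least one peak."""
    hits = set()
    for p_chrom, p_start, p_end in peak_list:
        for g_chrom, g_start, g_end, name in gene_list:
            if p_chrom == g_chrom and intervals_overlap(p_start, p_end, g_start, g_end):
                hits.add(name)
    return sorted(hits)


print(genes_overlapping_peaks(peaks, genes))  # ['GeneA']
```

Note that the peak ending at 600 does not overlap the gene starting at 600, because the end coordinate is exclusive.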
Thank you for your help. I found this tutorial, which mentions the BED file from ATAC-seq. Before this, I hadn't seen a BED file from an ATAC-seq analysis, only bigWig, BAM, and broadPeak files. https://cambiotraining.github.io/chipseq/Practicals/ATACseq/ATACseq_tutorial.html
You might get away with just using the BAMs. You can see the options listed here for bedtools intersect: https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html
Look at the -bed flag, which writes the output as BED when the input is a BAM file.
Hope this helps!