Question

Cross validate RNA-seq and ATAC-seq

14 months ago

Wang Cong ▴ 10

Hi, I have RNA-seq and ATAC-seq data for the same samples and would like to cross-validate the two datasets. I have DEGs from RNA-seq and DARs (differentially accessible regions) from ATAC-seq. I annotated the DARs using Homer, so I have the nearest gene for each one. Would it be reasonable to correlate the fold change of each DAR with the expression fold change of its nearest gene?

x-axis: log2FC of DARs (ATAC-seq)

y-axis: log2FC of the gene nearest to each DAR (RNA-seq)

My concern: if a DAR is far from its nearest gene, is it still appropriate to correlate the two fold changes?

Or any other suggestions for ways of cross-validating RNA-seq and ATAC-seq? Thanks!

DAR cross-validation DEG ATAC-seq RNA-seq • 741 views

updated 14 months ago by ATpoint 86k • written 14 months ago by Wang Cong ▴ 10

score 0 · Answer 1 · 2023-10-29

Annotating peaks (of any kind: open chromatin, ChIP for a protein) to a gene is arbitrary; to my knowledge there is no reliable method that notably outperforms the others. The lack of a solid ground truth is a big factor in that. Assays like capture Hi-C tell us that one peak can contact many promoters, sometimes over hundreds of kilobases, while TAD boundaries can prevent contact with a nearby gene even if the gene-peak pair is just a few kilobases apart.
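A small sketch of why the assignment is arbitrary: the same peak ends up linked to different genes depending on the rule chosen. The gene names, coordinates, window sizes, and function names below are hypothetical, and everything is assumed to sit on one chromosome. Neither rule knows anything about 3D contacts or TAD boundaries.

```python
def nearest_gene(peak_mid, tss):
    """Assign the peak to the single gene with the closest TSS."""
    return min(tss, key=lambda gene: abs(tss[gene] - peak_mid))


def genes_in_window(peak_mid, tss, window):
    """Assign the peak to every gene whose TSS lies within +/- window bp."""
    return sorted(g for g, pos in tss.items() if abs(pos - peak_mid) <= window)


# Hypothetical TSS positions (bp) on one chromosome.
tss = {"geneA": 100_000, "geneB": 130_000, "geneC": 400_000}
peak_mid = 118_000  # hypothetical peak midpoint

by_nearest = nearest_gene(peak_mid, tss)
by_50kb = genes_in_window(peak_mid, tss, 50_000)
by_500kb = genes_in_window(peak_mid, tss, 500_000)
print(by_nearest, by_50kb, by_500kb)
```

Three reasonable-looking rules give one, two, and three candidate genes for the same peak, and nothing in the data itself says which is right.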

What people do in my experience (unfortunately) is to try something, see whether it supports their hypothesis, and if not keep trying until it does. Confirmation bias at its finest.

The in silico peak-gene association problem is imo one of the big unsolved challenges in biology.