Question

ATAC-seq: peak calling, reproducibility and differential analysis without replicates

0

7.0 years ago

anais1396 ▴ 30

Hello Biostars community!

I would like to know whether anyone has experience with normalization and differential expression on ATAC-seq data when the samples have no replicates?

I ran MACS2 for peak calling and used IDR (irreproducible discovery rate) and intersectBed on my first set of ATAC-seq data (which contains replicates) to get the peaks in common between my samples. However, I have a second set of data without replicates and I would like to do the same analysis...

Is there a way to do the same thing on my second set, where I have no replicates? Which tools should I use for that?

Then, I would like to run differential expression on my two data sets. Is the strategy the same in both cases (with and without replicates)? What strategy do you use for differential expression in ATAC-seq data?

Thank you in advance!

Anaïs

sequencing ATAC-seq differential expression • 5.5k views

ADD COMMENT • link updated 7.0 years ago by i.sudbery 21k • written 7.0 years ago by anais1396 ▴ 30

score 1 · Answer 1 · 2018-05-01

1

7.0 years ago

i.sudbery 21k

Depends what you mean by no replicates. If you mean you have, for example, a bunch of different patients, some with disease and some without, but you only have one sample from each, then you do have replicates - each patient counts as a replicate.

If you mean you have a cell line and you have treated and untreated cells and only one sample from each, then you do not have replicates.

With no replicates, the only IDR-based analysis you can do is to check for internal consistency by generating pseudo-replicates and testing for self-consistency.
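As a rough sketch of the pseudo-replicate idea (an illustration, not any particular tool's implementation): shuffle the reads from the single sample, split them into two halves, call peaks on each half separately, and then run IDR on the two resulting peak sets. The function name and the read identifiers below are made up for the example; in practice you would split the alignment file itself, and for paired-end data both mates of a fragment should stay in the same half.

```python
import random

def split_into_pseudoreplicates(reads, seed=0):
    """Randomly split reads into two pseudo-replicates of near-equal size.

    `reads` is any iterable of read records (hypothetical IDs here).
    A fixed seed makes the split reproducible.
    """
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Toy example: ten hypothetical read IDs from one unreplicated sample.
reads = [f"read_{i}" for i in range(10)]
pr1, pr2 = split_into_pseudoreplicates(reads, seed=42)
print(len(pr1), len(pr2))
```

Each read ends up in exactly one half, so the two pseudo-replicates differ only by sampling noise. That is why this checks internal consistency of the sample rather than biological reproducibility.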

Without replicates you basically can't do differential analysis of your ATAC-seq. The best you would be able to do is to take the set of peaks from your unreplicated condition and ask which of those peaks are absent from the peak sets of all the replicates in the replicated set, and, conversely, which peaks that are reproducible in the replicated condition are not present in the unreplicated condition.
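The two comparisons described above can be sketched in plain Python on toy peak lists. This is only an illustration under stated assumptions: peaks are (chrom, start, end) tuples with half-open coordinates, the helper names are made up, and "reproducible" is approximated here by simple overlap across replicates rather than by IDR. On real data the same overlap tests are what intersectBed does (its `-v` option reports entries with no overlap).

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def absent_from_all(query_peaks, peak_sets):
    """Peaks in query_peaks that overlap no peak in any of peak_sets."""
    return [p for p in query_peaks
            if not any(overlaps(p, q) for s in peak_sets for q in s)]

def present_in_all(peak_sets):
    """Peaks from the first set that overlap a peak in every other set."""
    first, rest = peak_sets[0], peak_sets[1:]
    return [p for p in first
            if all(any(overlaps(p, q) for q in s) for s in rest)]

# Toy data (hypothetical coordinates).
unreplicated = [("chr1", 100, 200), ("chr1", 500, 600)]
rep1 = [("chr1", 150, 250), ("chr1", 900, 1000)]
rep2 = [("chr1", 120, 220), ("chr1", 950, 1050)]

# Peaks seen only in the unreplicated condition.
only_unreplicated = absent_from_all(unreplicated, [rep1, rep2])

# Peaks shared by the replicates but missing in the unreplicated condition.
reproducible = present_in_all([rep1, rep2])
only_replicated = absent_from_all(reproducible, [unreplicated])

print(only_unreplicated)
print(only_replicated)
```

This gives presence/absence lists only. It says nothing about the size or significance of a change, which is the limitation of having no replicates.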

ADD COMMENT • link 7.0 years ago by i.sudbery 21k

0

I mean I have 2 groups of patients, some with disease and some healthy, but in each group I have just one sample from each patient. For instance, in group 1 (with disease) I have 6 samples, each from a different patient, and in group 2 (healthy) I have 8 samples, each from a different person.

So if I follow what you said, I can consider that I have 6 replicates in group 1 and 8 replicates in group 2?

ADD REPLY • link 7.0 years ago by anais1396 ▴ 30

1

Yes, that is what I am suggesting.

ADD REPLY • link 7.0 years ago by i.sudbery 21k

score 0 · Answer 2 · 2018-05-01

0

7.0 years ago

Devon Ryan 105k

For the unreplicated experiment you just won't be using IDR.

I assume by "differential expression" you mean integrating your ATAC-seq data with RNA-seq data. At the end of the day that's the same regardless of whether the ATAC-seq has replicates or not. You're generally looking for DE genes downstream of or overlapping ATAC-seq peaks. You could do differential peak calling between the conditions and look at DE genes in the resulting peaks, that'd be the simplest route.

ADD COMMENT • link 7.0 years ago by Devon Ryan 105k

0

I think the OP was referring to the analysis where you generate a pooled set of peaks between the two samples, then count the number of ATAC reads in each peak and do a count-based analysis of the difference between the two conditions.
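A minimal sketch of that pooled-peak counting step, on toy data. Everything here is assumed for illustration: peaks are (chrom, start, end) tuples with half-open coordinates, each read is reduced to a single (chrom, position) point, and the function names are made up. Real pipelines would work from BED and BAM files instead.

```python
def merge_peaks(peaks):
    """Merge overlapping or touching (chrom, start, end) peaks into a pooled set."""
    merged = []
    for chrom, start, end in sorted(peaks):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the previous pooled peak to cover this one.
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged

def count_reads(peaks, read_positions):
    """Count reads, given as (chrom, pos), that fall inside each peak."""
    return [sum(1 for c, pos in read_positions
                if c == chrom and start <= pos < end)
            for chrom, start, end in peaks]

# Toy peak calls and reads for two conditions (hypothetical coordinates).
peaks_a = [("chr1", 100, 200), ("chr1", 400, 500)]
peaks_b = [("chr1", 150, 260), ("chr2", 10, 50)]
reads_a = [("chr1", 120), ("chr1", 180), ("chr1", 450)]
reads_b = [("chr1", 250), ("chr2", 20), ("chr2", 30), ("chr2", 49)]

pooled = merge_peaks(peaks_a + peaks_b)
counts_a = count_reads(pooled, reads_a)
counts_b = count_reads(pooled, reads_b)

# One row per pooled peak, one count column per sample.
for peak, a, b in zip(pooled, counts_a, counts_b):
    print(peak, a, b)
```

The resulting peak-by-sample count table is the kind of input that count-based tools such as DESeq2 or edgeR expect. With only one sample per condition there is no way to estimate within-condition variability from such a table.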