Question

ATACseq and HiC sequencing

0

7.6 years ago

Floris Brenk ★ 1.0k

Hi there,

We are looking into ATAC-seq and Hi-C sequencing of some human-derived samples. Does anyone have any recommendations about sequencing depth, e.g. how many samples to pool per lane on an Illumina HiSeq? I have been looking around, but there is no real data available on this. Also, are there any best practices for analysing data from these techniques? Perhaps it would be worth adding a chapter to the new biostarsbook?

atac hic • 6.0k views

ADD COMMENT • link updated 7.6 years ago by Devon Ryan 105k • written 7.6 years ago by Floris Brenk ★ 1.0k

1

There are many studies out there using ATAC and HiC. You can check the sequencing depth people usually go for.

The best way to sequence ATAC is 50 bp PE, usually to about 60 million reads. If you are interested in generating TF footprints, you might need hundreds of millions of reads. I am less sure about Hi-C depth. In general, you might need at least 100 million mapped and valid pairs at the end, which might require a sequencing depth of 300-400 million PE reads.
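To make the Hi-C arithmetic above explicit, here is a minimal Python sketch. The function name `raw_pairs_needed` and the valid-pair fractions (25% and 33%) are illustrative assumptions, chosen only because they reproduce the 300-400 million figure; the real fraction depends on the library and should be measured on a pilot run.

```python
import math


def raw_pairs_needed(target_valid_pairs: int, valid_fraction: float) -> int:
    """Raw read pairs to sequence so that about `target_valid_pairs` survive filtering.

    `valid_fraction` is the assumed share of raw pairs that end up mapped and valid.
    """
    if not 0 < valid_fraction <= 1:
        raise ValueError("valid_fraction must be in (0, 1]")
    return math.ceil(target_valid_pairs / valid_fraction)


# Assumed yields, not measured values: 25% and 33% of raw pairs are valid.
for fraction in (0.25, 0.33):
    needed = raw_pairs_needed(100_000_000, fraction)
    print(f"{fraction:.0%} valid -> {needed:,} raw pairs")
```

The same division applies to ATAC-seq if you replace the valid fraction with the share of reads you expect to keep after filtering.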

ADD REPLY • link 7.6 years ago by GouthamAtla 12k

1

No clue about Hi-C, but for ATAC-seq it really depends on the study's aim. What do you plan to do? For peak calling and differential accessibility analysis, people typically aim for 25-30 million reads after all filtering, and at least two replicates. How much raw sequencing that takes depends on the percentage of mitochondrial reads in the sample, which (if you do not deplete them) can be up to 80%. Also, the complexity per library is limited, so sequencing (in my experience) far beyond 30 million filtered reads will mainly pick up duplicates; additional replicates are necessary for increased complexity.

As for analysis, it is typically treated like ChIP-seq. Call peaks with MACS (or any tool of choice, with any shifting model disabled), make a consensus peak list over all conditions and get a count matrix, which then goes into DESeq2 or similar frameworks.

There are some tools from the Greenleaf lab: chromVAR for linking TF motif presence with chromatin accessibility, and NucleoATAC for nucleosome position calling. Especially the latter (as far as I understood the paper, which demonstrates the technique using the (tiny) yeast genome) requires quite a lot of replicates and deeply sequenced samples.
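The "consensus peak list, then count matrix" step can be sketched in plain Python as below. This is only a toy illustration under stated assumptions: peaks are half-open `(chrom, start, end)` tuples, reads are reduced to single cut-site positions, and the helper names `merge_peaks` and `count_matrix` are hypothetical. In practice a dedicated tool would do this on the peak and alignment files.

```python
from collections import defaultdict


def merge_peaks(peak_sets):
    """Union peaks from all samples; overlapping or directly adjacent peaks are merged."""
    by_chrom = defaultdict(list)
    for peaks in peak_sets:
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))

    consensus = []
    for chrom in sorted(by_chrom):
        merged = []
        for start, end in sorted(by_chrom[chrom]):
            if merged and start <= merged[-1][1]:
                # Extend the previous interval instead of opening a new one.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        consensus.extend((chrom, s, e) for s, e in merged)
    return consensus


def count_matrix(consensus, cut_sites_per_sample):
    """Rows are consensus peaks, columns are samples, values are cut sites in [start, end)."""
    return [
        [
            sum(1 for c, pos in sites if c == chrom and start <= pos < end)
            for sites in cut_sites_per_sample
        ]
        for chrom, start, end in consensus
    ]


# Made-up toy data for two samples.
sample_a = [("chr1", 100, 200), ("chr1", 500, 600)]
sample_b = [("chr1", 150, 250), ("chr2", 10, 50)]
consensus = merge_peaks([sample_a, sample_b])

cuts_a = [("chr1", 120), ("chr1", 240), ("chr1", 550)]
cuts_b = [("chr1", 180), ("chr2", 20), ("chr2", 60)]
matrix = count_matrix(consensus, [cuts_a, cuts_b])
```

The resulting matrix (one row per consensus peak, one column per sample) has the shape that count-based frameworks such as DESeq2 expect as input.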

ADD REPLY • link 6.7 years ago by ATpoint 88k

1

HiC will depend on the cutter you end up using.

ADD REPLY • link 7.6 years ago by Devon Ryan 105k

0

I was thinking about this method, Hi-C 2.0. The paper and protocol are very clear, but there are no guidelines on sequencing depth, just one sentence: "Datasets from Rao et al. were selected solely based on their read depth that was comparable to datasets obtained with Hi-C 2.0 (100–200 million reads)"

ADD REPLY • link 7.6 years ago by Floris Brenk ★ 1.0k

1

It also depends on the resolution of the Hi-C domains you want to detect. According to this guide, 100 million filtered reads are sufficient for a resolution of 40 kb.

ADD REPLY • link 7.6 years ago by ATpoint 88k

0

Great, thanks all for your help, much appreciated! One more question: is there any need for a PhiX spike-in when sequencing Hi-C or ATAC libraries, or can the libraries just go on the HiSeq without issues?

ADD REPLY • link 7.6 years ago by Floris Brenk ★ 1.0k

score 0 · Answer 1 · 2017-10-10

Most of our recent Hi-C runs have been on flies, using DpnII as the cutter, and there we ideally aim for ~100 million reads. Note that this is for a VERY high resolution map (think <1 kb resolution). To put that in human terms, you would need to sequence a HiSeq flow cell or two. It's more likely to be within your budget to use something like HindIII, which cuts much less frequently, and then scale down your sequencing depth to match that (see the link from ATpoint above).
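One rough way to read the fly-to-human comparison is to assume that, at a fixed resolution, the required depth grows linearly with genome size. That is a simplification not stated in this thread, and the genome sizes below are approximate round figures supplied for illustration; `scale_reads_by_genome` is a hypothetical helper.

```python
# Approximate genome sizes in base pairs (assumed round figures, not from the thread).
FLY_GENOME_BP = 140_000_000
HUMAN_GENOME_BP = 3_100_000_000


def scale_reads_by_genome(reads: int, from_genome_bp: int, to_genome_bp: int) -> int:
    """Scale a read count linearly with genome size (assumes the same target resolution)."""
    return round(reads * to_genome_bp / from_genome_bp)


# ~100 million reads in fly, rescaled to a human-sized genome.
human_reads = scale_reads_by_genome(100_000_000, FLY_GENOME_BP, HUMAN_GENOME_BP)
print(f"{human_reads:,} reads")
```

Under these assumptions the estimate lands in the low billions of reads, which is why a less frequent cutter and a coarser target resolution are the more affordable route.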