Should I be aligning to the Masked or Unmasked genome for ATAC-seq samples?
7.9 years ago

I'm aligning reads for an ATAC-seq project and I get vastly different alignment rates when running a bowtie --very-sensitive alignment against the hard-masked vs. the unmasked genome (50% vs. 90%). This is being done in dog (CanFam3.1). Can someone explain the differences between the two and why this might occur? Thanks!

ATAC-seq Genome alignment • 3.9k views
Does "hardMasked" mean repeat-masked?

Hard repeat masking converts all annotated repeats to Ns, which prevents reads from mapping to those sites (soft masking, by contrast, only lowercases them). See RepeatMasker.

I would generally align to an unmasked genome and strip out the repeats later.
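
To make the hard-masking point concrete, here is a minimal Python sketch of what turning repeats into Ns does to a sequence. It assumes a soft-masked input where repeats are lowercase (the UCSC convention); the function names `hard_mask` and `masked_fraction` and the toy sequence are made up for illustration and are not part of RepeatMasker or bowtie.

```python
def hard_mask(seq: str) -> str:
    """Replace soft-masked (lowercase) bases with N, as hard masking does."""
    return "".join("N" if base.islower() else base for base in seq)


def masked_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are N (case-insensitive)."""
    if not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


# Toy example: the lowercase stretch stands in for an annotated repeat.
soft = "ACGTacgtacgtACGT"
hard = hard_mask(soft)
print(hard)                   # ACGTNNNNNNNNACGT
print(masked_fraction(hard))  # 0.5
```

Reads that originate from the N stretch have nothing left to match in the hard-masked reference, so a large repeat fraction in the genome can plausibly account for a much lower alignment rate.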

Yes, repeats. Okay, thanks. What's a good tool for stripping out repeats post-alignment?

It depends on what you're analysing. Is this straightforward ATAC-seq for detecting DNase HS sites?

Generally, you would not detect peaks within repeats, so it won't be a major problem. If you find that a substantial proportion of your peaks land in repeats, you could use something like BedTools Intersect to filter out peaks that sit on repeats (you can download the repeat annotation using the UCSC Table Browser).
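
As a rough illustration of that filtering step, the sketch below mimics in plain Python what `bedtools intersect -v -a peaks.bed -b repeats.bed` reports: entries in A with no overlap in B. The file names, the function names and the toy intervals are hypothetical, and this naive all-against-all loop is only meant to show the logic; for real data bedtools itself is the better choice.

```python
from typing import List, Tuple

# (chrom, start, end) using BED-style 0-based, half-open coordinates.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if a and b are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def drop_peaks_in_repeats(peaks: List[Interval],
                          repeats: List[Interval]) -> List[Interval]:
    """Keep only the peaks that overlap no repeat interval."""
    return [p for p in peaks if not any(overlaps(p, r) for r in repeats)]


# Toy intervals, made up for illustration.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
repeats = [("chr1", 150, 300), ("chr2", 200, 400)]

kept = drop_peaks_in_repeats(peaks, repeats)
print(kept)  # [('chr1', 500, 600), ('chr2', 100, 200)]
```

The chr2 peak is kept because half-open intervals that merely touch (end 200, start 200) do not share a base.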

Yes, just a pilot study to look at genome-wide chromatin accessibility, aka DNase I HS.

And thank you so much. I'll look into that. I appreciate your response.
