ATAC-seq peaks centered between TSS and TES
12 weeks ago
buffealo ▴ 130

I am analyzing ATAC-seq data and am trying to confirm that my reads are enriched around TSSs. However, I see enrichment between the TSS and TES for all my samples in two cell lines.

The code I ran:

computeMatrix reference-point \
  -S m7_rep1.bw \
  -R hg38.TSS.bed \
  --beforeRegionStartLength 1000 \
  --regionBodyLength 2000 \
  --afterRegionStartLength 1000 \
  --binSize 100 \
  -o m7_rep1_matrix.gz
plotHeatmap \
  -m m7_rep1_matrix.gz \
  -out m7_rep1_TSS_enrichment_heatmap.png

[Image: enrichment heatmap]

I generated hg38.TSS.bed with:

awk '$3 == "transcript" {
         if ($7 == "+") print $1 "\t" $4-1000 "\t" $4+1000;
         else if ($7 == "-") print $1 "\t" $5-1000 "\t" $5+1000;
     }' hg38.refGene.gtf > hg38.TSS.bed

I am also not sure about the genes shown in black.

What am I doing wrong, or what else could be going wrong?

I appreciate any help, thank you

compute-matrix atac atac-seq tss-enrichment enrichment-graph • 866 views

To troubleshoot, maybe try separating the transcripts by strand, instead of mixing the two. In your awk statement, for instance, write one or the other strand if-else case and visualize that. This may help highlight the problem.


I concluded that I had configured the TSSs incorrectly; thank you so much for the pointer.

12 weeks ago
ATpoint 86k

I think this is a simple labelling issue. I do not know DeepTools well, but based on your awk you simply give it a chr-start-end BED file, so it doesn't know about TSS and TES. All it sees is a 2000 bp interval, and you seem to tell it to extend this by an additional 1 kb in each direction. The plot shows exactly this. Your actual TSS (based on the BED file) is centered and perfectly enriched for signal. The borders of the intervals in your BED file are labelled as TSS/TES (maybe a DeepTools default?) and the additional 1 kb stretches up/downstream are labelled -1.0 and 1.0. I assume the only issue is that you need to change the labelling.
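As a quick check of the interval width, here is a minimal sketch that runs the plus-strand arithmetic from the question's awk on a single made-up GTF line. The transcript coordinates and the "toy" source are hypothetical; only the -1000/+1000 arithmetic comes from the question.

```shell
# One hypothetical plus-strand transcript starting at position 5000 (toy GTF line).
line=$(printf 'chr1\ttoy\ttranscript\t5000\t9000\t.\t+\t.\tgene_id "A";')

# Same arithmetic as the plus-strand branch in the question: TSS - 1000 to TSS + 1000.
bed=$(printf '%s\n' "$line" | awk '$3 == "transcript" { if ($7 == "+") print $1 "\t" $4-1000 "\t" $4+1000 }')

# Width of the resulting BED interval (end minus start).
width=$(printf '%s\n' "$bed" | awk '{ print $3 - $2 }')

echo "$bed"
echo "interval width: $width bp"
```

Each region in the BED file is therefore a 2 kb window with the TSS in the middle, not a single TSS position.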


I see your point. This time I set things up as follows:

TSSs: to target the exact TSSs, I used a single 1 bp interval at each TSS:

 awk '$3 == "transcript" { if ($7 == "+") { print $1"\t"$4-1"\t"$4 } else { print $1"\t"$5-1"\t"$5 } }' hg38.refGene.gtf > exact.TSS

Heatmap generation: I used computeMatrix with mostly default settings, except that I added --referencePoint TSS, and I chose a calmer color map for plotHeatmap with --colorMap 'Greens':

computeMatrix reference-point -S m7_rep1.bw -R exact.TSS --referencePoint TSS -o m7_rep1_matrix_TSS.gz
plotHeatmap -m m7_rep1_matrix_TSS.gz --colorMap 'Greens' -out m7_rep1_TSS_enrichment_heatmap_exact.png

I think I am now considering only the exact transcription start sites (correct positioning) and plotting enrichment around the TSS (correct labeling).

Now the result is:

[Image: heatmap]
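A possible refinement, sketched under assumptions: as far as I know, deepTools can use the strand column of a BED6 file to orient regions, so it may be worth keeping the strand when writing the 1 bp TSS intervals. The toy GTF lines below are made up, and the "." name and 0 score are placeholder values.

```shell
# Toy GTF with one transcript per strand (hypothetical coordinates).
gtf=$(printf 'chr1\ttoy\ttranscript\t1000\t5000\t.\t+\t.\tgene_id "A";\nchr1\ttoy\ttranscript\t7000\t9000\t.\t-\t.\tgene_id "B";\n')

# Emit a 1 bp BED6 interval at each TSS, keeping the strand in column 6.
tss_bed=$(printf '%s\n' "$gtf" | awk 'BEGIN { FS = OFS = "\t" }
  $3 == "transcript" {
    tss = ($7 == "+") ? $4 : $5      # GTF is 1-based; TSS is the start on +, the end on -
    print $1, tss - 1, tss, ".", 0, $7
  }')

printf '%s\n' "$tss_bed"
```

On a real annotation the same awk program would read hg38.refGene.gtf instead of the toy input.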


Looks reasonable. The missing values probably come from a bigWig that does not store regions without signal as zeros.


Thank you so much

12 weeks ago
gglim ▴ 220

If --missingDataAsZero was not set, such cases will be colored in black by default.

See the plotHeatmap docs.
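A sketch of how that flag could be added to the commands posted earlier in this thread. File names are the ones from the thread; the guard around the commands is only there so the snippet does nothing when deepTools or the inputs are missing.

```shell
# File names are the ones used earlier in this thread; adjust to your data.
bw=m7_rep1.bw
regions=exact.TSS

if command -v computeMatrix >/dev/null 2>&1 && [ -f "$bw" ] && [ -f "$regions" ]; then
  # Treat bins without signal in the bigWig as 0 instead of missing data.
  computeMatrix reference-point \
    -S "$bw" -R "$regions" \
    --referencePoint TSS \
    --missingDataAsZero \
    -o m7_rep1_matrix_TSS.gz
  plotHeatmap -m m7_rep1_matrix_TSS.gz --colorMap Greens \
    -out m7_rep1_TSS_enrichment_heatmap_exact.png
  status=ran
else
  status=skipped
  echo "deepTools or input files not found; nothing was run" >&2
fi
```

Whether zeros are the right interpretation of missing signal depends on how the bigWig was generated, so the black rows may still be worth inspecting before hiding them.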


Thank you so much.
