How to speed up computeMatrix of deeptools?
4.5 years ago

When I run computeMatrix on all gene regions, extended by 0.5 Mb on either side of the TSS, for replication timing signal analysis, the job has been running for two days. Is there any setting that can speed this up, or an alternative tool that can do the same thing?

The code is as follows:

BIN=20000   # only used in the output file name; not passed to computeMatrix
WIN=500000
RBL=6000    # not used in this command
R=hg38GencodeV32.bed
computeMatrix reference-point \
    --referencePoint TSS \
    -S U2OS_RT_R2-X_z_score_normal.bedgraph.bw \
       Clone110_RT_R2-X_z_score_normal.bedgraph.bw \
       FUSClone110_RT_R2-X_z_score_normal.bedgraph.bw \
    -R $R \
    -a $WIN -b $WIN \
    --skipZeros \
    --missingDataAsZero \
    -p 30 \
    -out results/R2/matrix_RT_TSS_$R"_"$BIN"_WS_"$WIN.gz
3.8 years ago
opplatek ▴ 300

I had the same problem. After some testing, it seems better to split the -R file into smaller subsets, compute a matrix for each subset with computeMatrix, and then merge the matrices with computeMatrixOperations rbind. computeMatrix reference-point (I didn't try scale-regions) doesn't scale well as the number of reference positions increases. For me, the sweet spot is around 5,000 positions per file, so I went with that (split -l 5000 in.bed in.chunks), but it might be different in your case. The merge afterwards with computeMatrixOperations rbind is very quick (28 matrices in 7 seconds). I haven't observed any differences between splitting the -R file into chunks and running it as one file.
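The split-then-merge approach described above can be sketched roughly as follows. This is only an illustration, not a tested pipeline: the helper names split_bed and matrix_by_chunks, and all file and directory names, are made up for the example, and the computeMatrix flags are simply copied from the question. It assumes deepTools is on the PATH.

```shell
# Split a BED file into chunks of at most N lines each.
# Usage: split_bed in.bed 5000 chunks/part_
split_bed() {
    local bed="$1" lines="$2" prefix="$3"
    mkdir -p "$(dirname "$prefix")"
    split -l "$lines" "$bed" "$prefix"
}

# Run computeMatrix once per chunk, then stack the per-chunk matrices row-wise.
# Usage: matrix_by_chunks chunks/part_ matrices out.gz 500000 30 a.bw b.bw ...
matrix_by_chunks() {
    local prefix="$1" matdir="$2" out="$3" win="$4" threads="$5"
    shift 5                          # remaining arguments are the bigWig files
    local chunk
    mkdir -p "$matdir"
    for chunk in "$prefix"*; do
        computeMatrix reference-point \
            --referencePoint TSS \
            -S "$@" \
            -R "$chunk" \
            -a "$win" -b "$win" \
            --skipZeros \
            --missingDataAsZero \
            -p "$threads" \
            -out "$matdir/$(basename "$chunk").gz"
    done
    # The glob expands in sorted order, which matches the aa, ab, ... suffixes
    # produced by split, so rows are stacked in the original chunk order.
    computeMatrixOperations rbind -m "$matdir"/*.gz -o "$out"
}
```

Calling split_bed in.bed 5000 chunks/part_ followed by matrix_by_chunks chunks/part_ matrices merged.gz 500000 30 sample1.bw sample2.bw would then produce a single merged.gz. The matrices directory should be empty beforehand, since every .gz file in it is passed to rbind.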

Simple results from the testing: for each "number of positions" value, I randomly sampled the BED file three times (shuf -n 5000 in.bed > in.5000.bed) and averaged the execution times (all of them were very similar, though). Per position, that works out to roughly 4 ms at 1,000 and 5,000 positions, but about 11 ms at 25,000.

number of positions (-R) | time (seconds)
-------------------------|---------------
1,000                    | 4
5,000                    | 20
10,000                   | 58
25,000                   | 275
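A timing comparison like the one above could be reproduced with something along these lines. It is a rough sketch rather than the exact commands used for the table: sample_bed and elapsed_seconds are hypothetical helper names, the file names in the commented loop are placeholders, and shuf is assumed to be the GNU coreutils version.

```shell
# Draw n random lines from a BED file.
# Usage: sample_bed in.bed 5000 in.5000.bed
sample_bed() {
    local bed="$1" n="$2" out="$3"
    shuf -n "$n" "$bed" > "$out"
}

# Run an arbitrary command and print the wall-clock time it took, in whole seconds.
# The command's stdout is discarded so that only the elapsed time is printed.
elapsed_seconds() {
    local start end
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo $(( end - start ))
}

# Example loop (placeholder file names; needs deepTools installed):
# for n in 1000 5000 10000 25000; do
#     for rep in 1 2 3; do
#         sample_bed in.bed "$n" "in.$n.$rep.bed"
#         t=$(elapsed_seconds computeMatrix reference-point --referencePoint TSS \
#             -S signal.bw -R "in.$n.$rep.bed" -a 500000 -b 500000 \
#             -out "test.$n.$rep.gz")
#         printf '%s\t%s\t%s\n' "$n" "$rep" "$t"
#     done
# done
```

The loop prints one line per replicate (positions, replicate, seconds); averaging the three replicates for each size gives one number per row of the table.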