- PanDepth is a high-performance tool for calculating coverage in sequencing data, outperforming other tools in speed for both BAM and CRAM-format alignment files, regardless of read length.
- PanDepth accepts sorted or unsorted BAM and CRAM-format alignment files, together with GTF/GFF/BED-formatted interval files or a specific window size.
- PanDepth is memory efficient, making it an attractive choice for large-scale genomic data analysis.
- The depth and coverage statistics reported by PanDepth are completely consistent with those of samtools.
The PanDepth code and manual are available on GitHub.
Figure: Computation time of seven software tools on 150 GB of sequencing reads at different thread counts for genome coverage calculation.
Can PanDepth output the base coverage for all positions?
I regret to inform you that PanDepth does not support outputting the base coverage for all positions, because this process is extremely time-consuming and the output file is very large. As not every base position requires in-depth analysis, you can use the ‘-w 200’ parameter to divide the whole genome into non-overlapping 200 bp windows. Then, based on the results for these windows, you can select the ones you are interested in and use tools like 'samtools depth' to output the per-base coverage within them.
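The window-then-zoom workflow described above can be sketched in Python. This is only an illustration under stated assumptions: the `windows` tuples, the `min_mean_depth` threshold, and the file name `sample.bam` are hypothetical, and the exact layout of PanDepth's window output is not assumed here. The only external options used are `samtools depth -a` (report all positions, including zero depth) and `-r` (restrict to a region).

```python
import shlex

def depth_commands(windows, bam_path, min_mean_depth):
    """Build one `samtools depth` command per window of interest.

    `windows` is a hypothetical list of (chrom, start, end, mean_depth)
    tuples, e.g. parsed by the user from a window-level report.
    Coordinates are treated as 1-based and inclusive, as samtools expects.
    """
    commands = []
    for chrom, start, end, mean_depth in windows:
        if mean_depth < min_mean_depth:
            continue  # skip windows that are not of interest
        region = f"{chrom}:{start}-{end}"
        commands.append(["samtools", "depth", "-a", "-r", region, bam_path])
    return commands

# Hypothetical window statistics (not real PanDepth output).
windows = [
    ("chr1", 1, 200, 0.0),
    ("chr1", 201, 400, 35.2),
    ("chr2", 1, 200, 41.0),
]
cmds = depth_commands(windows, "sample.bam", min_mean_depth=30.0)
for cmd in cmds:
    print(shlex.join(cmd))
```

Each printed line is a shell command that could be run separately, or passed as a list to `subprocess.run`, to obtain per-base depth for just the selected windows.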
There are datatypes where this kind of coverage calculation would be useful; even if time-consuming, if your toolkit scales to that kind of analysis it would be preferable to more manageable samtools/pysam-based approaches.
We greatly appreciate your suggestion. In the forthcoming version, we will incorporate a feature to report the depth of coverage across all positions.
Thank you very much for your suggestion. The latest version of PanDepth (v2.21) now supports the output of depth for all positions.
Thank you for this tool, as it served my purpose in a very short period of time. As I am new to bioinformatics, I do not fully understand the output file. I used the BED and BAM files as input and got the result. What do the total depth, coverage, and mean depth mean, and how are they calculated? It would be helpful to know this.
Thank you very much for using PanDepth. PanDepth is a tool that calculates the coverage of alignment regions by extracting chromosome names, alignment start positions, and CIGAR tags from alignment files, and then merges the coverage information of each extracted read to obtain the final output.
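As a rough illustration of the idea in this paragraph, the sketch below accumulates per-base depth from a read's start position and CIGAR string. It is not PanDepth's actual implementation; the function name `add_read`, the 0-based start coordinate, and the choice not to count deleted or skipped bases (`D`, `N`) toward depth are assumptions made for the example.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_COVERING = {"M", "=", "X"}  # consume reference and count as covered
REF_SKIPPING = {"D", "N"}       # consume reference, not counted here (assumption)

def add_read(depth, start, cigar):
    """Add one aligned read to a per-base depth list.

    `depth` holds one counter per reference position of a single
    chromosome; `start` is the 0-based leftmost aligned position.
    """
    pos = start
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in REF_COVERING:
            for i in range(pos, min(pos + length, len(depth))):
                depth[i] += 1
            pos += length
        elif op in REF_SKIPPING:
            pos += length
        # I, S, H and P do not consume reference positions.
    return depth

depth = [0] * 14
add_read(depth, 2, "5M2D3M")  # covers positions 2-6 and 9-11
add_read(depth, 4, "3S6M")    # soft clip ignored; covers positions 4-9
print(depth)
```

Merging reads is then just repeated calls on the same `depth` list; a production tool would use a more compact representation than one Python integer per base.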
"Total depth" refers to the sum of the sequencing depths of all bases in a given region.
"Coverage" represents the proportion of positions in a genome or specific region that are covered by at least one sequencing read. It is typically expressed as a percentage; for example, a coverage of 95% for a region indicates that the sequencing reads cover 95% of the positions in that region.
"Mean depth" denotes the average sequencing depth at each position within a specified region. It is calculated by dividing the total depth by the number of covered positions.
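The three definitions above can be made concrete with a small worked example. This sketch follows the definitions exactly as stated in this answer; the function name `region_stats` and the per-base depth values are hypothetical.

```python
def region_stats(depths):
    """Compute summary statistics from per-base depths of one region.

    Returns (total_depth, covered_sites, coverage_percent, mean_depth),
    with mean depth defined as in the text above: total depth divided
    by the number of covered positions.
    """
    total_depth = sum(depths)
    covered_sites = sum(1 for d in depths if d > 0)
    coverage_percent = 100.0 * covered_sites / len(depths) if depths else 0.0
    mean_depth = total_depth / covered_sites if covered_sites else 0.0
    return total_depth, covered_sites, coverage_percent, mean_depth

# Hypothetical per-base depths for a 10 bp region.
depths = [0, 0, 4, 4, 6, 4, 0, 2, 0, 0]
total_depth, covered_sites, coverage_percent, mean_depth = region_stats(depths)
print(total_depth, covered_sites, coverage_percent, mean_depth)
```

Here the total depth is 20, five of the ten positions are covered (50% coverage), and the mean depth over covered positions is 4.0. Dividing by the full region length instead would give 2.0, so it is worth checking which denominator a given report uses.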