Coverage NGS reads
2
0
Entering edit mode
3.7 years ago
Sabeen ▴ 30

Hello, I would be really thankful if someone can help me.

I am trying to calculate coverage for my paired end reads. I want to calculate percent coverage per gene , per exon and per exome. i want it like following.( just an example)

**coverage**                     **percentage of analysable target bases with** 
at least 1 read                        98
at least 5 reads                      97 
at least 50 reads                    96

and so one

I have used the samtools depth function samtools depth -a inputfile.bam -o outputfile

it is giving me a file very big (32GB and more) in the following form

chr1    1   0
chr1    2   0
chr1    3   0
chr1    4   0
chr1    5   0
chr1    6   0
chr1    7   0
chr1    8   0
chr1    9   0
chr1    10  0
chr1    11  0
chr1    12  0
chr1    13  0
chr1    14  0
chr1    15  0
chr1    16  0
chr1    17  0
chr1    18  0
chr1    19  0
chr1    20  0
chr1    21  0
chr1    22  0
chr1    23  0
chr1    24  0
chr1    25  0
chr1    26  0
chr1    27  0
chr1    28  0

I also tried GATK but couldn't use it as its still in its beta stage. looking forward for some guidelines.

Regards, sabeen

NGS samtools coverage • 2.4k views
ADD COMMENT
0
Entering edit mode

Did you go through these: ?

How to calculate read depth and coverage for whole exome sequencing?

Calculating Exome Coverage

Calculate Per Exon/Per Gene Coverage

The samtools output is normal and expected, it is the per-base coverage for every single nucleotide in the genome.

ADD REPLY
0
Entering edit mode

Thanks everyone for the help. Unfortunately i am still not able to solve my problems. I tried the samtools coverage as well as bedtools geneomecov. following is the code i used and the screen shot of some lines of result. looks like samtools coverage is giving me coverage per chromosome. however i need it per exon or per gene or per genome. in 1X , 2X .....100X form.

samtools coverage input.bam enter image description here

I also tried samtools bedcov following is the result chromosome wise . and not percentage coverage. samtools bedcov region.bed input.bam enter image description here

Also i used bedtools genomecov

bedtools genomecov -i input.bed -g hg19.fa 

(inplace of .genome i used hg19.fa. plz guide how to get .genome file)

Error: The genome file /Users/sabeen/NGS/2humanbrca/hg19.fa has no valid entries (are you sure it's a 2-column bedtools genome file).

i also tried samtools stats and samtools depth. however not getting desire result form.

i would be really thankfull if anyone ca guide me further. i am really stuck at this point.

thanks in advance

reagards, sabeen

ADD REPLY
0
Entering edit mode

When using bedtools genomecov for your purposes, the genome file is not needed. This simple command will give you the sample coverage in bedgraph format, which you could later intersect with your features of interest:

bedtools genomecov -bga -ibam file.bam

Again, considering your initial question, I must insist in using mosdepth, as it is indeed intended to answer queries like the one you have in a very efficient manner.

ADD REPLY
0
Entering edit mode

Thanks every one for the help. finally able to get the result with all the help here.

ADD REPLY
0
Entering edit mode
3.7 years ago

Following up on ATpoint comment, read those, but as a quick pointer I believe

bedtools genomecov

will be the closest to what you need

https://bedtools.readthedocs.io/en/latest/content/tools/genomecov.html

ADD COMMENT
2
Entering edit mode

bedtools has always been my first stop when dealing with regions and coverages, but if the regions or features of interest (genes, exons,...) are described in bedfiles, the most efficient way I've found to do this is by using mosdepth, which not only is lightning fast, but also allows defining multiple thresholds in a single command:

mosdepth -x -Q 1 -t $THREADS -T 1,5,50 -b genes.bed $PREFIX $BAMFILE
mosdepth -x -Q 1 -t $THREADS -T 1,5,50 -b exons.bed $PREFIX $BAMFILE
ADD REPLY
0
Entering edit mode
3.7 years ago

Dear Sabeen,

you are outputting everything including the ''zero covered'' region, I do agree with Istvan in regards to bedtools. if you want to do it with samtools I highly recommend excluding those with zero-depth. I believe using samtools bedcov will fit your purpose. see http://www.htslib.org/doc/samtools.html

ADD COMMENT

Login before adding your answer.

Traffic: 1963 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6