How to calculate the coverage of a list of genes in whole exome data
19 months ago
anii • 0

Hi all,

I have a BAM file of human whole exome data. I want to check the coverage of a list of genes in it. Basically, I want to get output like this:

Gene        Percentage of coding region covered
A1BG        100%
A1BG-AS1    100%
A1CF        99.94%
NEK8        100%
FANCI       100%
A2ML1       100%
bam gene coverage • 2.3k views

If this data was generated with a specific capture kit, the manufacturer may provide BED files for the regions the kit covers. You should use that file.


Yes, they used the Agilent kit and I have downloaded the BED file from the Agilent site. Should I use this BED file, or should I get the file from the manufacturer?


Use the Agilent file. Make sure it is for the genome build you aligned against; otherwise the coordinates will not match and the results will be totally off.


Okay, thank you GenoMax.

19 months ago
Shred ★ 1.5k

Create a BED file of your genes of interest and then use mosdepth as described at https://github.com/brentp/mosdepth#exome-example
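A minimal sketch of that step in Python, assuming mosdepth is installed and on the PATH. The file names (`sample.bam`, `genes.bed`, `sample`) and the helper names are hypothetical placeholders; the flags (`--by`, `--thresholds`, `--no-per-base`) are the ones documented in the mosdepth README, and the threshold values match those used later in this thread.

```python
import subprocess


def mosdepth_command(bam, bed, prefix, thresholds=(1, 5, 10, 15, 20, 30)):
    """Build the mosdepth argument list for per-region threshold counts."""
    return [
        "mosdepth",
        "--no-per-base",                      # skip the large per-base output
        "--by", bed,                          # regions to summarise (BED file)
        "--thresholds", ",".join(str(t) for t in thresholds),
        prefix,                               # prefix for the output files
        bam,                                  # indexed BAM to analyse
    ]


def run_mosdepth(bam, bed, prefix):
    """Run mosdepth; raises CalledProcessError if it exits with an error."""
    subprocess.run(mosdepth_command(bam, bed, prefix), check=True)


# Hypothetical file names, shown only to illustrate the resulting command line.
print(" ".join(mosdepth_command("sample.bam", "genes.bed", "sample")))
```

With these arguments mosdepth should write `sample.thresholds.bed.gz`, which lists, for each BED region, the number of bases covered at or above each threshold.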


Shred, can you please elaborate on the steps to create the BED file of genes of interest?


Use the UCSC Table Browser. You can upload a list of identifiers (RefSeq IDs, I believe) and it will give you a BED file of the corresponding coordinates.


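A BED file is a tab-separated file in which each line defines a region. It has three required columns: chromosome, start (0-based) and end. An optional fourth column holds a name for the region, such as the gene symbol.

You could generate it using the method pointed out by @Trivas, or using BioMart: https://www.ensembl.org/biomart/martview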

https://en.wikipedia.org/wiki/BED_(file_format)
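If the coordinates come from BioMart rather than from a ready-made BED export, they need a small conversion first. The sketch below assumes the input is 1-based and inclusive with chromosome names that lack the "chr" prefix (as Ensembl reports them), and that the BAM uses UCSC-style names such as "chr1". The function name `to_bed_line` is hypothetical, and special cases such as the mitochondrial chromosome ("MT" versus "chrM") are not handled.

```python
def to_bed_line(chrom, start_1based, end_1based, name):
    """Convert 1-based inclusive coordinates to one tab-separated BED line."""
    if start_1based < 1 or end_1based < start_1based:
        raise ValueError("expected 1 <= start <= end")
    if not chrom.startswith("chr"):
        chrom = "chr" + chrom          # assumption: BAM uses UCSC-style names
    # BED start is 0-based, BED end is exclusive, so only the start shifts.
    return "\t".join([chrom, str(start_1based - 1), str(end_1based), name])


# OR4F5 interval from this thread, written here in 1-based form.
print(to_bed_line("1", 69091, 70008, "OR4F5"))
```

A BED file exported directly from the UCSC Table Browser is already 0-based and should not be shifted again.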


I got this output using mosdepth. It gives me the number of bases covered at 1X, 5X, 10X, 15X, 20X and 30X. Some genes appear more than once, as you can see in the example. How can I convert these counts into percentages?

Example:

#chrom  start    end      region     1X     5X     10X    15X    20X    30X
chr1    69090    70008    OR4F5      918    918    918    918    918    918
chr1    134772   140566   LOC729737  1985   0      0      0      0      0
chr1    182387   184878   DDX11L17   2464   1214   860    550    412    273
chr1    187890   187958   MIR6859-1  68     68     68     68     68     68
chr1    1020119  1056114  AGRN       22418  15841  14836  14026  13286  12229
chr1    1020119  1056114  AGRN       22418  15841  14836  14026  13286  12229
chr1    1033992  1056114  AGRN       17233  14700  13867  13225  12695  11715
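
One possible way to turn these counts into percentages, as a rough sketch rather than a definitive answer: each threshold column is the number of bases in the region covered at or above that depth, so dividing by the region length (end minus start) and multiplying by 100 gives the percentage. The function name `percent_covered` is hypothetical; exact duplicate rows are skipped, but overlapping rows for the same gene (the two different AGRN intervals) are reported separately.

```python
def percent_covered(lines):
    """Return (region, start, end, {threshold: percent}) per unique region."""
    rows = iter(lines)
    labels = next(rows).split()[4:]          # threshold labels, e.g. "1X"
    seen = set()
    results = []
    for line in rows:
        fields = line.split()
        if len(fields) < 5:
            continue                         # skip blank or malformed lines
        chrom, name = fields[0], fields[3]
        start, end = int(fields[1]), int(fields[2])
        key = (chrom, start, end, name)
        if key in seen:
            continue                         # identical region listed twice
        seen.add(key)
        length = end - start                 # BED intervals are half-open
        if length <= 0:
            continue
        percents = {
            label: round(100.0 * int(count) / length, 2)
            for label, count in zip(labels, fields[4:])
        }
        results.append((name, start, end, percents))
    return results


# A few rows copied from the example above.
SAMPLE = [
    "#chrom start end region 1X 5X 10X 15X 20X 30X",
    "chr1 69090 70008 OR4F5 918 918 918 918 918 918",
    "chr1 134772 140566 LOC729737 1985 0 0 0 0 0",
    "chr1 1020119 1056114 AGRN 22418 15841 14836 14026 13286 12229",
    "chr1 1020119 1056114 AGRN 22418 15841 14836 14026 13286 12229",
    "chr1 1033992 1056114 AGRN 17233 14700 13867 13225 12695 11715",
]

for name, start, end, pct in percent_covered(SAMPLE):
    print(name, start, end, pct["1X"], pct["20X"])
```

To read the real output, the lines can be taken from `gzip.open("sample.thresholds.bed.gz", "rt")`, where the prefix `sample` is a placeholder. Note that these rows span whole genes or transcripts, so the percentage is relative to that span; for a percentage of the coding region, the BED file would need to contain exon or CDS intervals, and overlapping intervals would probably need merging (for example with `bedtools merge`) before running mosdepth.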

Never post screenshots here. Tabular data can be pasted using the code format. Which BED file did you use for these regions?


I used the hg38 BED file from UCSC.

