Entering edit mode
3.1 years ago
rebeliscu
▴
60
I'm trying to calculate the depth of coverage from my WXS data.
Using Qualimap, I first used as the feature file the Gencode human genome (release 38) .gtf file associated with the genome I aligned to:
feat="gencode.v38.primary_assembly.annotation.gtf"
for ea in *bam
do $qualimap bamqc \
--java-mem-size=20G \
-bam $ea \
--feature-file $feat
done;
... and the coverage was ~6.8.
I then subset the .gtf file to exonic regions only:
grep 'transcript_type "protein_coding"' gencode.v38.primary_assembly.annotation.gtf |
awk '($3=="exon") {printf("%s\t%s\t%s\n",$1,int($4)-1,$5);}' |
sort -T . -k1,1 -k2,2n |
bedtools merge |
awk 'BEGIN{OFS="\t"}{print $1,$2,$3,$4,0,"."}' \
> gencode.v38.primary_assembly.annotation_exome.bed
feat="gencode.v38.primary_assembly.annotation_exome.bed"
for ea in *bam
do $qualimap bamqc \
-bam $ea \
--feature-file $feat
done;
...and the coverage jumped to ~68.
Is the latter the correct coverage given that my target is the human exome?
Thanks in advance.