Coverage/depth of merged bam files
2 days ago
slzr_ • 0

Hi,

I am new to bioinformatics and am currently working with NGS data from a panel of almost 50 genes, which I am analyzing in around 100 samples.

I want to see the average coverage/depth of the sequencing for each gene. I first merged the BAM files from all samples, then checked the coverage with samtools depth, specifying the gene regions (start and end) in a BED file.

Is this the correct/best approach to see the coverage across all these samples? I don’t want the coverage for each sample, but for each gene.
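For illustration, here is a minimal Python sketch of the per-gene averaging step, assuming the three-column text output of `samtools depth` (chromosome, 1-based position, depth) and a four-column BED file with the gene name in the fourth column. The function name `mean_depth_per_gene` and the sample data are hypothetical. Positions missing from the depth output are counted as zero, since `samtools depth` skips zero-depth positions unless run with `-a`.

```python
from collections import defaultdict

def mean_depth_per_gene(depth_lines, bed_lines):
    """Return {gene: mean depth} from `samtools depth` text and BED intervals."""
    # samtools depth columns: chrom, 1-based position, depth (first BAM only here)
    depth = {}
    for line in depth_lines:
        chrom, pos, d = line.rstrip("\n").split("\t")[:3]
        depth[(chrom, int(pos))] = int(d)

    totals = defaultdict(int)
    lengths = defaultdict(int)
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, gene = line.rstrip("\n").split("\t")[:4]
        # BED is 0-based and half-open, so it covers 1-based positions start+1..end
        for pos in range(int(start) + 1, int(end) + 1):
            totals[gene] += depth.get((chrom, pos), 0)  # missing position = depth 0
            lengths[gene] += 1
    return {gene: totals[gene] / lengths[gene] for gene in totals}

# Hypothetical data: position 103 is absent from the depth output (zero depth)
depth_lines = ["chr1\t101\t10\n", "chr1\t102\t20\n", "chr1\t104\t30\n"]
bed_lines = ["chr1\t100\t104\tGENE_A\n"]
print(mean_depth_per_gene(depth_lines, bed_lines))  # {'GENE_A': 15.0}
```

Dividing by the interval length rather than by the number of reported lines keeps uncovered bases from inflating the average. Intervals that share a gene name are pooled, so overlapping intervals would be counted twice.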

samtools • 259 views

As long as all samples were aligned against the same reference, it should be fine to do what you did.


All samples were aligned against the same reference. I was afraid that by doing this I would be losing some kind of information, but thank you for the reply!


Merging does not change the files in any way, so no, you should not lose information. If your BAM files did not have read groups, you may not be able to tell which alignment record came from which sample after you merge the data.
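To make the read-group point concrete, here is a small Python sketch that tallies alignment records by their `RG:Z:` tag, assuming SAM-format text such as the output of `samtools view` on the merged file. The function name `count_by_read_group` and the example records are hypothetical. Records without an RG tag land under `None`, which is exactly the case where the sample of origin can no longer be recovered. (As far as I know, `samtools merge -r` can attach an RG tag inferred from the input file names, but check the manual for your version.)

```python
from collections import Counter

def count_by_read_group(sam_lines):
    """Count alignment records per RG tag in SAM text."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        # Optional TAG:TYPE:VALUE fields follow the 11 mandatory SAM columns
        rg = next((f[5:] for f in fields[11:] if f.startswith("RG:Z:")), None)
        counts[rg] += 1
    return counts

# Hypothetical records: two tagged with a read group, one without
sam_lines = [
    "@HD\tVN:1.6\tSO:coordinate\n",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tRG:Z:sampleA\n",
    "r2\t0\tchr1\t101\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0\tRG:Z:sampleA\n",
    "r3\t0\tchr1\t102\t60\t4M\t*\t0\t0\tACGT\tIIII\n",
]
print(dict(count_by_read_group(sam_lines)))
```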


Thank you! I wrote another post about this, but I can describe it here too in case you can help.

Initially, as I mentioned here, I merged the .bam files and utilized the samtools depth command to calculate the coverage. Then, I used a .bed file containing gene coordinates to identify the coverage for each gene. However, I later became concerned about whether this approach was appropriate, given that all samples were aligned to the same reference.

As an alternative, I calculated the coverage for each sample individually and then computed the average per-base coverage for each gene. This left me with one file containing chromosome, position (per base), coverage, and gene ID. However, the results were significantly different and much lower than with the previous approach. Now I'm confused about which one is more correct. Sorry if this is a lot of information; I am really new to this and am trying to figure out how to visualize the coverage of this sequencing run.
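One likely contributor to the difference, not confirmed in this thread: depth computed on a merged BAM counts reads from every sample at each position, so it behaves like the sum of the per-sample depths, while averaging per-sample depths divides that sum by the number of samples. The Python sketch below illustrates the arithmetic with made-up numbers; the function name `merged_and_mean_depth` and the data are hypothetical, and it ignores any filtering or depth caps that `samtools depth` may apply.

```python
def merged_and_mean_depth(per_sample_depths):
    """per_sample_depths: list of {position: depth} dicts, one per sample."""
    positions = set().union(*per_sample_depths)
    n = len(per_sample_depths)
    # A merged BAM sees the reads of all samples, so depths add up
    merged = {p: sum(s.get(p, 0) for s in per_sample_depths) for p in positions}
    # Averaging across samples divides that sum by the number of samples
    mean = {p: merged[p] / n for p in positions}
    return merged, mean

# Hypothetical depths for three samples; sample 2 has no reads at position 102
samples = [{101: 10, 102: 12}, {101: 20}, {101: 30, 102: 18}]
merged, mean = merged_and_mean_depth(samples)
print(merged[101], mean[101])  # 60 20.0
print(merged[102], mean[102])  # 30 10.0
```

Under these assumptions, with around 100 samples the merged depth would be roughly 100 times the per-sample average, so the two approaches answer different questions rather than one being wrong.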


"However, the results were significantly different and much lower compared to the previous approach."

I responded in the other thread so let us keep the discussion there.
