Bam-readcount output empty from certain regions (RNA seq bams)
2
0
Entering edit mode
7.7 years ago

Hi all,

I am using bam-readcount to find reads in a RNA seq bam at the location of variants found using RNA seq. However, I am not able to get the right output from all regions of my bam-file. For example:

I am working with mm10, and I use the following command:

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam [chr:start-stop]

1) For my C>T mutation at chromosome 4:152297391 it all seem to work fine.

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam 4:152297391-152297391
Minimum mapping quality is set to 1
WARNING: In read ST-K00128:117:HG5FKBBXX:6:1124:3336:48843: Couldn't find single-end mapping quality. Check to see if the SM tag is in BAM.
The previous warning has been emitted 1 times and will be disabled.
WARNING: In read ST-K00128:117:HG5FKBBXX:6:1124:3336:48843: Couldn't find number of mismatches. Check to see if the NM tag is in BAM.
The previous warning has been emitted 1 times and will be disabled.
4   152297391   G   7   =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  G:7:255.00:39.71:0.00:4:3:0.35:0.00:11.71:4:0.39:100.43:0.54    T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00

2) This is a location that I have 0 reads for in my RNA seq file (chr4:147058427, A>G):

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam 4:147058427-147058427
Minimum mapping quality is set to 1
4   147058427   C   0   =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00

3) Chromosome 4: 123189043 has a C>T mutation, however, bam-readcount will return nothing at all on this location:

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam 4:123189043-123189043
Minimum mapping quality is set to 1

How come i sometimes get 0 reads and sometimes empty output? I want to make sure that when I have a NA output, I really do not have any reads for this location? Is this a problem with my reference fasta file?

Thanks, Nils

RNA-Seq • 2.8k views
ADD COMMENT
2
Entering edit mode
7.7 years ago

So, I think I have figured it out by using IGV.

1) When i have reads that with good mapping quality and base quality, then I get these with bam-readcount.

2) When I have reads but these are not of good quality bam-readcount returns zeros.

3) When I do not have any reads at all for a certain location, bam-readcount returns nothing.

Is this a correct interpretation?

Thanks, Nils

ADD COMMENT
0
Entering edit mode
7.6 years ago

Yes, if there is no coverage of your variant, then bam-readcount will return nothing for that site. This includes sites for which there are no reads that meet your filtering criteria.

ADD COMMENT
0
Entering edit mode

Hi, thanks for your response.

Perhaps I missunderstand you when you say bam-readcount will return nothing for sites that do not pass my filtering criteria. How about this site? No reads here match my filtering criteria, but I still get 0s:

bam-readcount -w 1 -b 20 -q 1 -f /mnt/storage/ref_fasta/Mus_musculus.GRCm38.dna.primary_assembly.fa 4T1_0Gy_0624.Aligned.sortedByCoord.out.bam 4:147058427-147058427
Minimum mapping quality is set to 1
4   147058427   C   0   =:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  A:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  C:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  G:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  T:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00  N:0:0.00:0.00:0.00:0:0:0.00:0.00:0.00:0:0.00:0.00:0.00
ADD REPLY
0
Entering edit mode

The details are fuzzy in my mind at the moment, but I suspect there is a difference between sites with coverage, where all reads fail quality checks (0) vs sites without any coverage (not output).

ADD REPLY

Login before adding your answer.

Traffic: 1673 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6