How to extract read counts at the mutation locations
2
0
Entering edit mode
17 months ago

I have a scDNAseq dataset having multiple FASTQ files for multiple single cells.

samtools was used after aligning FASTQ files with BWA to hg19 reference to produce bam files.

I have already identified 36 SNV mutation sites and

I want to use mpileup to extract read counts (Total read count and variant read count) at those 36 positions.

How do I do that? Or Which mpileup command should I use to achieve that?

PS: what I already did so far:

For TP53-chr17-7577548 variant position, I searched one of the single cell bam files using grep:

bcftools view SRR3472634_CO8-MA-91.bcf | grep "chr17" | grep "7577548"

And that resulted in this variant is present:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SRR3472634.sorted.bam
chr17 7577548 . C T 225 . DP=73;VDB=0.154722;SGB=-0.693147;MQSB=1;MQ0F=0;AC=2;AN=2;DP4=0,0,28,30;MQ=60 GT:PL 1/1:255,175,0

I used the following command for TP53-chr17-7577548 variant position with mpileup command:

samtools mpileup -f reference.fa -r chr17:7577548-7577548 alignments.bam

And that resulted:

[mpileup] 1 samples in 1 input files chr17 7577548 C 58 TtttTTTtTTTTtttTtttTTtttTTTtTtttTTTtTtTTtTTTtttTttTtTttTtt uJ<JJuFKKuJsFKKuJJJA<JKAJFKAKJJKFKKFJJKKJuuKJFFpJKAJJKJJKF

Then I used the following for more precise:

samtools mpileup -f reference.fa -r chr17:7577548-7577548 alignments.bam | cut -f 5 | tr '[a-z]' '[A-Z]' | fold -w 1 | sort | uniq -c

And that resulted:

[mpileup] 1 samples in 1 input files
58 T

I think that showed me 58 alternate read counts of T

bcftools variant-calling samtools mpileup • 1.1k views
ADD COMMENT
3
Entering edit mode
17 months ago

using https://jvarkit.readthedocs.io/en/latest/FindAllCoverageAtPosition/

echo "src/test/resources/S1.bam" | java -jar dist/jvarkit.jar findallcoverageatposition -p 'RF08:926' | cut -f2,3,5,15-19

CHROM   POS DEPTH   Base(A) Base(C) Base(G) Base(T) Base(N)
RF08    926 11  9   0   0   2   0
ADD COMMENT
3
Entering edit mode
17 months ago

You can also use perbase, which I personally like a lot, because one can directly use variant calls to filter the regions of interest:

perbase base-depth --bcf-file SRR3472634_CO8-MA-91.bcf alignments.bam
ADD COMMENT

Login before adding your answer.

Traffic: 1611 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6