Question

How to extract read counts at the mutation locations

0

Entering edit mode

23 months ago

supernovamik • 0

I have a scDNAseq dataset having multiple FASTQ files for multiple single cells.

samtools was used after aligning FASTQ files with BWA to hg19 reference to produce bam files.

I have already identified 36 SNV mutation sites and

I want to use mpileup to extract read counts (Total read count and variant read count) at those 36 positions.

How do I do that? Or Which mpileup command should I use to achieve that?

PS: what I already did so far:

For TP53-chr17-7577548 variant position, I searched one of the single cell bam files using grep:

bcftools view SRR3472634_CO8-MA-91.bcf | grep "chr17" | grep "7577548"

And that resulted in this variant is present:

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT SRR3472634.sorted.bam
chr17 7577548 . C T 225 . DP=73;VDB=0.154722;SGB=-0.693147;MQSB=1;MQ0F=0;AC=2;AN=2;DP4=0,0,28,30;MQ=60 GT:PL 1/1:255,175,0

I used the following command for TP53-chr17-7577548 variant position with mpileup command:

samtools mpileup -f reference.fa -r chr17:7577548-7577548 alignments.bam

And that resulted:

[mpileup] 1 samples in 1 input files chr17 7577548 C 58 TtttTTTtTTTTtttTtttTTtttTTTtTtttTTTtTtTTtTTTtttTttTtTttTtt uJ<JJuFKKuJsFKKuJJJA<JKAJFKAKJJKFKKFJJKKJuuKJFFpJKAJJKJJKF

Then I used the following for more precise:

samtools mpileup -f reference.fa -r chr17:7577548-7577548 alignments.bam | cut -f 5 | tr '[a-z]' '[A-Z]' | fold -w 1 | sort | uniq -c

And that resulted:

[mpileup] 1 samples in 1 input files
58 T

I think that showed me 58 alternate read counts of T

bcftools variant-calling samtools mpileup • 1.4k views

ADD COMMENT • link updated 23 months ago by Ram 45k • written 23 months ago by supernovamik • 0

0

Entering edit mode

see also : How to generate alignment report from bam files , Finding SNPs in population data: help with pipeline , How to retrieve the SNP data from Bam file

ADD REPLY • link 23 months ago by Pierre Lindenbaum 166k

score 3 · Answer 1 · 2023-06-14

3

Entering edit mode

23 months ago

Pierre Lindenbaum 166k

using https://jvarkit.readthedocs.io/en/latest/FindAllCoverageAtPosition/

echo "src/test/resources/S1.bam" | java -jar dist/jvarkit.jar findallcoverageatposition -p 'RF08:926' | cut -f2,3,5,15-19

CHROM   POS DEPTH   Base(A) Base(C) Base(G) Base(T) Base(N)
RF08    926 11  9   0   0   2   0

ADD COMMENT • link 23 months ago by Pierre Lindenbaum 166k

score 3 · Answer 2 · 2023-06-14

3

Entering edit mode

23 months ago

Matthias Zepper 5.1k

You can also use perbase, which I personally like a lot, because one can directly use variant calls to filter the regions of interest:

perbase base-depth --bcf-file SRR3472634_CO8-MA-91.bcf alignments.bam

ADD COMMENT • link 23 months ago by Matthias Zepper 5.1k