Hi all,
I have several thousand sorted BAM files and I would like to quickly extract the consensus sequence from all the reads at a specific position. Simply using samtools view returns all the reads (including those that jump over the position due to splicing). I could not make it work via mpileup. Here is the single-file call I have been working on:
samtools view -b file.bam "chr2:5666012" | samtools mpileup -
The ideal output for a single file should be (with G consensus):
....AAAACCTT
However, I get huge outputs because the spliced reads are also used to generate the pileup.
Any idea is welcome. Thanks! :-)
Hello,
to get the consensus sequence you first have to call variants (mpileup alone isn't enough) and then create the consensus sequence with the help of your VCF file and the reference sequence. See here: Generating consensus sequence from bam file
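As a rough sketch of what that could look like for a single position: the first function below wraps the variant-calling route with bcftools, and the second is a quick majority-vote tally over plain mpileup output, which is not a real variant call but may be enough for a first look. The function names, `calls.vcf.gz`, and the `ref.fa` used in the usage line are placeholders I made up, and the single-base region `chr2:5666012-5666012` is my assumption about how to restrict the output to one position.

```shell
#!/bin/sh
# Hypothetical wrapper for the variant-calling route (placeholder file names).
# usage: call_consensus ref.fa file.bam chr2:5666012-5666012
call_consensus() {
  bcftools mpileup -f "$1" -r "$3" "$2" | bcftools call -mv -Oz -o calls.vcf.gz
  bcftools index calls.vcf.gz
  samtools faidx "$1" "$3" | bcftools consensus calls.vcf.gz
}

# Hypothetical helper: majority-vote base for each mpileup line read from stdin.
# Expects the usual columns: chrom, pos, ref base, depth, read bases, qualities.
# Prints: chrom, pos, most frequent base, number of reads supporting it.
pileup_consensus() {
  awk -F'\t' '{
    ref = toupper($3); s = $5; n = length(s); i = 1
    split("", count)                      # reset the tallies for this line
    while (i <= n) {
      c = substr(s, i, 1)
      if (c == "^") { i += 2; continue }  # read start marker plus mapping-quality char
      if (c == "+" || c == "-") {         # indel: sign, length, then that many bases
        len = ""; i++
        while (substr(s, i, 1) ~ /[0-9]/) { len = len substr(s, i, 1); i++ }
        i += len; continue
      }
      if (c == "." || c == ",") c = ref   # match to the reference base
      c = toupper(c)
      if (c ~ /^[ACGTN]$/) count[c]++     # ignores $, *, and the < > reference skips
      i++
    }
    best = "N"; max = 0
    for (b in count) if (count[b] > max) { max = count[b]; best = b }
    print $1 "\t" $2 "\t" best "\t" max
  }'
}
```

Possible usage, assuming an indexed BAM and reference: `samtools mpileup -f ref.fa -r chr2:5666012-5666012 file.bam | pileup_consensus`. Reads that are spliced across the position show up as `<` or `>` in the pileup and are simply not counted by the tally. Ties between bases are broken arbitrarily.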
fin swimmer