Hi, I have a variant calling pipeline for somatic mutations. I am trying to uncover mutations with low allele balance.
When I find a mutation in a particular sample, I would like to see whether it is present in other samples. Unfortunately, just comparing VCF files does not do the trick, because a call can be removed by one of many filtering steps or simply never be made. So, given the signature of a mutation from a VCF file, I would like to obtain information about that particular mutation in a number of BAM files. I am most interested in the coverage at the position of the mutation and the number of reads supporting it.
For example, I have 1.bam, 2.bam and 3.bam. Based on particular criteria (allele balance > 0.1, coverage > 500, allele observations > 10) I produce 1.vcf, 2.vcf and 3.vcf. Now suppose a variant in 1.bam meets these criteria. In 2.bam the same mutation may be present with allele balance 0.09, coverage 1000 and 9 allele observations (so it does not make it into 2.vcf), while in 3.bam there is 0 coverage and no allele observation. I would like to extract this information from 2.bam and 3.bam for the mutation signatures in 1.vcf. Is there a tool for this, and if so, which one should I use?
Thanks, Vojtěch.
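For SNVs, one possible way to get these two numbers is to run `samtools mpileup` on each BAM at the positions taken from the VCF and count the bases in the fifth column of its output. Below is a minimal Python sketch of just that counting step. The function names `count_pileup_bases` and `site_summary` are made up for illustration, and the sketch assumes the default mpileup text layout (chrom, pos, ref, depth, read bases, qualities). Indels are skipped rather than counted, and at coverage in the hundreds or thousands the mpileup depth cap (`-d`) may need to be raised.

```python
import re


def count_pileup_bases(bases):
    """Count observations in the read-bases column of one mpileup line."""
    counts = {}
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            i += 2  # read start marker plus the mapping-quality character
        elif c == "$":
            i += 1  # read end marker
        elif c in "+-":
            # Indel annotation: sign, length, then that many sequence characters.
            length = re.match(r"\d+", bases[i + 1:]).group()
            i += 1 + len(length) + int(length)
        else:
            if c in ".,":
                key = "ref"   # match to the reference (forward/reverse strand)
            elif c in "*#":
                key = "del"   # placeholder for a deleted base
            elif c in "<>":
                key = "skip"  # reference skip, e.g. a spliced read
            else:
                key = c.upper()  # mismatch; lower case means reverse strand
            counts[key] = counts.get(key, 0) + 1
            i += 1
    return counts


def site_summary(pileup_line, alt):
    """Return coverage, alt observations and alt fraction for one pileup line."""
    chrom, pos, ref, depth, bases = pileup_line.rstrip("\n").split("\t")[:5]
    if int(depth) == 0:
        counts = {}  # zero-depth lines carry only a "*" placeholder
    else:
        counts = count_pileup_bases(bases)
    # Coverage here means every counted observation except reference skips.
    coverage = sum(n for key, n in counts.items() if key != "skip")
    alt_obs = counts.get(alt.upper(), 0)
    alt_fraction = alt_obs / coverage if coverage else 0.0
    return {"chrom": chrom, "pos": int(pos), "coverage": coverage,
            "alt_obs": alt_obs, "alt_fraction": alt_fraction}


# Hypothetical pileup line: 6 reads at chr1:100, two of them showing T.
line = "chr1\t100\tA\t6\t.,^I.T$t+2AG,\tIIIIII"
print(site_summary(line, "T"))
```

Running this per BAM for every position in 1.vcf gives the numbers for samples where the variant was filtered out or never called. A position with no pileup line at all corresponds to the zero-coverage case.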
Thanks Pierre, at least for SNP detection this is a good solution. Unfortunately, it works only for my unprocessed BAM files. After processing them further with Picard/GATK, there is an error.
I can send you my BAM file, since I have no clue what is wrong. When I use samtools view, it shows the reads normally.
Kind regards, Vojtech.
Please use `ADD COMMENT` to reply to earlier answers, so that this thread remains logically structured and easy to follow.
I suspect there is something wrong with your BAM file. Test your files with https://broadinstitute.github.io/picard/command-line-overview.html#ValidateSamFile and/or https://broadinstitute.github.io/picard/command-line-overview.html#CheckTerminatorBlock
Well, I have plenty of errors of the type
The problem is that the GATK tools did some kind of filtering and removed some reads but not their mates. So I probably need to fix that manually. Or is there a tool for this?
My tool would ignore this kind of problem; please run CheckTerminatorBlock too.
Well, thank you, I found the problem. There were two index files alongside file.bam in the directory.
The index file file.bam.bai was up to date, while file.bai was outdated. I needed to remove the incorrect file.bai, and now the program works with the correct index, file.bam.bai.
Thank you, Pierre.
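For anyone hitting the same issue, here is a rough Python sketch that looks for both index naming styles next to a BAM and flags any index older than the BAM itself. The function name `find_index_files` is made up for illustration, and the modification time is only a heuristic: an index that is newer than the BAM can still be wrong if it was copied over from another file.

```python
from pathlib import Path


def find_index_files(bam_path):
    """List existing index files for a BAM and whether each is older than it."""
    bam = Path(bam_path)
    # The two naming styles seen in this thread: file.bam.bai and file.bai.
    candidates = [Path(str(bam) + ".bai"), bam.with_suffix(".bai")]
    bam_mtime = bam.stat().st_mtime
    report = []
    for idx in candidates:
        if idx.exists():
            is_older = idx.stat().st_mtime < bam_mtime
            report.append((idx.name, is_older))
    return report
```

Usage would be something like `find_index_files("file.bam")`; any entry flagged `True` is a candidate for deletion, followed by re-indexing the BAM.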