I have 30 samples and want to call SNPs against the same reference. Each sample was aligned to the reference separately with bowtie2 to get separate SAM files, and those were used for calling.
What is the best way to call SNPs and filter them?
I called the SNPs for all samples together like this:
samtools mpileup -uf genome.fa sam1.bam sam2.bam ....sam30.bam | bcftools view -I -bvcg - > all30.bcf
bcftools view all30.bcf | vcfutils.pl varFilter -d 10 > all30.vcf
I found that the VCF file reports a genotype identical to the reference for samples with no read coverage at a position, as long as at least one sample has a SNP there.
If there is a SNP, I want coverage in all samples, or in all but one.
That is, at least 29 samples should have a coverage of 10 or more, and heterozygous calls should have a minor allele frequency of at least 30% (i.e. 3 reads at a depth of 10).
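To make the coverage requirement concrete, here is a rough sketch of the kind of filter I have in mind. It assumes the VCF has a per-sample DP field in the FORMAT column (I believe the -D option of mpileup in the samtools 0.1.x series adds it, but please check your version). The function name filter_by_sample_depth is made up for illustration. The heterozygous minor allele check is not covered, since it would need per-sample allele counts that this output may not contain.

```shell
# Keep VCF records where at least (n_samples - max_missing) samples have DP >= min_dp.
# Assumption: the FORMAT column (column 9) lists a per-sample DP field.
filter_by_sample_depth() {
    # $1 = minimum depth per sample (default 10)
    # $2 = number of samples allowed to fall below it (default 1)
    awk -F '\t' -v min_dp="${1:-10}" -v max_missing="${2:-1}" '
        /^#/ { print; next }                     # pass header lines through unchanged
        {
            # locate the DP entry within the FORMAT column
            n = split($9, fmt, ":")
            dp_idx = 0
            for (i = 1; i <= n; i++) if (fmt[i] == "DP") dp_idx = i
            if (dp_idx == 0) next                # no per-sample DP field: drop the record

            # count samples whose depth reaches the threshold
            covered = 0
            for (s = 10; s <= NF; s++) {
                m = split($s, val, ":")
                if (dp_idx <= m && val[dp_idx] + 0 >= min_dp + 0) covered++
            }
            if (covered >= (NF - 9) - max_missing) print
        }'
}
```

It would be used as: filter_by_sample_depth 10 1 < all30.vcf > all30.depth_filtered.vcf (the output file name is just an example).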
Thanks in advance
So you already have the BAM files for the 30 samples, right?
Could you please check whether each BAM file has a unique read group ID? If they all have the same read group, all the BAM files will be treated as the same sample, which would lead to your situation. You can use Picard tools to change the read group and see if that helps.
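One way to look at the read groups is to print the header of each file and pull out the @RG lines. The snippet below is only a sketch: list_rg_samples is a helper name I made up, and it simply reads a SAM header on standard input and prints the SM (sample) tag of each @RG line.

```shell
# Print the sample name (SM tag) of every @RG header line read from stdin.
# Prints nothing if the header contains no @RG lines at all.
list_rg_samples() {
    awk -F '\t' '/^@RG/ {
        sm = "(no SM tag)"
        # scan the tab-separated tags of the @RG line for SM:<name>
        for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) sm = substr($i, 4)
        print sm
    }'
}
```

For a BAM file you would feed it the header with samtools view -H sam1.bam | list_rg_samples, and repeat that for each of the 30 files. If every file prints the same name, that would explain the merged samples.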
I aligned each sample separately to the reference to get separate SAM files and used those downstream. Sorry, how do I check that?
This is the vcf file: