Hello. I want to find the sequence variation in each individual sample (human cancer samples) and then compare the samples with each other. What I mean is, when I compare one cancer sample with the others, I want to see whether different variants occur in them.
Can I use samtools mpileup to call variants for a single sample and then compare the results afterwards?
I tried running samtools on all samples in one go, but it gives only one list of variants (a single VCF file). I think that VCF holds the variants that are common to the cancer samples.
Snippet of the VCF result: https://docs.google.com/spreadsheets/d/1TbGlLLjSKoVlI5YtPt3ujyDjKG8I_UKQDj5u22kYYoU/edit?usp=sharing
Could you please explain your input and desired output with example snippets of files? Thank you
Well, basically the VCF file is what I need. I just want to know which variants each individual has. Let's say that on chromosome X at position N, individual A has the SNP allele G where the reference is C. I want to check whether individuals B, C and D also have that SNP or not. A, B, C and D are all cancer samples. It is really simple, I think. I just want to know whether samtools mpileup will produce good results if only a single BAM file is given. I think samtools and bcftools calculate some of their statistics across all the samples together.
It depends on whether you are looking only for SNPs, what specificity and sensitivity you want, and your ability to pay for software and for lab/chemistry optimization.
Running samtools on multiple BAM files to make a multi-sample VCF is a very good starting point for understanding the data you are working with.
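A multi-sample VCF is still a single file, but it has one genotype column per sample after the FORMAT column, so the per-sample comparison asked about can be read straight from it. Below is a minimal Python sketch of that idea, not a definitive implementation. The sample names A to D, the positions and the genotypes in EXAMPLE_VCF are made up for illustration, and the function name carriers_per_site is hypothetical. It assumes a tab-separated VCF with a GT field in FORMAT.

```python
import io

# Hypothetical multi-sample VCF text: sample names A-D and all values are made up.
EXAMPLE_VCF = """\
##fileformat=VCFv4.2
#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tA\tB\tC\tD
chrX\t1000\t.\tC\tG\t50\t.\t.\tGT\t0/1\t0/0\t1/1\t./.
chrX\t2000\t.\tT\tA\t60\t.\t.\tGT\t0/0\t0/1\t0/0\t0/0
"""


def carriers_per_site(handle):
    """Map (chrom, pos, ref, alt) to the list of samples carrying a non-reference allele."""
    samples = []
    result = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        gt_index = fields[8].split(":").index("GT")  # position of GT within FORMAT
        carriers = []
        for name, column in zip(samples, fields[9:]):
            genotype = column.split(":")[gt_index]
            alleles = genotype.replace("|", "/").split("/")
            # "0" is the reference allele and "." is a missing call;
            # anything else counts as carrying an alternate allele.
            if any(allele not in ("0", ".") for allele in alleles):
                carriers.append(name)
        result[(chrom, pos, ref, alt)] = carriers
    return result


if __name__ == "__main__":
    for site, carriers in carriers_per_site(io.StringIO(EXAMPLE_VCF)).items():
        print(site, carriers)
```

Note that a missing genotype (./.) means no call was made for that sample at that site, which is not the same as the sample lacking the variant, so it is worth checking coverage before concluding that a variant is absent.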