7.3 years ago
kulvait
Hi, I have a bunch of amplicon sequencing samples from cancer patients. We are trying to identify somatic mutations in these samples.
I would like to check for sample swaps, since we typically have more than one sample per patient. What comes to mind is to produce VCF files, filter them so that they contain only dbSNP records with genotype calls 0/1 or 1/1, and cluster the results.
Is there a tool that does this, or do I have to do everything manually?
Thanks, Vojtech.
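The manual route described in the question can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not an established tool: it assumes a multi-sample VCF with a GT field, treats any record with a non-empty ID column as a dbSNP record, and skips sites where both samples are 0/0. The function names, sample names and toy records are hypothetical.

```python
from itertools import combinations


def parse_genotypes(vcf_text):
    """Return (sample_names, rows); each row holds alt-allele counts or None."""
    samples, rows = [], []
    for line in vcf_text.splitlines():
        if line.startswith("##") or not line.strip():
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            continue
        # Assumption: a non-empty ID column (e.g. an rsID) marks a dbSNP record.
        if fields[2] == ".":
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            continue
        gt_idx = fmt.index("GT")
        row = []
        for sample_field in fields[9:]:
            alleles = sample_field.split(":")[gt_idx].replace("|", "/").split("/")
            # Missing calls become None; otherwise count non-reference alleles.
            row.append(None if "." in alleles else sum(a != "0" for a in alleles))
        rows.append(row)
    return samples, rows


def pairwise_discordance(samples, rows):
    """Fraction of informative sites where two samples have different genotypes."""
    result = {}
    for i, j in combinations(range(len(samples)), 2):
        compared = mismatched = 0
        for row in rows:
            a, b = row[i], row[j]
            if a is None or b is None or (a == 0 and b == 0):
                continue  # missing call, or both 0/0: not informative
            compared += 1
            mismatched += a != b
        result[(samples[i], samples[j])] = mismatched / compared if compared else None
    return result


# Toy data (hypothetical): two samples from patient P1, one from patient P2.
header = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO", "FORMAT",
          "P1_tumor", "P1_normal", "P2_tumor"]
records = [
    ["1", "100", "rs1", "A", "G", ".", "PASS", ".", "GT", "0/1", "0/1", "0/0"],
    ["1", "200", "rs2", "C", "T", ".", "PASS", ".", "GT", "1/1", "1/1", "0/1"],
    ["1", "300", "rs3", "G", "A", ".", "PASS", ".", "GT", "0/0", "0/0", "1/1"],
    ["1", "400", "rs4", "T", "C", ".", "PASS", ".", "GT", "0/1", "0/1", "0/1"],
    ["1", "500", ".", "A", "T", ".", "PASS", ".", "GT", "0/1", "0/0", "1/1"],
]
TOY_VCF = "\n".join(["##fileformat=VCFv4.2", "\t".join(header)]
                    + ["\t".join(r) for r in records])

samples, rows = parse_genotypes(TOY_VCF)
discordance = pairwise_discordance(samples, rows)
for pair, value in discordance.items():
    print(pair, value)
```

Pairs from the same patient should show a discordance near zero, while a swapped sample would stand out with a value closer to that of unrelated pairs. What threshold separates the two depends on coverage and call quality, so it is left open here.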
Hi, one quick way, though not suitable for a serious clinical setting, is to feed the VCF to vcfkit (here) and make a tree out of it. I don't know exactly what the implications would be for your work.
We gave the students in our course some paternity-testing exercises based on SNPs, and they came up with the idea of building a tree, which worked quite robustly for small regions and unfiltered SNP calls. The command being
What do you mean by "sample swap"? Do you want to check whether all the samples come from the same patient?
Yes, exactly. I want to check whether the samples labeled as coming from a single person have very similar genetic profiles.
To do that I would run a PCA on the SNP calls; samples belonging to the same patient should cluster together.
This requires joint genotype calling across all the samples to get a multi-sample VCF (for example with the GATK pipeline).
Then you can run a PCA on the multi-sample VCF, which can be done with the R SNPRelate package (there may be easier and better solutions...).
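To show the idea behind the PCA step, here is a minimal sketch using NumPy rather than the SNPRelate package mentioned above. It assumes the genotypes have already been extracted from the multi-sample VCF into a samples x variants matrix of alt-allele counts (0, 1 or 2) with no missing calls. The function name and the toy matrix are hypothetical.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """Project samples onto the top principal components of a genotype matrix."""
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)  # center each variant (column)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]  # sample coordinates


# Toy matrix (hypothetical): rows are samples, columns are variants.
names = ["P1_a", "P1_b", "P2_a", "P2_b"]
genotypes = [
    [1, 2, 0, 1, 0, 2],
    [1, 2, 0, 1, 0, 2],
    [0, 1, 2, 1, 1, 0],
    [0, 1, 2, 0, 1, 0],
]

pcs = genotype_pca(genotypes)
for name, coords in zip(names, pcs):
    print(name, np.round(coords, 3))
```

Samples from the same patient should land close together in the projected space, so a mislabeled sample would appear next to a different patient's cluster. With real data, missing genotypes need to be handled before this step (for example by dropping incomplete variants), which the sketch does not do.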