Dear Community,
I would like to ask a more general question about exome sequencing and mutation analysis. I have mainly analyzed transcriptomics data, so please excuse any naive questions on this matter. I have some preliminary exome sequencing data (both FASTQ and BAM files) from 3 patients.
Specifically, for each patient I have whole-exome sequencing data from CTCs (circulating tumor cells) and also exome sequencing data from a biopsy of the same patient.
The main goal is to identify whether there are common mutational patterns (i.e. SNPs) between circulating tumor cells and biopsies, which would be very valuable at the time of diagnosis of the specific cancer, as well as for understanding the related biological mechanisms, etc.
Perhaps a major issue is that there is no normal reference tissue (which probably limits the identification of somatic variants), as well as the small number of patients. However, are there any ideas or suggestions about which tools or approaches I could use for putative variant annotation or comparison in this specific scenario?
Thank you in advance,
Efstathios-Iason
About the somatic variants: you can annotate with ExAC (or other population databases) to remove putative germline variants from your analysis. There are plenty of suggestions on Biostars already. Second, depending on your scientific question, it might be sufficient to use the "tumor" as "normal", because you are particularly interested in the additional mutations. For example, a non-metastatic cancer becomes more likely to form metastases after acquiring a SMAD4 mutation. If you are only interested in primary tumor vs. CTC variant differences, this might be a possible way to go?
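For the population-database filter mentioned above, here is a minimal sketch in plain Python. It assumes the VCF has already been annotated so that each record carries an allele frequency in its INFO column. The key name `ExAC_AF` and the 1% cutoff are only illustrative assumptions: the actual key depends on the annotation tool you use, and the threshold is a judgment call.

```python
def parse_info(info_field):
    """Turn a VCF INFO string such as 'DP=40;ExAC_AF=0.12' into a dict."""
    info = {}
    for entry in info_field.split(";"):
        key, _, value = entry.partition("=")
        info[key] = value
    return info


def is_rare(vcf_line, af_key="ExAC_AF", max_af=0.01):
    """True if the variant is absent from the database or below max_af.

    af_key and max_af are assumptions; adjust them to your annotation.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    info = parse_info(fields[7])  # column 8 of a VCF record is INFO
    raw = info.get(af_key)
    if raw is None or raw == ".":
        return True  # not annotated, i.e. not seen in the database
    # Multi-allelic sites carry comma-separated frequencies; taking the
    # largest drops the whole record if any allele is common.
    freqs = [float(x) for x in raw.split(",") if x not in ("", ".")]
    return not freqs or max(freqs) < max_af


def filter_putative_germline(vcf_lines, af_key="ExAC_AF", max_af=0.01):
    """Keep header lines and variants that are rare in the database."""
    return [line for line in vcf_lines
            if line.startswith("#") or is_rare(line, af_key, max_af)]
```

Note that this only removes variants that are common in the population. Rare or private germline variants will still be in the output, so what remains is "putative somatic" at best.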
Dear Manuel,
Thank you for your answer and suggestions. I will search the forum for similar threads. Actually, both the biopsies and the circulating tumor cells were isolated at the time of diagnosis, when the tumor had already spread due to its specific nature, so neither sample type is definitely primary. The main interest is to find common mutational patterns between these two types of samples, if they can be identified given the general tumor heterogeneity.
I agree; in that case you should not use the tumor as normal. But then I would still go forward with calling mutations on all samples and removing everything found in population databases at a decent frequency.
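Once every sample has its own filtered call set, the CTC vs. biopsy comparison for one patient could be sketched as below. This is only an illustration under simple assumptions: variants are matched on exact chromosome, position, REF and ALT, and the function names are made up for this example.

```python
def variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) tuples from VCF data lines."""
    keys = set()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and empty lines
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        # Split multi-allelic records so each ALT allele is compared alone.
        for alt in alts.split(","):
            keys.add((chrom, int(pos), ref, alt))
    return keys


def compare_samples(ctc_lines, biopsy_lines):
    """Summarise the overlap between two call sets of the same patient."""
    ctc = variant_keys(ctc_lines)
    biopsy = variant_keys(biopsy_lines)
    shared = ctc & biopsy
    union = ctc | biopsy
    return {
        "shared": shared,
        "ctc_only": ctc - biopsy,
        "biopsy_only": biopsy - ctc,
        # Jaccard index: shared variants over all variants seen in either.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }
```

Keep in mind that a variant missing from one call set may simply reflect low coverage at that position rather than true absence, so it is worth checking the read depth in the BAM files before interpreting the sample-specific variants.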