Hi all,
I am trying to develop a workflow for detecting cancer CNVs and have some questions. First off, is there a fairly established workflow, like the GATK Best Practices, that is commonly used?
I looked at cn.mops. It produces really nice CNV plots, but the text output is not so easy to interpret, and it does not use normal samples.
I would really appreciate it if someone could provide some pointers in this direction.
First of all, copy number variations (CNVs) are copy number changes present in normal cells. The ones you see in cancer cells are called copy number aberrations (CNAs).
The SNP 6.0 array is probably the best platform for detecting copy number aberrations. ASCAT is very good software for detecting copy number aberrations, and it can also use normal samples. They have a very explicit workflow on their website:
We recently started using CNVkit in our workflow for detecting cancer CNVs from NGS data, with good results:
CNVkit docs
It produces both plots and text output, and the documentation is pretty comprehensive. You can do tumor vs. normal or tumor-only analysis. The author, Eric Talevich, is active on this forum and answers questions.
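For anyone who wants to script this, here is a minimal sketch of building a `cnvkit.py batch` command from Python for either a tumor vs. normal or a tumor-only run. The function name and all file names are hypothetical, and the flags are the ones I understand the CNVkit docs to describe for `batch`, so please check them against `cnvkit.py batch -h` for your installed version.

```python
def cnvkit_batch_command(tumor_bams, normal_bams, targets_bed, fasta, out_dir):
    """Build an argument list for `cnvkit.py batch` (hypothetical helper).

    An empty normal_bams list leaves --normal without arguments, which the
    CNVkit docs describe as building a "flat" reference for tumor-only runs.
    """
    cmd = ["cnvkit.py", "batch", *tumor_bams]
    cmd += ["--normal", *normal_bams]            # zero or more normal BAMs
    cmd += ["--targets", targets_bed]            # capture/bait regions (BED)
    cmd += ["--fasta", fasta]                    # reference genome
    cmd += ["--output-reference", f"{out_dir}/reference.cnn"]
    cmd += ["--output-dir", out_dir]
    cmd += ["--scatter", "--diagram"]            # also write the plots
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once CNVkit is installed. Passing an empty list for `normal_bams` gives the tumor-only variant.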
There are a few things I would like to clarify first. When you say cancer CNV workflow, do you intend to work with whole-genome or whole-exome data? Each has its own pros and cons, and both workflows need some tweaking to give the highest-confidence hits. Here are my pointers:
1) If you are working with WGS/WES, try to use GATK to process the alignment files. The final alignment files can then be passed to any CNV tool downstream.
2) If you are looking for somatic CNVs, you have to use normal/tumor samples processed with GATK and a tool designed to fish out high-confidence somatic CNVs. For WGS/WES somatic CNVs there are tools like ADTEx, ExomeCNV, and Control-FREEC. You can always write a wrapper function that passes the processed normal/tumor BAM files directly to these tools to fish out the CNVs and produce the plots.
3) There is also VarScan2, which uses circular binary segmentation and produces somatic CNVs. I never got convincing results with it, so I do not mention it much, but it might well give great results for others.
I am stating the above on the assumption that you are trying to detect CNVs from WGS/WES data. I have never done CNV detection from RNA-Seq, so I cannot comment on that. There are also some papers that did CNV detection from methylation sequencing data, but I have no experience with that either, so I will rest my pointers here. Good luck!
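The wrapper idea from point 2 could look roughly like the sketch below. It assumes a hypothetical naming scheme (`P01_T.bam` for tumor, `P01_N.bam` for normal), and `caller` stands in for whatever function you write to launch ADTEx, ExomeCNV, Control-FREEC, or another tool. None of this is taken from those tools' own interfaces.

```python
from pathlib import Path


def pair_tumor_normal(bam_paths, tumor_suffix="_T", normal_suffix="_N"):
    """Group BAM paths into {patient: (tumor_bam, normal_bam)}.

    Assumes a hypothetical naming scheme such as P01_T.bam / P01_N.bam.
    Patients missing either sample are skipped.
    """
    tumors, normals = {}, {}
    for path in map(Path, bam_paths):
        stem = path.stem  # file name without the .bam extension
        if stem.endswith(tumor_suffix):
            tumors[stem[: -len(tumor_suffix)]] = path
        elif stem.endswith(normal_suffix):
            normals[stem[: -len(normal_suffix)]] = path
    return {p: (tumors[p], normals[p]) for p in sorted(tumors) if p in normals}


def run_on_pairs(pairs, caller):
    """Call `caller(tumor_bam, normal_bam)` for every pair; results keyed by patient."""
    return {patient: caller(t, n) for patient, (t, n) in pairs.items()}
```

Keeping the pairing separate from the tool call means the same matched pairs can be fed to several CNV callers for comparison.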
Thanks. Both SNP6.0 and ASCAT seem to be geared toward microarray data, whereas we will likely have NGS data.