I want to use the R package SciClone to infer the clonal architecture of my liver cancer data. I already have the somatic SNV calls and the allele-specific copy number information, but I am unclear about some of the input that SciClone requires:
1. What does "regions to exclude" mean? My understanding is that if a region has total_cn = 2 but major_cn = 2 and minor_cn = 0, then I need to list this region in the "regions to exclude" file. Is this understanding correct?
2. In my opinion, the VAF means var_reads/(var_reads + ref_reads). However, when I ran ?sciClone in R, it told me:
vafs
a list of dataframes containing variant allele fraction data for single nucleotide variants in 5-column format: 1. chromosome 2. position 3. reference-supporting read counts 4. variant-supporting read counts 5. variant allele fraction (between 0-100)
Why is the range of the VAF 0-100? (I would expect it to be at most 1.) What is the definition of this "vaf"?
Could anyone give me some advice? Thanks!
Yang
"the vaf means var_reads/(var_reads + ref_reads)" - Is it right? In example test case vaf always less then var_reads/(var_reads + ref_reads). Why is that? Is this affected by the admixture of normal cells in sample? Does this difference greatly affect clustering? Many thanks! Alex.
There may sometimes be non-reference, non-primary-variant reads - for example:
chr: 1  pos: 12345  ref: A  var: T
Counts: A: 99, C: 2, G: 0, T: 99
VAF: 49.5%
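The arithmetic behind that example can be sketched as follows. This is only an illustration, assuming (as the example above implies) that the VAF column is a percentage computed against the total depth at the site, including reads that support neither the reference nor the primary variant. The function name `vaf_percent` is hypothetical and is not part of SciClone.

```python
def vaf_percent(ref_reads, var_reads, other_reads=0):
    """Return the variant allele fraction as a percentage (0-100).

    Hypothetical helper for illustration only. The denominator is the
    total depth: reference + variant + any other reads at the site.
    """
    depth = ref_reads + var_reads + other_reads
    if depth == 0:
        raise ValueError("no reads at this site")
    return 100.0 * var_reads / depth


# Read counts from the example above: ref = A, primary variant = T.
counts = {"A": 99, "C": 2, "G": 0, "T": 99}
ref_base, var_base = "A", "T"

# Reads supporting neither the reference nor the primary variant.
other = sum(n for base, n in counts.items() if base not in (ref_base, var_base))

vaf_total_depth = vaf_percent(counts[ref_base], counts[var_base], other)
vaf_ref_var_only = vaf_percent(counts[ref_base], counts[var_base])

print(f"VAF over total depth:     {vaf_total_depth:.1f}%")   # 99 / 200
print(f"VAF over ref + var reads: {vaf_ref_var_only:.1f}%")  # 99 / 198
```

With these counts the two C reads enlarge the denominator, so the total-depth VAF (49.5%) comes out slightly below var_reads/(var_reads + ref_reads) (50%).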