I am a bit confused, as I recently came across the paper by Alkodsi et al. (https://www.ncbi.nlm.nih.gov/pubmed/24599115), which reports that ADTEx (Aberration Detection in Tumor Exome) uses CBS (Circular Binary Segmentation) for segmentation. Until now I thought the segmentation was done using an HMM only, at least when BAF (B allele frequency) information is not used. However, a look at the code confirmed that DNAcopy (an implementation of CBS) is used in a part of the code that should also run when no BAF input is given.
As I am rather new to R, I unfortunately wasn't able to understand what DNAcopy is used for in the context of ADTEx, so I would highly appreciate it if someone could explain, or make a guess about, what is going on. Most importantly: is CBS used in ADTEx when no BAF information is provided, and if so, what is it used for? Is it perhaps just used to merge the states predicted by Viterbi into larger regions?
Perhaps this is trivial for people familiar with the subject, but for the rest (including myself) most of the abbreviations in your post are meaningless and should be avoided.
@WouterDeCoster: In principle you are of course right, but I assume that only people familiar with the field, or more precisely with ADTEx, will be able to provide a good guess or answer to my quite specific question anyway. That's why I do not really see the need to avoid these abbreviations ... Anyway, sorry for excluding anyone by using them. Otherwise I hope my question is clear.
Logarithmic values of the tumor sample RCs are normalized by library
size, GC content and amplicon length. For further analysis, we keep
residuals of the linear regression of the tumor NRCs over the baseline
calculated for the control samples.
The resulting profiles are segmented using the circular binary
segmentation (CBS) method [R packages PSCBS and DNAcopy (Olshen et
al., 2004; Venkatraman and Olshen, 2007)].
Here, RC stands for read count. Based on this alone, I think what you have at this point are essentially log ratios. Regarding your second question, I do not think circular binary segmentation has to be used exclusively with log-ratio values. While that is the original application for which CBS was developed, the principles hold in other contexts. I have successfully applied it in the past to segment real-valued data traces that vary within a finite range.
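To make the idea of segmenting a real-valued trace concrete, here is a minimal Python sketch. It is not ADTEx's code and not the DNAcopy algorithm: it implements plain recursive binary segmentation (no circular step, no permutation-based significance test) and uses a fixed score threshold instead. The function names `best_split` and `segment`, the `threshold` and `min_len` parameters, and the toy log-ratio values are all assumptions made for illustration.

```python
import math
from statistics import mean


def best_split(values):
    """Return (index, score) of the split with the largest scaled mean difference."""
    n = len(values)
    total = sum(values)
    best_idx, best_score = None, 0.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += values[i - 1]
        right_sum = total - left_sum
        diff = left_sum / i - right_sum / (n - i)
        # Scale by the segment sizes so that splits near the edges are not favored.
        score = abs(diff) * math.sqrt(i * (n - i) / n)
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx, best_score


def segment(values, threshold, min_len=2, offset=0):
    """Recursively split the trace; return (start, end, mean) tuples, end exclusive.

    The threshold is in the units of the data (the score is not divided by a
    noise estimate), so it has to be chosen per data set. This is an assumption
    of the sketch, not how DNAcopy decides on change points.
    """
    n = len(values)
    idx, score = best_split(values) if n >= 2 * min_len else (None, 0.0)
    if idx is None or score < threshold or idx < min_len or n - idx < min_len:
        return [(offset, offset + n, mean(values))]
    return (segment(values[:idx], threshold, min_len, offset)
            + segment(values[idx:], threshold, min_len, offset + idx))


# Toy log-ratio trace: neutral, gained, neutral (made-up values).
trace = [0.05, -0.05] * 10 + [1.05, 0.95] * 10 + [0.05, -0.05] * 10
for start, end, seg_mean in segment(trace, threshold=0.5):
    print(start, end, round(seg_mean, 3))
```

Nothing in the sketch depends on the values being log ratios; any real-valued trace with piecewise-constant mean can be passed in, which is the point made above.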