Regions to Exclude and useSexChromosome questions
7.8 years ago

Hi, I have two quick questions about SciClone:

  1. What exactly should the 'Regions to Exclude' file contain? After parsing my VCF into the VAF file that SciClone accepts, my understanding is that I should exclude copy-number-neutral LOH regions, or, put differently, all homozygous mutations. My first idea was to drop every line of the VAF file whose VAF exceeds some threshold (70, say). That would discard those mutations entirely, though, and I would still like SciClone to assign them to their copy-number regions (that is, split the VAFs into groups with copy number 1, 2, 3, ...) and plot them there. I know I do not need them for clustering, since SciClone only looks at copy-number-neutral regions; I just want to see those VAFs plotted in their respective regions rather than lose them altogether. I assume this is what the 'Regions to Exclude' input is for. So what should go in it? Do I scan my VAF file for mutations with VAF above the threshold and write a new file of regions covering them? The main VAF input holds only a start coordinate, so I suppose a region of ±1 around each such mutation would do, if this is the right approach at all. In that case, does SciClone first do the copy-number analysis (grouping mutations by copy number) and only then exclude the regions in the 'Regions to Exclude' file? Or is this file meant to be produced in a different way?

  2. If I set the useSexChromosomes option to False, does this work with chromosome names >chrX and >chrY, or only >X and >Y? From the source code it looks like the former is not supported, though I'm not much of an R user, so I may have misread it. Also, I expect I'd set this to False most of the time; is there a particular reason (or use case) for True being the default?

Thanks for any answers.

7.8 years ago

What exactly should be the content of the 'Regions to Exclude' file?

Yes, the usual use is to exclude CN-neutral LOH regions. You should call LOH regions in much the same way you would call copy number. I don't have a script handy at the moment, but the short version is: run VarScan on the tumor/normal BAMs, extract the "Germline" or "LOH" calls, then segment them with the DNAcopy package for R. That will give you discrete regions of LOH (for example, "17 123456 765432").
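To make the shape of that output concrete, here is a rough Python sketch that collapses runs of consecutive "LOH" calls into chr/start/stop regions. It is only a naive stand-in for the segmentation step, not what DNAcopy does (DNAcopy uses circular binary segmentation), and the function name `loh_runs`, the `(chromosome, position, status)` tuple layout, and the `min_calls` cutoff are all made up for illustration.

```python
from itertools import groupby


def loh_runs(calls, min_calls=3):
    """Collapse consecutive LOH calls into (chromosome, start, stop) regions.

    calls: iterable of (chromosome, position, status) tuples, where status
    is "LOH" or "Germline" (hypothetical layout, assumed for this sketch).
    A Germline call or a change of chromosome ends the current run.
    Runs with fewer than min_calls LOH sites are dropped as likely noise.
    """
    # Sort by chromosome name (as a string) and then by position.
    ordered = sorted(calls, key=lambda c: (c[0], c[1]))
    regions = []
    for chrom, chrom_calls in groupby(ordered, key=lambda c: c[0]):
        # Within one chromosome, group adjacent calls sharing a status.
        for status, run in groupby(chrom_calls, key=lambda c: c[2]):
            positions = [pos for _, pos, _ in run]
            if status == "LOH" and len(positions) >= min_calls:
                regions.append((chrom, positions[0], positions[-1]))
    return regions


if __name__ == "__main__":
    example = [
        ("17", 100, "LOH"), ("17", 200, "LOH"), ("17", 300, "LOH"),
        ("17", 400, "Germline"), ("17", 500, "LOH"),
    ]
    for chrom, start, stop in loh_runs(example):
        print(f"{chrom}\t{start}\t{stop}")
```

For real data the DNAcopy route is the one to use; this is only meant to show what "discrete regions" look like once per-site calls have been grouped.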

If I set the useSexChromosomes option to False, does this work with chromosome naming >chrX and >chrY, or just >X >Y?

Right, it assumes X and Y, with no chr prefixes.
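If your files use chr-prefixed names, one workaround is to strip the prefix before loading them. A minimal Python sketch, assuming each row is a tuple whose first field is the chromosome name; the helper names `strip_chr_prefix` and `normalize_rows` are hypothetical and not part of SciClone:

```python
def strip_chr_prefix(name):
    """Return the chromosome name without a leading 'chr' (case-insensitive)."""
    return name[3:] if name.lower().startswith("chr") else name


def normalize_rows(rows):
    """Apply strip_chr_prefix to the first field of every row.

    rows: iterable of tuples whose first element is the chromosome name;
    the remaining fields are passed through unchanged.
    """
    return [(strip_chr_prefix(chrom), *rest) for chrom, *rest in rows]


if __name__ == "__main__":
    print(normalize_rows([("chrX", 1500, 20, 18, 47.4), ("7", 900, 30, 10, 25.0)]))
```

The same renaming would need to be applied to every input that carries chromosome names (VAFs, copy-number segments, regions to exclude) so that they stay consistent with each other.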

Also I figured that most of the times I'd set this to False, is there a particular reason (or cases) why this is set to True as the default value?

If you're working with a male patient, you definitely want it to be true because those are expected to be haploid regions (absent chr duplication or Klinefelter syndrome). If female, it's fine.
