5.3 years ago
user31888
Hi,
Is there a tool to detect copy number variants on a single target sequenced sample file?
All the tools I know (GATK, ControlFREEC, VarScan) require a mate file.
Thanks !
What have you got? - a single FASTQ file relating to single-end next generation sequencing?
I have a BAM file from which I obtained 2 vcf files after calling germline and somatic variants using GATK.
It seems that bcftools cnv could sort me out. I need to add to my vcf, the B-allele frequency (I can compute it from the AD field), and the LRR (don't know what it is yet).
bcftools cnv does not seem to work since we need to have 2 samples to compute LRR (Log R Ratio).
ControlFREEC can calculate copy number from a single sample, but what is your sample? - whole genome sequencing, exome sequencing, or target sequencing?
It is target sequencing.
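For reference, the two quantities mentioned earlier in the thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the B-allele frequency is taken as alt / (ref + alt) from a biallelic AD value, and the Log R Ratio is approximated as log2(observed depth / expected depth), which is the sequencing analogue of the array-based definition. The function names are hypothetical, and the expected depth has to come from some reference (which is why a second sample is normally needed).

```python
import math

def parse_ad(format_field, sample_field):
    """Extract (ref_depth, alt_depth) from a VCF sample column.

    Assumes a biallelic site, e.g. FORMAT 'GT:AD:DP' and sample '0/1:12,8:20'.
    Returns None when AD is absent or missing ('.').
    """
    keys = format_field.split(":")
    values = sample_field.split(":")
    if "AD" not in keys:
        return None
    idx = keys.index("AD")
    if idx >= len(values):
        return None
    parts = values[idx].split(",")
    if len(parts) < 2 or "." in parts:
        return None
    return int(parts[0]), int(parts[1])

def baf_from_ad(ref_depth, alt_depth):
    """B-allele frequency as the alt fraction of the allelic depth."""
    total = ref_depth + alt_depth
    if total == 0:
        return None
    return alt_depth / total

def log_r_ratio(observed_depth, expected_depth):
    """Depth-based approximation of LRR (assumption: depth stands in for intensity)."""
    if observed_depth <= 0 or expected_depth <= 0:
        return None
    return math.log2(observed_depth / expected_depth)

ad = parse_ad("GT:AD:DP", "0/1:12,8:20")
if ad is not None:
    print("BAF:", baf_from_ad(*ad))
print("LRR:", log_r_ratio(50, 100))
```

An LRR near 0 would suggest the expected copy number, negative values a loss and positive values a gain, but only if the expected depth is a fair reference for the same region.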
ControlFREEC can call copy number for this data. Here is a sample config file that you'll need for this type of data:
Thanks Kevin ! The program generates the sample.cnp and GC_profile.cnp files, but then stops with the following error:
I tried to decrease the window size to 500 but I got the same error. Note, I am only interested in one chromosome (my BAM contains only reads mapped to one chromosome). I supplied a captureRegions path though. Do you have an idea what parameters I should adjust?
How large is your target region?
It is about 10e6 bp.
You may need to add a new chunk of code to your config file, like this:
Also, with that, set window=0 in the [general] chunk.
Be aware of the other option, readCountThreshold=10, which you may need to toggle. 10 is already low, though.
I also decreased the following parameters and still get the same error:
Alternatively, I need to calculate the approximate copy number (and the exons involved) of a specific gene. Would it be possible to do it "manually" from a BAM or VCF file?
Realistically, I am not confident that it is possible to accurately determine copy number for just one gene - how would you know the level of coverage that is reflective of normal, deleted, or amplified DNA, when read coverage, even in a 'normal' piece of DNA, fluctuates a lot based on GC content, sequence similarity elsewhere in the genome, etc? Even when we have genome-wide data, copy number callers disagree a lot.
Why can't you just pick a few exons, design some primers, obtain a normal reference hgDNA, and then do qPCR (determining copy number via the delta delta Ct method)?
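As a rough illustration of the delta delta Ct arithmetic, here is a minimal Python sketch. It assumes roughly 100% primer efficiency (a doubling per cycle), a reference gene that is known to be at normal copy number in both DNAs, and a diploid normal control; the function name and the Ct values are hypothetical.

```python
def copy_number_ddct(ct_target_sample, ct_ref_sample,
                     ct_target_normal, ct_ref_normal,
                     normal_copies=2):
    """Estimate copy number of a target exon by the delta delta Ct method.

    Assumptions: ~100% amplification efficiency, and the normal control
    carries `normal_copies` copies of the target (2 for an autosomal gene).
    """
    # Delta Ct: target relative to the reference gene, within each DNA.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_normal = ct_target_normal - ct_ref_normal
    # Delta delta Ct: test sample relative to the normal control.
    ddct = delta_sample - delta_normal
    # One extra cycle corresponds to half the starting template.
    return normal_copies * 2 ** (-ddct)

# Made-up Ct values: the target comes up one cycle later in the test sample.
print(copy_number_ddct(25.0, 24.0, 24.0, 24.0))
```

In practice each Ct would be the mean of technical replicates, and testing several exons of the gene helps to see whether a change affects all of it or only part.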
Good idea. I will give qPCR a try. Thanks !