unable to produce cnr files
5 months ago
ww22runner ▴ 60

Hello,

I am currently using the latest Docker image of CNVkit to run:

cnvkit.py batch $CNV_BAMS/*_T.bam \
        --normal $CNV_BAMS/*_G.bam \
        --targets $BEDFILE_0B_BAITS \
        --fasta $REF_GENOME_b37 \
        --access /data/access-5k-mappable.grch37.bed \
        --output-reference $CNV_BAMS/my_reference.cnn \
        --output-dir $CNV_T_RESULTS \
        --diagram \
        --scatter \
        -p 8 \
        --cluster

However, while some runs complete, in others the log stops after the .cnn files are written and no .cnr file is produced:

Percent reads in regions: 92.708 (of 13203032 mapped)
Wrote sample_G.targetcoverage.cnn with 9583 regions
Processing reads in sample_G.bam
Time: 4.344 seconds (0 reads/sec, 4394 bins/sec)
Summary: #bins=19087, #reads=1, mean=0.0001, min=0.0, max=1.89
Percent reads in regions: 0.000 (of 13203032 mapped)
Wrote sample_G.antitargetcoverage.cnn with 19087 regions
Processing target: sample_G
Keeping 8419 of 9583 bins
Correcting for GC bias...
Correcting for density bias...
Processing antitarget: sample_G
Keeping 1 of 19087 bins
Correcting for GC bias...
ALL DONE

Upon debugging, it appears that when the do_fix function is called on the antitarget file, the assert statement in the _width2wing function in smoothing.py fails, and the program terminates without printing any error message. Interestingly, if I use a reference.cnn produced by another run (with a different set of normal samples), the .cnr files are produced.
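To illustrate how a single retained bin could trip an assertion of this kind, here is a minimal sketch. It is not CNVkit's actual code: `half_window`, `try_half_window`, and the `min_wing` default are hypothetical names and values chosen for illustration. The assumption is that a smoothing helper clamps the half-window to the number of bins minus one and then asserts that the result is at least 1.

```python
import math


def half_window(width, n_bins, min_wing=3):
    """Hypothetical stand-in for a smoothing helper (NOT CNVkit's code).

    Converts a window width (fraction of the data, or an absolute
    integer) into a half-width, clamped so it cannot exceed the data.
    """
    if 0 < width < 1:
        # Fractional width: half of that share of the bins, rounded up.
        wing = math.ceil(n_bins * width * 0.5)
    else:
        # Absolute width: integer half of the window.
        wing = int(width) // 2
    wing = max(wing, min_wing)      # enforce a minimum half-width
    wing = min(wing, n_bins - 1)    # but never reach past the data
    assert wing >= 1, f"half-window must be at least 1 (got {wing})"
    return wing


def try_half_window(width, n_bins):
    """Return the half-width, or the assertion message if it fails."""
    try:
        return half_window(width, n_bins)
    except AssertionError as exc:
        return f"AssertionError: {exc}"


print(try_half_window(0.1, 8419))  # thousands of bins: passes
print(try_half_window(0.1, 1))     # one bin: clamped to 0, assertion fails
```

In this sketch zero bins would fail as well. Presumably the zero-bin case is handled before smoothing is reached, since the successful log prints a warning instead; that is an inference from the logs, not something confirmed in the source.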

Usually, when CNVkit runs successfully, the antitarget step keeps 0 of the bins and warns that most bins have low coverage, like this:

Processing target: sample_G
Keeping 8415 of 9583 bins
Correcting for GC bias...
Correcting for density bias...
Processing antitarget: sample_G
Keeping 0 of 19087 bins
WARNING: most bins have no or very low coverage; check that the right BED file was used
Correlations with each cluster:
        log2    : 0.9580718621239102
        log2_1  : 0.9575440273517433
        log2_4  : 0.954962901257862
        log2_2  : 0.764261436668614
        log2_3  : 0.7586570106339138
 -> Choosing columns 'log2' and 'spread'
Wrote sample_G.cnr with 8415 regions
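The visible difference between the two logs is the number of antitarget bins kept (1 versus 0). A minimal sketch for scanning logs for that pattern follows. It assumes the log lines have exactly the format shown above; the function names and the `max_kept` threshold are hypothetical choices for illustration, not part of CNVkit.

```python
import re

# Patterns match the log lines exactly as they appear in the excerpts above.
STAGE_RE = re.compile(r"^Processing (target|antitarget): (\S+)")
KEEP_RE = re.compile(r"^Keeping (\d+) of (\d+) bins")


def kept_bins(log_text):
    """Return (kind, sample, kept, total) tuples found in a log."""
    results = []
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        stage = STAGE_RE.match(line)
        if stage:
            current = (stage.group(1), stage.group(2))
            continue
        keep = KEEP_RE.match(line)
        if keep and current:
            results.append((*current, int(keep.group(1)), int(keep.group(2))))
            current = None
    return results


def suspicious(results, max_kept=10):
    """Flag antitarget entries with a few, but not zero, bins kept.

    max_kept is an arbitrary illustrative threshold.
    """
    return [r for r in results if r[0] == "antitarget" and 0 < r[2] <= max_kept]


failing_log = """\
Processing target: sample_G
Keeping 8419 of 9583 bins
Correcting for GC bias...
Correcting for density bias...
Processing antitarget: sample_G
Keeping 1 of 19087 bins
Correcting for GC bias...
"""

print(suspicious(kept_bins(failing_log)))
# [('antitarget', 'sample_G', 1, 19087)]
```

Running this over the logs of all runs would show whether every failing run has a small nonzero antitarget count, which would support the single-bin explanation.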

I am unable to debug beyond this point and would appreciate any advice! Thank you.
