Question

Varscan, Using The Copycaller

1

13.3 years ago

Mdeng ▴ 530

Hi Folks,

After reading this question, I downloaded the VarScan program, but have some questions about how to use it.

I have 10 samples, two from each patient (one tumor, one normal). The data are from a SOLiD 4 sequencer: single-end reads, 50 bp in length.

It seems to be possible to call CNVs using VarScan's copyCaller, but how do I use it? Running java -jar VarScan.v2.2.5.jar copyCaller -h prints:

USAGE: VarScan copyCaller [output.copynumber] OPTIONS
    OPTIONS:
    --regions-file    A list of regions (e.g. exons) to use for segmentation
    --output-file    Output file to contain the calls
    --min-coverage    Minimum read depth at a position to make a call [10]
    --min-region-size    Minimum size (in bases) for a region to be counted [10]

I am really not sure about the syntax and the input.

What is the input? BAM files? Pileup files? Another format? If so, what does it look like?
How do I use it? For example, varscan copycaller normal.bam tumor.bam output.bam (or the same with pileup files)?
Or is there another workflow, perhaps with pre- and post-analysis steps?

With best,
Mario

cnv varscan • 7.7k views

ADD COMMENT • link updated 2.8 years ago by Ram 44k • written 13.3 years ago by Mdeng ▴ 530

1

Not a direct answer to your question, but I wrote this little Python script, http://pypi.python.org/pypi/ngCGH, for performing this analysis starting from BAM files. It includes a script that uses R and the DNAcopy package to segment the results.

ADD REPLY • link updated 2.8 years ago by Ram 44k • written 13.3 years ago by Sean Davis 27k

Ram · Answer 1 · 2011-07-29

3

13.3 years ago

Dan Koboldt ▴ 60

Mario,

You should actually use the "VarScan copynumber" command, not "copyCaller", as the latter expects an old format. The input should be:

pileup or mpileup for normal sample
pileup or mpileup for tumor sample

The command usage is like this:

java -jar VarScan.jar copynumber [normal.pileup] [tumor.pileup] output-basename
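For scripting this over several tumor/normal pairs, the command line above can be assembled programmatically. The following is only a sketch: the helper names and the file names are hypothetical, and the `samtools mpileup -f` step is an assumption about how the pileup files were produced, not something stated in this answer.

```python
from typing import List


def mpileup_command(reference_fasta: str, bam_path: str) -> List[str]:
    """Build a samtools mpileup call (assumed pileup source; output goes to stdout)."""
    return ["samtools", "mpileup", "-f", reference_fasta, bam_path]


def copynumber_command(jar_path: str, normal_pileup: str,
                       tumor_pileup: str, output_basename: str) -> List[str]:
    """Build the VarScan copynumber call in the argument order given above."""
    return ["java", "-jar", jar_path, "copynumber",
            normal_pileup, tumor_pileup, output_basename]


# Hypothetical file names, for illustration only.
cmd = copynumber_command("VarScan.jar", "normal.pileup", "tumor.pileup", "patient1")
print(" ".join(cmd))

# To actually run it (requires Java and the VarScan jar on disk):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when sample names contain spaces or other special characters.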

The output of the above command will have the format:

chromosome chr_start chr_stop normal_depth tumor_depth log2_ratio
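A minimal sketch of reading this table in Python, assuming the file is whitespace-delimited with a header line that uses exactly the six column names listed above. Your VarScan version may name columns differently or emit additional ones, so check the header of your own output first. The function name and the two data rows are invented for illustration.

```python
import io
from typing import Dict, Iterator, TextIO


def read_copynumber(handle: TextIO) -> Iterator[Dict[str, object]]:
    """Yield one dict per region, looking columns up by header name."""
    header = None
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields  # first non-blank line is the header
            continue
        row = dict(zip(header, fields))
        yield {
            "chromosome": row["chromosome"],
            "chr_start": int(row["chr_start"]),
            "chr_stop": int(row["chr_stop"]),
            "normal_depth": float(row["normal_depth"]),
            "tumor_depth": float(row["tumor_depth"]),
            "log2_ratio": float(row["log2_ratio"]),
        }


# Made-up example rows, not real VarScan output.
example = io.StringIO(
    "chromosome\tchr_start\tchr_stop\tnormal_depth\ttumor_depth\tlog2_ratio\n"
    "chr1\t1000\t1099\t20.0\t40.0\t1.0\n"
    "chr1\t1100\t1199\t20.0\t10.0\t-1.0\n"
)
regions = list(read_copynumber(example))
# The 0.5 cutoff is arbitrary and only demonstrates filtering.
gains = [r for r in regions if r["log2_ratio"] > 0.5]
print(len(regions), len(gains))
```

Because columns are looked up by name, extra columns in the real file are ignored rather than breaking the parser.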

You can feed this into a copy number segmentation program; I recommend the DNAcopy library from the Bioconductor project.

Please e-mail me or post to the VarScan SourceForge forum if you have further questions.

Sincerely,
Dan Koboldt
dkoboldt [at] genome [dot] wustl [dot] edu

ADD COMMENT • link updated 2.8 years ago by Ram 44k • written 13.3 years ago by Dan Koboldt ▴ 60

0

Great,

I am new to VarScan and did not know about the copynumber command. On your website it's only listed in the Javadoc, isn't it?

Thanks a lot,
Mario

ADD REPLY • link updated 2.8 years ago by Ram 44k • written 13.3 years ago by Mdeng ▴ 530

0

Hey,

I am perfectly fine now. Everything worked. Now I am looking for a way to visualise this data. I think the R plots from the DNAcopy package don't look good with very large datasets.

Is there maybe a way to load this data into IGV, or to use another kind of plot?

With best,
Mario

ADD REPLY • link updated 2.8 years ago by Ram 44k • written 13.3 years ago by Mdeng ▴ 530
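One possible route, offered as a sketch rather than an answer confirmed in this thread: IGV can load segmented copy-number data from tab-delimited SEG files whose columns match DNAcopy's segment table (ID, chrom, loc.start, loc.end, num.mark, seg.mean). The function name, sample ID, and segment values below are hypothetical.

```python
import csv
import io
from typing import Iterable, TextIO, Tuple

# (sample ID, chromosome, start, end, number of markers, mean log2 ratio)
Segment = Tuple[str, str, int, int, int, float]


def write_seg(segments: Iterable[Segment], handle: TextIO) -> None:
    """Write segments as a tab-delimited SEG table with a header row."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["ID", "chrom", "loc.start", "loc.end", "num.mark", "seg.mean"])
    for sample_id, chrom, start, end, num_mark, seg_mean in segments:
        writer.writerow([sample_id, chrom, start, end, num_mark, f"{seg_mean:.4f}"])


# Invented segments, for illustration only.
segments = [
    ("patient1_tumor", "chr1", 1000, 250000, 120, 0.8532),
    ("patient1_tumor", "chr1", 250001, 900000, 310, -0.0214),
]
buffer = io.StringIO()
write_seg(segments, buffer)
seg_text = buffer.getvalue()
print(seg_text)
```

In practice the handle would be a file opened for writing with a `.seg` extension, one row per DNAcopy segment, which can then be loaded as a track in IGV.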

0

Hi Mdeng,

Could you please share how you ended up visualizing your results from such large data?

cheers

ADD REPLY • link updated 2.8 years ago by Ram 44k • written 9.3 years ago by Chirag Nepal ★ 2.4k