10.0 years ago
manojkumar_bhosale
Hi,
Q1) I am using VarScan to detect CNV and LOH variants in a tumor sample with a matched normal sample. I generated two pileup files from the sample BAM files using samtools. Below is the command I am using to generate the copynumber file. It ran for a while, generated a copynumber file (~2 MB), and then threw the exception shown below.
I would really appreciate it if somebody could tell me what mistake I have made here.
$ java -jar VarScan.v2.3.7.jar copynumber normal.pileup tumor.pileup output.basename
Normal Pileup: normal.pileup
Tumor Pileup: tumor.pileup
Min coverage: 10
Min avg qual: 15
P-value thresh: 0.01
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 5
at net.sf.varscan.Copynumber.<init>(Copynumber.java:693)
at net.sf.varscan.VarScan.copynumber(VarScan.java:317)
at net.sf.varscan.VarScan.main(VarScan.java:198)
Q2) When I use the same files as a combined mpileup, the parsing exception below is thrown:
$ java -jar VarScan.v2.3.7.jar copynumber normal_tumor.pileup testTN --mpileup 1
Min coverage: 10
Min avg qual: 15
P-value thresh: 0.01
Reading input from normal_tumor.pileup
Reading mpileup input...
Parsing Exception on line:
chr1 20141883 A 1 .$ > 0
Please help me resolve these errors.
How did you generate the mpileup files?
Thanks for the prompt reply!
Here are the commands I used to generate mpileup files
P.S: All of the above mpileup files were generated without any error
Please see "Problem generating Varscn2 copynumber output", which this question duplicates.
You've passed in a line with no coverage in the right-most sample. You should filter such lines out before passing the file to VarScan.
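The filtering step suggested above can be sketched in Python. This is only a sketch under stated assumptions, not VarScan's own logic: it assumes a tab-separated mpileup with three columns per sample (depth, bases, qualities) after the first three, so the depth of sample 1 is in column 4 and that of sample 2 in column 7. The function names `has_coverage_in_all_samples` and `filter_mpileup` are made up for illustration.

```python
def has_coverage_in_all_samples(line, n_samples=2):
    """Return True if every sample has a depth > 0 on this mpileup line.

    Assumed layout (tab-separated): chrom, pos, ref, then for each sample
    three columns: depth, read bases, base qualities.
    """
    fields = line.rstrip("\n").split("\t")
    for i in range(n_samples):
        depth_idx = 3 + 3 * i  # 0-based index of this sample's depth column
        if depth_idx >= len(fields):
            return False  # the sample's columns are missing entirely
        try:
            if int(fields[depth_idx]) <= 0:
                return False
        except ValueError:
            return False  # depth column is not an integer: malformed line
    return True


def filter_mpileup(lines, n_samples=2):
    """Keep only the lines where all samples have coverage."""
    return [ln for ln in lines if has_coverage_in_all_samples(ln, n_samples)]


if __name__ == "__main__":
    import sys
    # Usage: python filter_mpileup.py < normal_tumor.pileup > filtered.pileup
    for ln in sys.stdin:
        if has_coverage_in_all_samples(ln):
            sys.stdout.write(ln)
```

With this layout, the line from the parsing exception (depth 1 in the normal, depth 0 in the tumor) would be dropped, because the zero sits in column 7 rather than column 4.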
Thanks a lot !! The program runs fine now.
But will removing loci with zero coverage add some bias to the copy number results?
Not at that line. You've implicitly told VarScan to only consider sites with a minimum coverage of 10, so it should disregard that site anyway, since the total coverage is 1.
Unless your tumour samples are completely devoid of fibroblasts, immune cells, etc., you should have some coverage at any site that has decent depth in your normal sample, so I wouldn't worry about this as a source of bias.
Identifying large regions that have zero coverage in your tumour and good depth in your normal sample should be really simple if you want to double check though.
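One way to do that double check is sketched below. It assumes you have already parsed the mpileup into `(chrom, pos, normal_depth, tumor_depth)` tuples sorted by position. The function name `zero_tumor_regions` and the thresholds `min_normal_depth` and `min_length` are illustrative choices, not VarScan parameters.

```python
def zero_tumor_regions(sites, min_normal_depth=10, min_length=1000):
    """Find runs of consecutive positions with zero tumour depth but
    at least `min_normal_depth` reads in the normal sample.

    `sites` is an iterable of (chrom, pos, normal_depth, tumor_depth),
    sorted by chromosome and position. Returns (chrom, start, end) tuples
    for runs spanning at least `min_length` bases.
    """
    regions = []
    current = None  # [chrom, start, end] of the run being extended

    def flush():
        # Record the current run if it is long enough.
        if current is not None and current[2] - current[1] + 1 >= min_length:
            regions.append(tuple(current))

    for chrom, pos, normal_depth, tumor_depth in sites:
        flagged = normal_depth >= min_normal_depth and tumor_depth == 0
        extends = (
            current is not None
            and current[0] == chrom
            and pos == current[2] + 1
        )
        if flagged and extends:
            current[2] = pos  # grow the run by one base
            continue
        flush()
        current = [chrom, pos, pos] if flagged else None

    flush()
    return regions
```

Any region this reports would be worth inspecting by hand before deciding whether it reflects a real deletion or a coverage artefact.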
Could you please elaborate on what you mean by "a line with no coverage"? As per my understanding, the 4th column gives the number of reads covering a particular site. I changed all rows that had a 0 in column 4 to a 1 but that did not solve my problem. I am still getting the same error.
Thanks in advance for your help :)