Question

Problems After Loading Varscan 2 Output Into Bioconductor.Dnacopy

11.7 years ago

sousuffer ▴ 20

I am a bit new to this, but I was able to generate copy number calls (and GC-adjusted calls). I then moved on to the segmentation step of the workflow and successfully read the file into R for the Bioconductor DNAcopy package using the command:

cn=read.table("varScan.copynumber.called",header=F)

I am having trouble executing the next step:

CNA.object <-CNA(genomdat = cn[,6], chrom = cn[,1], maploc = cn[,2], data.type = 'logratio') 
Error in CNA(genomdat = cn[, 6], chrom = cn[, 1], maploc = cn[, 2], data.type = "logratio") : genomdat must be numeric

I looked at cn [,6] and it contains entries such as:

[99889] 34.8 19.0 46.8 55.1 56.6 52.6 
[99895] 29.7 44.4 33.8 30.6 21.0 40.7 
[99901] 42.4 46.5 65.4 98.7 82.5 83.6

These are numeric, so I cannot figure out what my problem is. Any help would be greatly appreciated.
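
A quick way to check how R actually stored a column is to look at its class rather than its printed values. The sketch below uses a small invented table (not the poster's data) in which a header row is read as data, which is one plausible cause of this error. `stringsAsFactors = TRUE` is set explicitly to mimic the default of R versions before 4.0.

```r
# Toy stand-in for a file whose header row is read as data.
# Column names and values are invented for illustration.
txt <- "chrom\tchr_start\ttumordepth\nchr1\t100\t34.8\nchr1\t200\t19.0\n"
cn_demo <- read.table(text = txt, header = FALSE, stringsAsFactors = TRUE)

class(cn_demo[, 3])       # "factor": the text header forced a non-numeric type
is.numeric(cn_demo[, 3])  # FALSE, which is the condition CNA() rejects
str(cn_demo)              # shows the stored type of every column at once
```

Printed factor values look just like numbers, so `class()` or `str()` is a more reliable check than inspecting the output by eye.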

varscan copynumber cnv exome sequencing • 4.9k views

ADD COMMENT • link updated 7.7 years ago by Biostar 20 • written 11.7 years ago by sousuffer ▴ 20

Have you tried replacing cn[, 6] with as.numeric(cn[, 6]) ?

ADD REPLY • link 11.7 years ago by Christof Winter ★ 1.0k

score 0 · Answer 1 · 2013-03-14

11.7 years ago

Chris Miller 22k

Data types are a little confusing in R. Those numbers are probably being read in as factors. I'd try this:

#Do a quick sanity check to make sure the conversion works as expected:
head(as.numeric(cn[,6]),50)

#if that looks good, change your command to:
CNA.object <-CNA(genomdat = as.numeric(cn[,6]), chrom = cn[,1], maploc = cn[,2], data.type = 'logratio')

Alternatively, you could also specify the column types when you read the data in. Something like this:

cn=read.table("varScan.copynumber.called",header=F,colClasses=c("character","numeric","numeric","numeric","numeric","numeric"))

ADD COMMENT • link 11.7 years ago by Chris Miller 22k

Hi Chris,

I ran into the same problem as sousuffer. As you suggested, I changed "genomdat = cn[,6]" to "genomdat = as.numeric(cn[,6])" and did the same for maploc. That solved the problem.

Thank you very much.

ADD REPLY • link 6.8 years ago by jinxinhao1988 ▴ 70

Hi Chris,

Using as.numeric got me past the "genomdat must be numeric" error, but my results now look very strange. The "seg.mean" values are all several hundred, whereas in other people's results this value is roughly between -3 and 3. Do you have any suggestions for getting proper seg.mean values?

Here are my commands (mostly following the lines you provided on the VarScan website):

cn<-read.table("varScan.copynumber2.called",header=F)
CNA.object <-CNA(genomdat = as.numeric(cn[,7]), chrom = cn[,1], maploc =as.numeric(cn[,2]), data.type = 'logratio')
CNA.smoothed <- smooth.CNA(CNA.object)
segs <- segment(CNA.smoothed, verbose=0, min.width=2)
segs2 = segs$output
write.table(segs2[,2:6], file="use.logratio.cn.out", row.names=F, col.names=T, quote=F, sep="\t")

Thank you for your time.

ADD REPLY • link 6.8 years ago by jinxinhao1988 ▴ 70

Without seeing any of your data, there's really no way to tell, but one thing that jumps out is that you've replaced as.numeric(cn[,6]) with as.numeric(cn[,7]). Are you sure you're extracting the right columns?

ADD REPLY • link 6.8 years ago by Chris Miller 22k

Sorry. This image shows my input file for DNAcopy. As you can see, column 7 is the log ratio: https://ibb.co/jf3sVc (preview: https://preview.ibb.co/dk7aix/TIM_20180205203444.jpg)

And this is the result from DNAcopy: https://image.ibb.co/fzJy3x/TIM_20180205203947.jpg

As you can see, the values in the seg.mean column are all several hundred.
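
One possible explanation, offered as an assumption since the data themselves are not shown: if the column was read as a factor, as.numeric() returns the integer level codes rather than the printed values, and that would produce segment means in the hundreds. A minimal sketch with invented values:

```r
# Invented values; a stray header string makes the whole column a factor.
x <- factor(c("tumordepth", "21.3", "31.6", "24.9"))

as.numeric(x)   # integer level codes, not the original numbers

# Converting through character recovers the values; the header becomes NA
# (a coercion warning is expected and is suppressed here).
vals <- suppressWarnings(as.numeric(as.character(x)))
vals
```

In R 4.0 and later, read.table() no longer creates factors by default, so the same file would yield a character column, and as.numeric() would return NA for the header row instead of a level code.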

ADD REPLY • link 6.8 years ago by jinxinhao1988 ▴ 70

score 0 · Answer 2 · 2013-03-14

I ran your command and got the below output:

head(as.numeric(cn[,6]),50)
 [1] 1667  880  983  916  983  998  922  915  906  934  221  895  953  820  553
[16]  663  886  995  903  222  908 1214 1414 1427  444  884  955  229  850  822
[31]  338 1106  886 1579  498 1515  853  868  902  665  920 1189  930  757 1409
[46]  405 1372  554  827  910

I'm not sure what "if that looks good" means here, but since the converted values do not match the actual decimal values, I'm guessing this isn't what we want. For comparison, I then printed the column before conversion:

head((cn[,6]),50)
[1] tumordepth 21.3 31.6 24.9 31.6 33.1
[7] 25.5 24.8 23.9 26.7 12.0 22.8
[13] 28.6 18.0 15.2 16.5 21.9 32.8
[19] 23.6 12.1 24.1 54.7 74.7 76.0
[25] 14.3 21.7 28.8 12.8 19.4 18.2
[31] 13.7 43.9 21.9 91.2 144.7 84.8
[37] 19.7 20.2 23.5 16.7 25.3 52.2
[43] 26.3 17.4 74.2 136.4 70.5 15.3
[49] 18.7 24.3
1667 Levels: 10.0 10.1 10.2 10.3 10.4 10.5 10.6 10.7 10.8 10.9 100.0 ... tumordepth

Finally, I tried the alternate code and got the following:

cn=read.table("varScan.copynumber.called",header=F,colClasses=c("character","numeric","numeric","numeric","numeric","numeric"))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  scan() expected 'a real', got 'chr_start'

I'm guessing the first line headers are messing this up?
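
That guess fits both outputs above: "tumordepth" shows up as a data value and scan() stops on 'chr_start'. Below is a minimal sketch of reading a file with header=TRUE so the first line becomes column names. The temporary file, its values, and the "log_ratio" column name are invented for illustration; only "chr_start" and "tumordepth" appear in the output above. The DNAcopy call is skipped if the package is not installed.

```r
# Invented two-row file with a header line, written to a temporary path.
tmp <- tempfile(fileext = ".called")
writeLines(c("chrom\tchr_start\ttumordepth\tlog_ratio",
             "chr1\t100\t34.8\t0.21",
             "chr1\t200\t19.0\t-0.58"), tmp)

# header = TRUE consumes the first line as column names,
# so the remaining values are parsed as numbers.
cn_demo <- read.table(tmp, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
sapply(cn_demo, class)

# With numeric columns, CNA() needs no as.numeric() wrapper.
if (requireNamespace("DNAcopy", quietly = TRUE)) {
  cna_demo <- DNAcopy::CNA(genomdat = cn_demo$log_ratio,
                           chrom = cn_demo$chrom,
                           maploc = cn_demo$chr_start,
                           data.type = "logratio")
}
```

Since column 6 is headed "tumordepth" in the output above, it may also be worth checking by name which column holds the log ratio before passing it as genomdat.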

Thanks!