Problems After Loading VarScan 2 Output Into Bioconductor DNAcopy
11.7 years ago
sousuffer ▴ 20

I am a bit new to this, but I was able to generate copy number calls (and GC-adjusted calls). I then moved on to the segmentation step of the workflow and successfully imported the file for the Bioconductor DNAcopy package using the command:

cn=read.table("varScan.copynumber.called",header=F)

I am having trouble executing the next step:

CNA.object <-CNA(genomdat = cn[,6], chrom = cn[,1], maploc = cn[,2], data.type = 'logratio') 
Error in CNA(genomdat = cn[, 6], chrom = cn[, 1], maploc = cn[, 2], data.type = "logratio") : genomdat must be numeric

I looked at cn[,6] and it contains entries such as:

[99889] 34.8 19.0 46.8 55.1 56.6 52.6 
[99895] 29.7 44.4 33.8 30.6 21.0 40.7 
[99901] 42.4 46.5 65.4 98.7 82.5 83.6

These are numeric, so I cannot figure out what my problem is. Any help would be greatly appreciated.

varscan copynumber cnv exome sequencing • 4.9k views

Have you tried replacing cn[, 6] with as.numeric(cn[, 6]) ?

11.7 years ago

Data types are a little confusing in R. Those numbers are probably being read in as factors. I'd try this:

#Do a quick sanity check to make sure the conversion works as expected:
head(as.numeric(cn[,6]),50)

#if that looks good, change your command to:
CNA.object <-CNA(genomdat = as.numeric(cn[,6]), chrom = cn[,1], maploc = cn[,2], data.type = 'logratio')

Alternatively, you could also specify the column types when you read the data in. Something like this:

cn=read.table("varScan.copynumber.called",header=F,colClasses=c("character","numeric","numeric","numeric","numeric","numeric"))
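A minimal sketch of the column-type approach, assuming the file is tab-separated and starts with a header row. The file contents and column names below are made up for illustration and are not taken from a real VarScan run; the real file may have a different number of columns, in which case colClasses needs one entry per column.

```r
# Write a tiny stand-in file; names and values are invented for illustration.
tmp <- tempfile(fileext = ".called")
writeLines(c(
  "chrom\tchr_start\tchr_stop\tnum_positions\tnormal_depth\ttumor_depth",
  "chr1\t100\t200\t101\t30.5\t21.3",
  "chr1\t201\t300\t100\t28.0\t31.6",
  "chr2\t100\t250\t151\t35.2\t24.9"
), tmp)

# header = TRUE keeps the column names out of the data;
# colClasses must list one class per column in the file.
cn <- read.table(tmp, header = TRUE, sep = "\t",
                 colClasses = c("character", rep("numeric", 5)))

# Every column after the first should now be numeric.
str(cn)
```

If a header row is present but header = FALSE is used, the same colClasses will make scan() fail on the first line, because the header strings cannot be parsed as numbers.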

Hi Chris,

I ran into the same problem as sousuffer. As you suggested, I changed "genomdat = cn[,6]" to "genomdat = as.numeric(cn[,6])" and did the same for maploc, and the problem was solved.

Thank you very much.


Hi Chris,

Using as.numeric got me past the "genomdat must be numeric" error, but my results are now very strange. The "seg.mean" values are all several hundred, whereas in other people's results this value is around -3 to 3 or so. Do you have any suggestions for getting a proper seg.mean value?

Here are my commands (mostly following the lines you provided on the VarScan website):

cn<-read.table("varScan.copynumber2.called",header=F)
CNA.object <-CNA(genomdat = as.numeric(cn[,7]), chrom = cn[,1], maploc =as.numeric(cn[,2]), data.type = 'logratio')
CNA.smoothed <- smooth.CNA(CNA.object)
segs <- segment(CNA.smoothed, verbose=0, min.width=2)
segs2 = segs$output
write.table(segs2[,2:6], file="use.logratio.cn.out", row.names=F, col.names=T, quote=F, sep="\t")

Thank you for your time.


Without seeing any of your data, there's really no way to tell, but one thing that jumps out is that you've replaced as.numeric(cn[,6]) with as.numeric(cn[,7]). Are you sure you're extracting the right columns?
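One way to check the columns, sketched on a made-up table rather than real VarScan output. The column names (tumor_depth, adjusted_log_ratio) and values are assumptions for illustration. The table is read with header = FALSE and stringsAsFactors = TRUE on purpose, to mimic the situation where the header row ends up in the data and every column becomes a factor.

```r
# Made-up stand-in: the header row is read as data, so all columns are factors.
cn <- read.table(text = "
chrom chr_start tumor_depth adjusted_log_ratio
chr1 100 21.3 -0.152
chr1 201 31.6 0.431
chr2 100 24.9 0.012
", header = FALSE, stringsAsFactors = TRUE)

print(sapply(cn, class))   # all "factor" here: a hint that the header was read as data
print(cn[1, ])             # the first row shows which field each column holds

# Drop the header row and convert through character before looking at the range.
lr <- as.numeric(as.character(cn[-1, 4]))
print(range(lr))           # log ratios should be small; values in the hundreds are suspect
```

If the range comes back in the hundreds, the values are more likely factor level codes or read depths than log ratios.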


Sorry. This image shows my input file for DNAcopy. As you can see, column 7 is the log ratio: https://preview.ibb.co/dk7aix/TIM_20180205203444.jpg (linked from https://ibb.co/jf3sVc)

And this is the DNAcopy result: https://image.ibb.co/fzJy3x/TIM_20180205203947.jpg

As you can see, the values in the seg.mean column are several hundred.

11.7 years ago
sousuffer ▴ 20

I ran your command and got the below output:

head(as.numeric(cn[,6]),50)
 [1] 1667  880  983  916  983  998  922  915  906  934  221  895  953  820  553
[16]  663  886  995  903  222  908 1214 1414 1427  444  884  955  229  850  822
[31]  338 1106  886 1579  498 1515  853  868  902  665  920 1189  930  757 1409
[46]  405 1372  554  827  910

I'm not sure what "if that looks good" implies, but since the numeric values do not match the actual decimal values, I'm guessing this isn't what we want. For comparison, I then looked at the column before conversion:

head((cn[,6]),50)
 [1] tumordepth 21.3  31.6  24.9  31.6  33.1
 [7] 25.5  24.8  23.9  26.7  12.0  22.8
[13] 28.6  18.0  15.2  16.5  21.9  32.8
[19] 23.6  12.1  24.1  54.7  74.7  76.0
[25] 14.3  21.7  28.8  12.8  19.4  18.2
[31] 13.7  43.9  21.9  91.2  144.7 84.8
[37] 19.7  20.2  23.5  16.7  25.3  52.2
[43] 26.3  17.4  74.2  136.4 70.5  15.3
[49] 18.7  24.3
1667 Levels: 10.0 10.1 10.2 10.3 10.4 10.5 10.6 10.7 10.8 10.9 100.0 ... tumordepth

Finally, I tried the alternate code and got the following:

cn=read.table("varScan.copynumber.called",header=F,colClasses=c("character","numeric","numeric","numeric","numeric","numeric"))
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'a real', got 'chr_start'

I'm guessing the first line headers are messing this up?

Thanks!


In the first method, try as.numeric(as.character(cn[,6])); calling as.numeric directly on a factor returns its internal level codes rather than the values. In the second, why did you set "header=F" if there is a header row? Change that to "header=T" and see if it works. Also make sure that the number of entries in colClasses matches the number of columns in the file you're reading.
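A small illustration of the difference, using a made-up factor in place of the real column. The values are invented, with a header string mixed in to mimic what happens when the header row is read as data. On R 4.0 and later, read.table no longer creates factors by default, so the column would arrive as character instead, but the header string would still block a clean numeric conversion.

```r
# Simulated column where the header string was read in as data.
depth <- factor(c("tumordepth", "21.3", "31.6", "24.9", "31.6"))

# as.numeric() on a factor returns the integer level codes, not the values.
codes <- as.numeric(depth)

# Going through character first parses the actual numbers; the header string
# cannot be parsed and becomes NA (the warning is suppressed here on purpose).
values <- suppressWarnings(as.numeric(as.character(depth)))

print(codes)
print(values)
```

The level codes depend only on the sort order of the distinct strings, which would explain why the converted column in the thread shows integers up to 1667 (the number of levels) instead of the decimal depths.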
