Question

TCGA data analysis on R Studio -- result would be too long a vector

6.9 years ago

freuv ▴ 20

Hi,

I am running the 'Preprocessing of Gene Expression data (IlluminaHiSeq_RNASeqV2)' and 'TCGAanalyze_SurvivalKM: Correlating gene expression and Survival Analysis' R commands as-is from the Bioconductor page for TCGAbiolinks (http://bioconductor.org/packages/release/bioc/vignettes/TCGAbiolinks/inst/doc/analysis.html#tcgaanalyze_survivalkm:_correlating_gene_expression_and_survival_analysis).

However, I run into the following error when running this command (as-is, from the manual) in RStudio.

for( i in 1: round(nrow(dataBRCAcomplete)/100)){
    message( paste( i, "of ", round(nrow(dataBRCAcomplete)/100)))
    tokenStart <- tokenStop
    tokenStop <-100*i
    tabSurvKM<-TCGAanalyze_SurvivalKM(clinical_patient_Cancer,
                                      dataBRCAcomplete,
                                      Genelist = rownames(dataBRCAcomplete)[tokenStart:tokenStop],
                                      Survresult = F,
                                      ThreshTop=0.67,
                                      ThreshDown=0.33)

    tabSurvKMcomplete <- rbind(tabSurvKMcomplete,tabSurvKM)
}

Error: Error in 1:lastelementTOP : result would be too long a vector

Since I am using the example provided by Bioconductor, I am not sure what the problem is.

Any help would be much appreciated!
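
For reference, base R raises this message when the sequence implied by `from:to` is not a representable vector length, which includes a non-finite endpoint. The sketch below reproduces the message in isolation. That `lastelementTOP` becomes non-finite inside `TCGAanalyze_SurvivalKM` (for example via `max()` over an empty selection) is an assumption here, not something confirmed from the package source, and `last_element` is a hypothetical stand-in for it.

```r
# Minimal sketch: reproduce the error message without TCGAbiolinks.
# `last_element` is a hypothetical stand-in for lastelementTOP.
empty_selection <- integer(0)

# max() of an empty vector returns -Inf (normally with a warning)
last_element <- suppressWarnings(max(empty_selection))

# The colon operator cannot build a sequence to a non-finite endpoint,
# so capture the error message instead of stopping the script.
err_msg <- tryCatch(
  {
    1:last_element
    NA_character_
  },
  error = function(e) conditionMessage(e)
)

print(is.finite(last_element))
print(err_msg)
```

If this is what happens, the message would point at an empty or unexpected intermediate result rather than at the size of the data, so checking the inputs (sample barcodes, gene names, clinical table) may be worth doing before looking at memory.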

RNA-Seq R TCGA cancer • 6.4k views

ADD COMMENT • link updated 6.7 years ago by Biostar 20 • written 6.9 years ago by freuv ▴ 20

Have you additionally executed the following before the for loop:

clinical_patient_Cancer <- GDCquery_clinic("TCGA-BRCA","clinical")
dataBRCAcomplete <- log2(BRCA_rnaseqv2)

tokenStop<- 1

tabSurvKMcomplete <- NULL

ADD REPLY • link 6.9 years ago by Kevin Blighe 88k

Yes, I executed their example script as is.

ADD REPLY • link 6.9 years ago by freuv ▴ 20

Okay, how much free RAM have you got? Is it a 32- or 64-bit machine? Which R version, and which operating system and version?

ADD REPLY • link 6.9 years ago by Kevin Blighe 88k

Hi Kevin, sorry for the late response -- I did not see your message. I'm running RStudio on a Mac (Sierra), R version 3.4.3, 64-bit.

ADD REPLY • link 6.9 years ago by freuv ▴ 20

Maybe 2GB of free RAM?

ADD REPLY • link 6.9 years ago by freuv ▴ 20

That may not be enough. I have 16GB RAM on my personal laptop. Can you try to reduce the size of the data and at least see if the code runs to completion?

ADD REPLY • link 6.9 years ago by Kevin Blighe 88k

Is there a way to split a matrix by nrows and write to n new matrices?

ADD REPLY • link 6.9 years ago by freuv ▴ 20

You could just take the first 500 rows as a test, like this:

matTest <- MyMatrix[1:500, ]
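
To answer the literal question as well, here is a minimal sketch of splitting a matrix into consecutive blocks of rows. The names `my_matrix` and `chunk_size` and the toy dimensions are illustrative assumptions, not taken from the vignette.

```r
# Minimal sketch: split a matrix into consecutive blocks of rows.
# `my_matrix` and `chunk_size` are illustrative names and values.
my_matrix <- matrix(seq_len(1050 * 3), nrow = 1050,
                    dimnames = list(paste0("gene", seq_len(1050)), NULL))
chunk_size <- 500

# Assign each row number to a block: rows 1-500 -> 1, 501-1000 -> 2, ...
row_ids <- seq_len(nrow(my_matrix))
chunk_index <- ceiling(row_ids / chunk_size)

# Build a list of sub-matrices; drop = FALSE keeps a one-row block a matrix
chunks <- lapply(split(row_ids, chunk_index),
                 function(rows) my_matrix[rows, , drop = FALSE])

# Number of rows in each block
print(vapply(chunks, nrow, integer(1)))
```

Because the blocks are built from row indices, they do not overlap and the last, shorter block is not dropped. The row names of each block could then be supplied as `Genelist`, as in the loop from the question.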

ADD REPLY • link 6.9 years ago by Kevin Blighe 88k

This is the output on the test:

1 of  5
0.
2 of  5
97.96.95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
3 of  5
96.95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
4 of  5
100.99.98.97.96.95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
5 of  5
95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
>

This results in empty tabSurvKM and tabSurvKMcomplete tables.

ADD REPLY • link updated 6.9 years ago by GenoMax 148k • written 6.9 years ago by freuv ▴ 20

I would contact the developers of the package. In many situations, packages are not updated for new versions of R, and/or other dependency issues arise as new packages are released on Bioconductor without adequate testing. To further compound the problem, the TCGA consortium has been shifting their data around, and one finds that links on the government-hosted websites that host the data are broken.

I believe that the contact for TCGAbiolinks is Tiago Silva in São Paulo, Brazil, which I frequently pass through.

ADD REPLY • link 6.9 years ago by Kevin Blighe 88k

Oh, just one thing: please try it outside RStudio (in 'regular' R). I never use RStudio because it adds one extra layer to my analyses that could cause problems.

ADD REPLY • link 6.9 years ago by Kevin Blighe 88k

This is a good idea. Thanks for your input -- I will update progress here.

ADD REPLY • link 6.9 years ago by freuv ▴ 20

How did it go?

ADD REPLY • link 6.9 years ago by Kevin Blighe 88k