TCGA data analysis in RStudio -- result would be too long a vector
6.9 years ago
freuv ▴ 20

Hi,

I am running the 'Preprocessing of Gene Expression data (IlluminaHiSeq_RNASeqV2)' and 'TCGAanalyze_SurvivalKM: Correlating gene expression and Survival Analysis' R-commands as-is from the Bioconductor page for TCGAbiolinks (http://bioconductor.org/packages/release/bioc/vignettes/TCGAbiolinks/inst/doc/analysis.html#tcgaanalyze_survivalkm:_correlating_gene_expression_and_survival_analysis)

However, I run into the following error when running this command (as-is, from the manual) in RStudio.

for( i in 1: round(nrow(dataBRCAcomplete)/100)){
    message( paste( i, "of ", round(nrow(dataBRCAcomplete)/100)))
    tokenStart <- tokenStop
    tokenStop <-100*i
    tabSurvKM<-TCGAanalyze_SurvivalKM(clinical_patient_Cancer,
                                      dataBRCAcomplete,
                                      Genelist = rownames(dataBRCAcomplete)[tokenStart:tokenStop],
                                      Survresult = F,
                                      ThreshTop=0.67,
                                      ThreshDown=0.33)

    tabSurvKMcomplete <- rbind(tabSurvKMcomplete,tabSurvKM)
}

Error: Error in 1:lastelementTOP : result would be too long a vector

Since I am using the example provided by Bioconductor, I am not sure what the problem is.

Any help would be much appreciated!
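
For reference, base R raises this message whenever the `:` operator is asked to build a sequence longer than a vector can hold, for example when the upper bound is enormous or infinite. The sketch below reproduces the failure in isolation. It is only an illustration: it does not show why `lastelementTOP` would take such a value inside `TCGAanalyze_SurvivalKM`, and the helper name `try_seq` is hypothetical.

```r
# Hypothetical helper: attempt 1:upper and return either the sequence length
# or the error message that R raises.
try_seq <- function(upper) {
  tryCatch(
    length(1:upper),
    error = function(e) conditionMessage(e)
  )
}

ok  <- try_seq(100)    # an ordinary bound works and returns the length
big <- try_seq(1e16)   # a bound beyond R's maximum vector length fails
inf <- try_seq(Inf)    # an infinite bound fails as well

print(ok)
print(big)
print(inf)
```

If this is what is happening, the error concerns the requested length of the sequence rather than the amount of memory currently free.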

RNA-Seq R TCGA cancer • 6.4k views

Have you additionally executed the following before the for loop:

clinical_patient_Cancer <- GDCquery_clinic("TCGA-BRCA","clinical")
dataBRCAcomplete <- log2(BRCA_rnaseqv2)

tokenStop<- 1

tabSurvKMcomplete <- NULL
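
One thing that may be worth checking after the `log2()` step: zero counts become `-Inf`, so the transformed matrix can contain non-finite values. The sketch below uses a made-up toy matrix (`toy_counts` is hypothetical, not TCGA data). Whether non-finite values have anything to do with the error in this thread is an unverified assumption.

```r
# Toy count matrix standing in for an expression matrix (hypothetical data).
toy_counts <- matrix(c(0, 8, 16,
                       4, 0, 32),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))

toy_log <- log2(toy_counts)   # log2(0) evaluates to -Inf

# Count non-finite entries (-Inf, Inf, NA, NaN) and list the affected rows.
n_nonfinite   <- sum(!is.finite(toy_log))
rows_affected <- rownames(toy_log)[rowSums(!is.finite(toy_log)) > 0]

print(n_nonfinite)
print(rows_affected)
```

The same two checks could be run on `dataBRCAcomplete` before starting the loop.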

Yes, I executed their example script as is.


Okay. How much free RAM do you have? Is it a 32- or 64-bit machine? Which R version, and which operating system and version?
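
Most of these details can be read from within R itself. A minimal sketch using base R only follows; the list name `env_info` is just for illustration. Base R has no portable call for free RAM, so that part still has to come from the operating system.

```r
# Collect basic environment details using base R only.
env_info <- list(
  r_version    = R.version.string,              # full R version string
  pointer_bits = .Machine$sizeof.pointer * 8,   # 64 on a 64-bit build of R
  os           = Sys.info()[["sysname"]],       # operating system name
  os_release   = Sys.info()[["release"]]        # operating system release
)

print(env_info)
```

Calling `sessionInfo()` additionally lists the attached packages and their versions, which is useful when reporting a problem to package developers.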


Hi Kevin, sorry for the late response -- I did not see your message. I'm running RStudio on a Mac (Sierra), R version 3.4.3, 64-bit.


Maybe 2GB of free RAM?


That may not be enough. I have 16GB RAM on my personal laptop. Can you try to reduce the size of the data and at least see if the code runs to completion?


Is there a way to split a matrix by nrows and write to n new matrices?


You could just take the first 500 rows as a test, like this:

matTest <- MyMatrix[1:500, ]
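
To answer the question about splitting more generally, a matrix can be cut into consecutive row blocks with `split()` and `lapply()`. This is a minimal sketch: `split_rows` is a hypothetical helper, and `MyMatrix` here is a made-up toy matrix rather than the TCGA data.

```r
# Hypothetical helper: split the rows of a matrix into consecutive,
# non-overlapping chunks of at most `chunk_size` rows.
split_rows <- function(mat, chunk_size) {
  idx    <- seq_len(nrow(mat))
  groups <- split(idx, ceiling(idx / chunk_size))
  lapply(groups, function(rows) mat[rows, , drop = FALSE])
}

# Toy matrix: 250 rows, 3 columns (made-up values).
MyMatrix <- matrix(seq_len(750), nrow = 250,
                   dimnames = list(paste0("gene", 1:250), c("s1", "s2", "s3")))

chunks <- split_rows(MyMatrix, 100)
print(sapply(chunks, nrow))   # number of rows in each chunk
```

Because the chunks do not overlap and the last one keeps the leftover rows, every row lands in exactly one chunk. The loop in the question instead reuses the previous stop index as the next start index and rounds the number of batches.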

This is the output from the test.

1 of  5
0.
2 of  5
97.96.95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
3 of  5
96.95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
4 of  5
100.99.98.97.96.95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.
5 of  5
95.94.93.92.91.90.89.88.87.86.85.84.83.82.81.80.79.78.77.76.75.74.73.72.71.70.69.68.67.66.65.64.63.62.61.60.59.58.57.56.55.54.53.52.51.50.49.48.47.46.45.44.43.42.41.40.39.38.37.36.35.34.33.32.31.30.29.28.27.26.25.24.23.22.21.20.19.18.17.16.15.14.13.12.11.10.9.8.7.6.5.4.3.2.1.0.

This results in empty tabSurvKM and tabSurvKMcomplete tables.


I would contact the developers of the package. In many situations, packages are not updated for new versions of R, and/or other dependency issues arise as new packages are released on Bioconductor without adequate testing. To further compound the problem, the TCGA consortium has been shifting their data around, and one finds that links on the government-hosted websites that host the data are broken.

I believe that the contact for TCGAbiolinks is Tiago Silva in São Paulo, Brazil, a city I frequently pass through.


Oh, just one thing: please try it outside RStudio (in 'regular' R). I never use RStudio because it adds an extra layer to my analyses that could cause problems.


This is a good idea. Thanks for your input -- I will update progress here.


How did it go?
