3.6 years ago
gina.abdelaal
Hello Biostars,
I am trying to analyse TCGA BRCA CNV data with GAIA, specifically for chromosome 8. I wrote a script based on this previous thread: https://www.biostars.org/p/311199/#311746. I get an error when I start the GAIA analysis and would appreciate any guidance on this.
...................................................
#load libraries
library(TCGAbiolinks)
library(gaia)
library(dplyr)
library(tidyr)
# download tcga CNV data
project <- GDCquery("TCGA-BRCA", data.category = "Copy Number Variation",
data.type = "Masked Copy Number Segment", legacy = FALSE)
GDCdownload(project)
BRCA <- GDCprepare(project)
write.table(BRCA, file="BRCA.txt") #save file
## Create marker file
url<- "https://gdc.cancer.gov/files/public/file/snp6.na35.liftoverhg38.txt.zip"
temp <- tempfile()
download.file(url = url, temp)
unzip(temp)
probes_metadata<- read.table("snp6.na35.liftoverhg38.txt", sep = "\t",as.is = TRUE)
colnames(probes_metadata) <- probes_metadata[1,] #rename columns
probes_metadata <- probes_metadata[-1,]
probes_metadata <- probes_metadata[probes_metadata[,"freqcnv"]==FALSE,] # keep only probes outside frequent CNV regions
colnames(probes_metadata)[1:4] <- c("Probe_name","Chromosome", "Start", "Strand")
head(probes_metadata)
unique(probes_metadata$Chromosome)
probes_metadata[which(probes_metadata$Chromosome=="X"),"Chromosome"] <- 23 # recode chromosomes X and Y as 23 and 24
probes_metadata[which(probes_metadata$Chromosome=="Y"),"Chromosome"] <- 24
probes_metadata$Chromosome <- sapply(probes_metadata$Chromosome, as.integer)
markerID <- apply(probes_metadata, 1, function(x) paste0(x[2], ":", x[3]))
markersMatrix <- probes_metadata[-which(duplicated(markerID)),]
markers_obj <- load_markers(markersMatrix) # marker file
## convert segment means into 1/0
synthCNV_Matrix <- cbind(BRCA,Label=NA)
synthCNV_Matrix[synthCNV_Matrix[,"Segment_Mean"] < -0.2,"Label"] <- 0
synthCNV_Matrix[synthCNV_Matrix[,"Segment_Mean"] > 0.2,"Label"] <- 1
synthCNV_Matrix <- synthCNV_Matrix[!is.na(synthCNV_Matrix$Label),]
#rearrange columns
synthCNV_Matrix<- synthCNV_Matrix[,c(7,2,3,4,5,8)]
colnames(synthCNV_Matrix)<- c("Sample.Name", "Chromosome", "Start", "End", "Num.of.Markers", "Aberration")
#Replace x and y chromosome names
xidx <- which(synthCNV_Matrix$Chromosome=="X")
yidx <- which(synthCNV_Matrix$Chromosome=="Y")
synthCNV_Matrix[xidx,"Chromosome"] <- 23
synthCNV_Matrix[yidx,"Chromosome"] <- 24
synthCNV_Matrix$Chromosome <- sapply(synthCNV_Matrix$Chromosome,as.integer)
#number of unique samples
n <- length(unique(synthCNV_Matrix$Sample.Name))
#CNV analysis with gaia
cnv_obj<- load_cnv(synthCNV_Matrix, markers_obj, n)
results.er <- runGAIA(synthCNV_Matrix, markers_obj, output_file_name="Tumor.all.txt", aberrations=-1, chromosomes=-1, num_iterations=10, threshold=0.15)
....................................................
This is the error message that I get:
cnv_obj<- load_cnv(synthCNV_Matrix, markers_obj, n)
Loading Copy Number Data
.Error in start_index:end_index : argument of length 0
I would appreciate any help on this matter.
Hi, can you please post the error messages? Thank you in advance.
Hi, I have updated my post to include the error message. Thank you.
Hi again. I had originally posted an answer, but it still seems to return an error at a later point, again in load_cnv(). Let me debug it.
Note that your final line should be:
The problem likely relates to the fact that your markers matrix is [apparently] based on the SNP 6.0 array, which will not necessarily cover the data that is being retrieved via TCGAbiolinks. It's difficult to diagnose what is happening.
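One way to test the marker-coverage explanation is to check whether every segment boundary in the CNV table has a marker at the same chromosome and position. The sketch below is base R on toy data. `check_marker_coverage` is a hypothetical helper, not a gaia function, and it assumes (not verified against the gaia source) that `load_cnv()` looks up segment boundaries by exact position. The last lines show how an empty lookup result produces the same error message.

```r
# Hypothetical helper (not part of gaia): flag CNV segments whose start or end
# position has no marker at the same chromosome and position.
check_marker_coverage <- function(cnv, markers) {
  # Coerce positions to numeric so that character columns compare consistently.
  marker_key <- paste(markers$Chromosome, as.numeric(markers$Start), sep = ":")
  start_key <- paste(cnv$Chromosome, as.numeric(cnv$Start), sep = ":")
  end_key <- paste(cnv$Chromosome, as.numeric(cnv$End), sep = ":")
  data.frame(
    Sample.Name = cnv$Sample.Name,
    Chromosome = cnv$Chromosome,
    start_missing = !(start_key %in% marker_key),
    end_missing = !(end_key %in% marker_key)
  )
}

# Toy data (assumed layout, mirroring the column names used in the script).
markers <- data.frame(
  Probe_name = c("p1", "p2", "p3"),
  Chromosome = c(8, 8, 8),
  Start = c(100, 200, 300)
)
cnv <- data.frame(
  Sample.Name = c("a", "b"),
  Chromosome = c(8, 8),
  Start = c(100, 150),   # 150 has no matching marker
  End = c(200, 300)
)

flags <- check_marker_coverage(cnv, markers)
print(flags)

# Mechanism of the error: an empty lookup fed into the `:` operator.
start_index <- which(markers$Start == 150)  # integer(0), no marker at 150
end_index <- which(markers$Start == 300)
msg <- tryCatch(start_index:end_index,
                error = function(e) conditionMessage(e))
print(msg)
```

On the objects from the question, the call would be `check_marker_coverage(synthCNV_Matrix, markersMatrix)`. Any rows flagged `TRUE` are segments whose boundaries are absent from the marker table.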
Note that if you want the BRCA CNV data, this can be retrieved in the same way as per my other thread, and then the original workflow that I wrote can also be followed.
There is another potential solution mentioned here: https://support.bioconductor.org/p/111990/#9135686