2.1 years ago
Rob
Hi friends
I am using the ...workflow to download and merge the TCGA-KIRC RNA-seq data:
https://brh.data-commons.org/dashboard/Public/notebooks/GDC_TCGA-CHOL_RNA_analysis_BRH_040722.html
However, the merging part of the code gives me an error.
Here is the chunk of code:
# Define the function to merge all RNAseq quantification files into one dataframe
merge_rna <- function(metadata, fdir){
  filelist <- list.files(fdir, pattern="*.tsv$",
                         recursive = TRUE, full.names=TRUE)
  for (i in 1:length(filelist)){
    iname <- basename(filelist[i])
    # look up the sample name for this file in the metadata
    isamplename <- metadata[metadata$file_name==iname, "sample"]
    idf <- read.csv(filelist[i], sep="\t", skip=1, header=TRUE)
    # remove first 4 rows
    remove <- 1:4
    idf_subset <- idf[-remove, c("gene_id","unstranded")]
    rm(idf)
    names(idf_subset)[2] <- isamplename
    #print(dim(idf_subset))
    if (i==1){
      combined_df <- idf_subset
      rm(idf_subset)
    } else {
      combined_df <- merge(combined_df, idf_subset, by.x='gene_id', by.y="gene_id", all=TRUE)
      rm(idf_subset)
    }
  }
  # remove certain gene ids
  combined_df <- combined_df[!(grepl("PAR_Y", combined_df$gene_id, fixed=TRUE)),]
  # modify gene_id
  combined_df$gene_id <- sapply(strsplit(combined_df$gene_id,"\\."), `[`, 1)
  # use gene_id as row names and remove gene_id column
  rownames(combined_df) <- combined_df$gene_id
  combined_df <- combined_df[,-which(names(combined_df) %in% c("gene_id"))]
  return(combined_df)
}
rnaCounts <- merge_rna(metaMatrix.RNA, "TCGA-KIRC/RNAseq") # I get the error here
rnaCounts[1:5,]
The error is:
Error in names(idf_subset)[2] <- isamplename :
replacement has length zero
It would be very helpful if anyone could help me fix this error and merge the data.
Thanks
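A note on what the message implies: "replacement has length zero" means isamplename was empty, so for at least one file no row of metadata$file_name matched basename(filelist[i]). A minimal sketch for checking this is below. find_unmatched_files is a hypothetical helper, not part of the workflow, and the file names and sample IDs are made up for the demonstration.

```r
# Hypothetical helper (not part of the original workflow): return the basenames
# of the .tsv files under fdir that have no matching row in metadata$file_name.
find_unmatched_files <- function(metadata, fdir) {
  filelist <- list.files(fdir, pattern = "\\.tsv$",
                         recursive = TRUE, full.names = TRUE)
  inames <- basename(filelist)
  inames[!(inames %in% metadata$file_name)]
}

# Toy demonstration with made-up file names and sample IDs
demo_dir <- file.path(tempdir(), "rnaseq_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("a.tsv", "b.tsv")))
demo_meta <- data.frame(file_name = "a.tsv", sample = "S1",
                        stringsAsFactors = FALSE)

# "b.tsv" is on disk but missing from the metadata
unmatched <- find_unmatched_files(demo_meta, demo_dir)
print(unmatched)

# The same lookup used in merge_rna() returns a zero-length vector for it,
# which is what makes names(idf_subset)[2] <- isamplename fail
isamplename <- demo_meta[demo_meta$file_name == "b.tsv", "sample"]
print(length(isamplename))
```

On the real data, find_unmatched_files(metaMatrix.RNA, "TCGA-KIRC/RNAseq") should list the files that trigger the error. If it returns every file, comparing head(metaMatrix.RNA$file_name) with the names on disk may show a systematic difference; that the two differ only in something like a suffix is a guess, not something the error itself confirms.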