3.7 years ago
shail.nair05
I am trying to convert featureCounts output (raw read counts of transcripts) to TPM with the tpm_rpkm.R script (https://github.com/andysaurin/tpm_rpkm), but I am getting this error: Error in apply(counts, 2, function(x) rpkm(x, lengths)) : dim(X) must have a positive length
Here is the script:
#! /usr/bin/env Rscript
# Author: Andy Saurin (--------------------)
#
# Simple RScript to calculate RPKMs and TPMs
# based on method for RPKM/TPM calculations shown in http://www.rna-seqblog.com/rpkm-fpkm-and-tpm-clearly-explained/
#
# The input file is the output of featureCounts
#
rpkm <- function(counts, lengths) {
  pm <- sum(counts) / 1e6
  rpm <- counts / pm
  rpm / (lengths / 1000)
}
tpm <- function(counts, lengths) {
  rpk <- counts / (lengths / 1000)
  coef <- sum(rpk) / 1e6
  rpk / coef
}
## read table from featureCounts output
args <- commandArgs(T)
tag <- tools::file_path_sans_ext(args[1])
cat('Reading in featureCounts data...')
ftr.cnt <- read.table(args[1], sep="\t", header=T, quote="") #Important to disable default quote behaviour or else genes with apostrophes will be taken as strings
cat(' Done\n')
if ( ncol(ftr.cnt) < 7 ) {
cat(' The input file is not the raw output of featureCounts (number of columns > 6) \n')
quit('no')
}
lengths = ftr.cnt[,6]
counts <- ftr.cnt[,7:ncol(ftr.cnt)]
cat('Performing RPKM calculations...')
rpkms <- apply(counts, 2, function(x) rpkm(x, lengths) )
ftr.rpkm <- cbind(ftr.cnt[,1:6], rpkms)
write.table(ftr.rpkm, file=paste0(tag, "_rpkm.txt"), sep="\t", row.names=FALSE, quote=FALSE)
cat(' Done.\n\tSaved as ')
cat ( paste0(tag, "_rpkm.txt", '\n') )
cat('Performing TPM calculations...')
tpms <- apply(counts, 2, function(x) tpm(x, lengths) )
ftr.tpm <- cbind(ftr.cnt[,1:6], tpms)
write.table(ftr.tpm, file=paste0(tag, "_tpm.txt"), sep="\t", row.names=FALSE, quote=FALSE)
cat(' Done.\n\tSaved as ')
cat ( paste0(tag, "_tpm.txt", '\n') )
quit('no')
**command output**
Rscript tpm_rpkm.R 450-3-hard_filtered.featureCounts
Reading in featureCounts data... Done
Performing RPKM calculations...Error in apply(counts, 2, function(x) rpkm(x, lengths)) :
dim(X) must have a positive length
halt execution
My featureCounts table looks like this:
Geneid | Chr | Start | End | Strand | Length |
1_1 | NODE_1_length_59711_cov_84.026979_g0_i0 | 116 | 904 | + | 789 | 198
1_2 | NODE_1_length_59711_cov_84.026979_g0_i0 | 1178 | 3514 | - | 2337 | 2294
1_3 | NODE_1_length_59711_cov_84.026979_g0_i0 | 3618 | 4319 | + | 702 | 502
1_4 | NODE_1_length_59711_cov_84.026979_g0_i0 | 4337 | 4921 | + | 585 | 320
1_5 | NODE_1_length_59711_cov_84.026979_g0_i0 | 4953 | 5906 | + | 954 | 799
1_6 | NODE_1_length_59711_cov_84.026979_g0_i0 | 5920 | 7056 | + | 1137 | 532
1_7 | NODE_1_length_59711_cov_84.026979_g0_i0 | 7061 | 8071 | + | 1011 | 761
1_8 | NODE_1_length_59711_cov_84.026979_g0_i0 | 8068 | 8766 | + | 699 | 188
1_9 | NODE_1_length_59711_cov_84.026979_g0_i0 | 8766 | 9656 | + | 891 | 217
1_10 | NODE_1_length_59711_cov_84.026979_g0_i0 | 9640 | 10710 | + | 1071 | 408
1_11 | NODE_1_length_59711_cov_84.026979_g0_i0 | 10692 | 11348 | + | 657 | 162
1_12 | NODE_1_length_59711_cov_84.026979_g0_i0 | 11359 | 12282 | + | 924 | 342
Does anyone know how to deal with this?
should be
That did not work. It threw an error saying:
The input file is not the raw output of featureCounts (number of columns > 6)
But I got the solution from https://stackoverflow.com/questions/66762378/error-in-applycounts-2-functionx-rpkmx-lengths-dimx-must-have-a-pos
Updating the line to counts <- ftr.cnt[,7:ncol(ftr.cnt), drop=FALSE]
worked smoothly.
Thanks
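For anyone who lands here with the same error, here is a minimal sketch of why the drop=FALSE change helps. It assumes the input has only one sample column, as in the table above. The toy data frame reuses the first three rows of that table; the column name sample.bam and the shortened contig name NODE_1 are made up for illustration, and tpm() is the same calculation as in the script.

```r
# Toy featureCounts-like table with ONE sample column.
# "sample.bam" and "NODE_1" are hypothetical names used only for this example.
ftr.cnt <- data.frame(
  Geneid     = c("1_1", "1_2", "1_3"),
  Chr        = "NODE_1",
  Start      = c(116, 1178, 3618),
  End        = c(904, 3514, 4319),
  Strand     = c("+", "-", "+"),
  Length     = c(789, 2337, 702),
  sample.bam = c(198, 2294, 502)
)

# Same TPM calculation as in the script
tpm <- function(counts, lengths) {
  rpk <- counts / (lengths / 1000)
  rpk / (sum(rpk) / 1e6)
}

# Selecting a single column from a data frame returns a plain vector by default.
# A vector has no dim attribute, which is what apply() complains about.
dropped <- ftr.cnt[, 7:ncol(ftr.cnt)]
is.null(dim(dropped))   # TRUE

# drop = FALSE keeps a one-column data frame, so apply() can loop over columns.
counts <- ftr.cnt[, 7:ncol(ftr.cnt), drop = FALSE]
dim(counts)             # 3 rows, 1 column

tpms <- apply(counts, 2, function(x) tpm(x, ftr.cnt$Length))
colSums(tpms)           # TPM values in each sample sum to about 1e6
```

With two or more sample columns the original line already returns a data frame, which is probably why the script works unmodified on multi-sample featureCounts output.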
Posting the same question on multiple forums is bad etiquette. Even if you do cross-posts, it is common decency to mention your cross-posts so people don't invest effort when you've already been helped elsewhere.
Sorry about that. I will keep it in mind next time. Thanks.