I want to convert counts to TPM, similar to the post "Calculating TPM Values". However, my question is: can we use the width column from the GTF file as the gene length for the TPM calculation?
library("GenomicFeatures")
gtf_txdb <- makeTxDbFromGFF("gencode.v33.annotation.gtf.gz")
gene_list <- genes(gtf_txdb)
gene_list <- as.data.frame(gene_list)
gene_list[1:3, 1:4]
                   seqnames     start       end width
ENSG00000000003.15     chrX 100627108 100639991 12884
ENSG00000000005.6      chrX 100584936 100599885 14950
ENSG00000000419.12    chr20  50934867  50958555 23689
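A quick check on the three rows printed above suggests what the width column actually measures: it matches end - start + 1, i.e. the full genomic span of the gene, introns included. The sketch below is base R only and simply retypes the printed values into a data frame, so it does not depend on the GTF file.

```r
# Rebuild the three printed rows by hand (values copied from the output above).
gene_list <- data.frame(
  seqnames = c("chrX", "chrX", "chr20"),
  start    = c(100627108, 100584936, 50934867),
  end      = c(100639991, 100599885, 50958555),
  width    = c(12884, 14950, 23689),
  row.names = c("ENSG00000000003.15", "ENSG00000000005.6", "ENSG00000000419.12")
)

# Genomic span of each gene, counting both endpoints.
span <- gene_list$end - gene_list$start + 1

# Compare the span with the reported width column.
all(span == gene_list$width)
```

If that holds for the rest of the table as well, width is the start-to-end span rather than the length of the spliced transcript, which matters for RNA-seq because reads come mostly from exons.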
Pay attention to the function name:
and not
for version 1.40 and older of GenomicFeatures, at least.
That's the sum of all the exons for that gene in your GTF. What if, in your sample, all the reads come from a transcript that is half that length?
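To make the effect of the length choice concrete, here is a minimal base R sketch of the usual TPM arithmetic (counts divided by length in kb, then scaled so the sample sums to one million). The function name counts_to_tpm, the gene names, the counts and the lengths are all made up for illustration; none of them come from the thread. In practice the exonic lengths would probably come from something like exonsBy(gtf_txdb, by = "gene") followed by reduce() and a sum of widths, but that step is only mentioned in a comment here so the example runs without the GTF.

```r
# Hypothetical helper: convert raw counts to TPM given one length (bp) per gene.
counts_to_tpm <- function(counts, lengths_bp) {
  stopifnot(length(counts) == length(lengths_bp), all(lengths_bp > 0))
  rate <- counts / (lengths_bp / 1000)  # reads per kilobase
  rate / sum(rate) * 1e6                # scale so the sample sums to 1e6
}

# Toy data (assumed values): two genes with identical counts.
counts <- c(geneA = 100, geneB = 100)

# geneA has long introns, so its genomic span is 5x its exonic length.
span_bp   <- c(geneA = 10000, geneB = 2000)  # start-to-end span, like 'width'
exonic_bp <- c(geneA = 2000,  geneB = 2000)  # e.g. sum of merged exon widths

tpm_span   <- counts_to_tpm(counts, span_bp)
tpm_exonic <- counts_to_tpm(counts, exonic_bp)

# Side-by-side comparison of the two length choices.
round(cbind(tpm_span, tpm_exonic))
```

With these toy numbers the two genes get equal TPM when exonic lengths are used, while the span-based lengths push geneA well below geneB purely because of its introns. The same reasoning applies to the point above: if the expressed transcript is shorter than the length you plug in, the gene's TPM is underestimated.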