Hi. To calculate Transcripts Per Million (TPM) from TCGA HTSeq-count data, I need gene lengths. I used GENCODE v22 for annotation, which has several length-related columns for each gene. Here is one record from the annotation file as an example:
feature start end score strand frame gene_id gene_name
gene 32818018 32897826 . + . ENSG00000206557.5 TRIM71
full_length exon_length exon_num first_exon last_exon
79809 8685 4 ENSE00001538095.1 ENSE00001498538.5
one_transcript one_transcript_start one_transcript_end
ENST00000383763.5 32818018 32897826
As you can see, each Ensembl gene has both a full_length and an exon_length. For the TPM calculation I need a single 'gene length'. Please advise: which of these lengths should I use for TPM?
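For context, here is a minimal sketch of the calculation I have in mind, so it is clear where the length enters. The function name counts_to_tpm, the second gene GENE_B, and all count values are made up for illustration; only the TRIM71 lengths come from the record above. I pass exon_length here purely as a placeholder, since which length is correct is exactly my question.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM, given one length (in bp) per gene.

    counts     -- dict mapping gene_id to raw read count
    lengths_bp -- dict mapping gene_id to gene length in base pairs
    """
    if counts.keys() != lengths_bp.keys():
        raise ValueError("counts and lengths must cover the same genes")

    # Step 1: reads per kilobase (RPK) = count / (length in kb)
    rpk = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}

    # Step 2: per-sample scaling factor = sum of all RPK values / 1e6
    scale = sum(rpk.values()) / 1e6
    if scale == 0:
        return {g: 0.0 for g in counts}

    # Step 3: TPM = RPK / scaling factor, so TPM values sum to one million
    return {g: value / scale for g, value in rpk.items()}


# Hypothetical counts; GENE_B is an invented second gene.
counts = {"ENSG00000206557.5": 100, "GENE_B": 200}

# Two candidate lengths for TRIM71 from the annotation record above.
exon_lengths = {"ENSG00000206557.5": 8685, "GENE_B": 2000}
full_lengths = {"ENSG00000206557.5": 79809, "GENE_B": 2000}

tpm_exon = counts_to_tpm(counts, exon_lengths)
tpm_full = counts_to_tpm(counts, full_lengths)
print(tpm_exon)
print(tpm_full)
```

With the same counts, the two length choices give noticeably different TPM values for the same gene, which is why I want to be sure I pick the right one.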
Prior threads for consideration:
Calculating TPM Values
Calculating TPM from featureCounts output