I noticed that the average length of all mRNA transcripts in the human (GRCh38) annotation comes out as 90043.2 bp, which seems unusually high. As far as I am aware, the average mRNA length should be about 3,000-4,000 bp.
Is there an explanation for this?
I calculated the average length from all records in the GRCh38 RefSeq annotation whose column 3 (feature type) is “mRNA”, taking each record's length as its end coordinate (column 5) minus its start coordinate (column 4). Below is the command that I used.
awk -v FS="\t" '$1!~"^#" && $3=="mRNA"{sum+=$5-$4;n++}END{print sum/n}' GCF_000001405.40_GRCh38.p14_genomic.gff
The “awk” that I’m using is mawk 1.3.4.
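To make explicit what the command measures, here is a minimal sketch that runs the same awk program on a small made-up GFF. The temporary file, the `toy` source name, and all coordinates are hypothetical and are not taken from the RefSeq file. It only illustrates that each mRNA line contributes its end coordinate minus its start coordinate, and that comment lines and non-mRNA features are skipped.

```shell
# Build a tiny tab-separated GFF (hypothetical records, for illustration only).
toy=$(mktemp)
printf '##gff-version 3\n' > "$toy"
printf 'chr1\ttoy\tgene\t1000\t5000\t.\t+\t.\tID=gene1\n' >> "$toy"
printf 'chr1\ttoy\tmRNA\t1000\t5000\t.\t+\t.\tID=rna1;Parent=gene1\n' >> "$toy"
printf 'chr1\ttoy\tmRNA\t10000\t12000\t.\t+\t.\tID=rna2;Parent=gene2\n' >> "$toy"

# Same program as in the question: for every non-comment line whose
# column 3 is "mRNA", add (column 5 - column 4), then print the mean.
avg=$(awk -v FS="\t" '$1!~"^#" && $3=="mRNA"{sum+=$5-$4;n++}END{print sum/n}' "$toy")

# (5000-1000) + (12000-10000) = 6000 over 2 records
echo "$avg"   # prints 3000
```

So the number I reported is the mean of the column 5 minus column 4 values over all mRNA records.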