7.6 years ago
wangshx
I retrieved the CDS regions from a GTF file (Ensembl v75). When I checked the lengths of the CDS regions, I was puzzled by the following result:
library(data.table)
annotation <- fread('Homo_sapiens.GRCh37.75.gtf')
# V3 = feature type, V4 = start, V5 = end
cds <- annotation[annotation$V3 == "CDS", ]
summary(cds$V5 - cds$V4)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.0    77.0   115.0   149.2   162.0 21692.0
More than 100 CDS regions have a length of 0. Why? Please help me if you know, thanks a lot.
It is new to me that 1-base-long microexons exist.
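For anyone landing here later, a minimal sketch of why a difference of 0 corresponds to a 1-base feature: GTF start and end coordinates are 1-based and inclusive at both ends, so the length is end - start + 1 rather than end - start. The toy table below is made up for illustration (it is not taken from the Ensembl file) and only reuses the V3/V4/V5 column names from the question.

```r
# Hypothetical stand-in for the GTF table; the rows are invented for illustration.
toy <- data.frame(
  V3 = c("CDS", "CDS", "exon"),  # feature type
  V4 = c(100L, 250L, 90L),       # start (1-based, inclusive)
  V5 = c(100L, 327L, 400L)       # end (1-based, inclusive)
)

cds <- toy[toy$V3 == "CDS", ]

# Plain difference, as in the question: a 1-base CDS shows up as 0.
cds$diff <- cds$V5 - cds$V4

# Inclusive length: both endpoints count, so add 1.
cds$len <- cds$V5 - cds$V4 + 1L

print(cds)
```

Under this convention the summary in the question would shift up by 1 across the board, and the rows with a difference of 0 would be CDS segments that are a single base long.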