I am using Cufflinks to analyze an experiment with ~135 million reads. The FPKM values range from 0 to 100000. I did not use the -N option, which can sometimes produce inflated FPKM values, so I investigated some of the very large ones.
They are generally associated with non-protein-coding genes and have a length of approx. 100. The number of alignments covering these regions is approx. 3000.
Using the RPKM formula, I cannot make sense of the large FPKM values.
Any explanation? Are these artifacts? If so, how can they be detected and filtered out?
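For reference, here is a rough sketch of the plain RPKM arithmetic with the numbers above. It assumes that all ~135 million reads were mapped, that the length of 100 is in bp, and that each of the ~3000 alignments counts as one read; the function name is just for illustration.

```python
def rpkm(read_count, total_mapped_reads, length_bp):
    """Classic RPKM: reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (total_mapped_reads * length_bp)

# Numbers from the post; treating all 135 million reads as mapped is an assumption.
naive = rpkm(read_count=3000, total_mapped_reads=135e6, length_bp=100)
print(round(naive, 1))  # 222.2
```

So the plain formula gives a value in the low hundreds, far below the largest FPKM values reported.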
Have you tried writing to the authors (i.e. Cole)? They are typically responsive. Of course, if they respond, we'd love to see the answer.
If you're generally trying to quantify the abundance of a number of short transcripts, you might also try passing cufflinks the --no-effective-length-correction flag.
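To illustrate why that flag may matter for short transcripts, here is a simplified sketch of an effective length calculation. This is not Cufflinks' actual code. It assumes the effective length is the transcript length averaged over the positions where a fragment of each size could start, weighted by a discretised normal fragment length distribution; the function name and the mean of 200 and standard deviation of 80 are illustrative assumptions.

```python
import math

def effective_length(length_bp, frag_mean=200.0, frag_sd=80.0, max_frag=1000):
    """Simplified effective length: sum over fragment sizes i <= length_bp of
    P(i) * (length_bp - i + 1), with P a discretised normal on 1..max_frag."""
    weights = [math.exp(-0.5 * ((i - frag_mean) / frag_sd) ** 2)
               for i in range(1, max_frag + 1)]
    total = sum(weights)
    return sum(w / total * (length_bp - i + 1)
               for i, w in enumerate(weights, start=1)
               if i <= length_bp)

short_eff = effective_length(100)    # much smaller than the raw length of 100
long_eff = effective_length(5000)    # close to the raw length minus the mean fragment size

# Dividing by the effective length instead of the raw length scales the
# abundance estimate by roughly this factor.
print(100 / short_eff, 5000 / long_eff)
```

Under these assumptions a 100 bp transcript ends up with an effective length of only a few bases, so the same read count is divided by a much smaller number, while a long transcript is barely affected.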