I just completed a transcriptome assembly (~50 million paired-end 100 bp reads) of a Xenopus species without a reference genome.
I followed the Velvet + Oases multi-k-mer approach with a minimum coverage of 5 and merged the transcripts obtained from the individual k-mer assemblies (k = 31 to 75 with a step of 6). In the merged set I got around 200K loci with around 700K total transcripts.
I was just wondering whether these numbers are in the generally expected range, or are they high?
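For anyone who wants to reproduce these counts, here is a minimal Python sketch. It assumes the merged FASTA uses the usual Oases header layout (`>Locus_<n>_Transcript_<i>/<total>_Confidence_<x>_Length_<len>`); if your headers differ, the regular expression needs adjusting. The function name is just illustrative. Note that stepping from 31 by 6 gives eight k values and stops at 73, since the next value (79) would exceed 75.

```python
import re
from collections import defaultdict

# k-mer sizes for the multi-k run: start at 31, step 6, do not exceed 75
kmers = list(range(31, 76, 6))  # 31, 37, 43, 49, 55, 61, 67, 73

# Assumed Oases header layout, e.g.
# >Locus_12_Transcript_3/7_Confidence_0.571_Length_1045
HEADER = re.compile(r"^>Locus_(\d+)_Transcript_(\d+)/(\d+)")


def count_loci_and_transcripts(lines):
    """Return (number of loci, number of transcripts, mean transcripts per locus)."""
    per_locus = defaultdict(int)
    for line in lines:
        match = HEADER.match(line)
        if match:
            # Group 1 is the locus ID; each matching header is one transcript
            per_locus[match.group(1)] += 1
    n_loci = len(per_locus)
    n_transcripts = sum(per_locus.values())
    mean = n_transcripts / n_loci if n_loci else 0.0
    return n_loci, n_transcripts, mean
```

It can be fed an open file handle, e.g. `count_loci_and_transcripts(open("transcripts.fa"))`, where the path is a placeholder for the merged output. With the numbers above (about 700K transcripts over 200K loci) the mean works out to roughly 3.5 transcripts per locus.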
@Istvan: Thank you; I will look over the numbers from related species.
I used CD-HIT-EST on the merged assembly at -c 0.9; it reduced the number of transcripts to ~250K.
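In case it helps others, this is roughly how the clustering step could be scripted. It is a sketch only: the file names are placeholders, and the word size `-n 8` is my assumption (the CD-HIT guide suggests 8 or 9 for identity thresholds between 0.90 and 0.95), not something stated above.

```python
def cdhit_est_command(fasta_in, fasta_out, identity=0.9, word_size=8):
    """Build the cd-hit-est argument list (word_size=8 is an assumed setting)."""
    return [
        "cd-hit-est",
        "-i", fasta_in,        # input FASTA (merged assembly)
        "-o", fasta_out,       # output FASTA of cluster representatives
        "-c", str(identity),   # sequence identity threshold
        "-n", str(word_size),  # word length
    ]


def count_fasta_records(lines):
    """Count FASTA records by counting header lines."""
    return sum(1 for line in lines if line.startswith(">"))


# Placeholder file names; print the command instead of running it
cmd = cdhit_est_command("merged_transcripts.fa", "merged_cdhit90.fa")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once `cd-hit-est` is on the PATH, and `count_fasta_records` can be run on the input and output files to confirm the reduction (here from about 700K to about 250K sequences).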