Question regarding BUSCO full_table.tsv report
4.5 years ago

Hi there. For context, I built a transcriptome for my species: I first mapped my reads to the genome and then used StringTie to get the transcript sequences. I evaluated it with BUSCO (eudicots_odb10 dataset), using both the full transcriptome and a set of sequences containing only the longest transcript from each gene. I noticed, however, that when I used only the longest transcripts, the percentage of fragmented BUSCOs increased a lot:

Dataset                                     Lineage         Complete  Single  Duplicated  Fragmented  Missing  Total
StringTie transcriptome                     eudicots_odb10  97.80%    26.20%  71.60%      0.60%       1.60%    2326 (100%)
StringTie transcriptome longest transcript  eudicots_odb10  83.10%    80.00%  3.10%       7.40%       9.50%    2326 (100%)

I looked at the full_table.tsv files to understand why this happens, and I noticed that in some cases the value in the Length column is smaller in the longest-transcript dataset, which is counterintuitive at first. I'm also having trouble understanding where these length values come from, and the BUSCO documentation did not help.
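
One way to pin down the affected cases is to line up the two full_table.tsv files per BUSCO ID and list the entries whose status or length differs. Below is a minimal Python sketch of that comparison. It assumes the table is tab-separated, that the column names sit on a comment line starting with "# Busco id", and that the columns are named Status, Sequence and Length; these names, and the function names read_full_table and compare_lengths, are assumptions for illustration and should be checked against the header of the actual files.

```python
import csv
from collections import defaultdict


def read_full_table(lines):
    """Parse full_table.tsv lines into {busco_id: [(status, sequence, length), ...]}.

    Assumed layout: tab-separated, column names on a comment line
    that starts with "# Busco id". Rows for missing BUSCOs may carry
    only the ID and the status, so the length can be None.
    """
    header = None
    hits = defaultdict(list)
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:
            continue
        if row[0].startswith("#"):
            if row[0].lstrip("# ").lower() == "busco id":
                header = ["Busco id"] + [col.strip() for col in row[1:]]
            continue
        if header is None:
            raise ValueError("no '# Busco id' header line found")
        rec = dict(zip(header, row))
        length = rec.get("Length")
        hits[rec["Busco id"]].append(
            (rec["Status"], rec.get("Sequence"), int(length) if length else None)
        )
    return hits


def best_hit(hits):
    """Pick the hit with the largest reported length (None counts as 0)."""
    return max(hits, key=lambda hit: hit[2] or 0)


def compare_lengths(all_tx, longest_tx):
    """List BUSCOs whose status or best length differs between the two runs."""
    changed = []
    for busco_id in sorted(all_tx.keys() & longest_tx.keys()):
        status_a, _, len_a = best_hit(all_tx[busco_id])
        status_b, _, len_b = best_hit(longest_tx[busco_id])
        if (status_a, len_a) != (status_b, len_b):
            changed.append((busco_id, status_a, len_a, status_b, len_b))
    return changed
```

To use it, open each file and pass the file object in, for example `read_full_table(open("run_all/full_table.tsv"))` and `read_full_table(open("run_longest/full_table.tsv"))` (both paths are placeholders), then call `compare_lengths` on the two results. Looking up the Sequence column for the listed IDs shows which transcript BUSCO matched in each run, and therefore whether the longest-transcript filter removed the transcript that gave the complete hit.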

Thanks in advance!

BUSCO RNA-Seq • 898 views