Hi community,
I used tr2aacds.pl from the EvidentialGene workflow to remove redundant transcripts from a Trinity de novo assembly.
It generated several files in the okayset directory (okalt.aa, okalt.cds, okalt.fasta, okay.aa, okay.cds, okay.fasta). From some journal articles I read, you need to use both the primary and the alternative sequences (I assume these are the okay.cds and okalt.cds files).
I concatenated okay.cds and okalt.cds and mapped the reads to the result. However, the mapping rate dropped dramatically, from 80% to 35%.
Then I did the same with okay.fasta and okalt.fasta, and this time the mapping rate was around 77%.
I am not sure which set to use, so I would like to understand the relationship between the cds files and the fasta files generated by tr2aacds.pl.
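One rough way to look at how the two file types relate is to total the bases in each and see how much sequence the cds set covers compared with the fasta set. The sketch below is only an illustration, not part of EvidentialGene: `read_fasta_lengths` and `compare_lengths` are hypothetical helper names, and it assumes plain FASTA input and that a record in the .cds file has the same ID (first word of the header) as its record in the .fasta file.

```python
def read_fasta_lengths(path):
    """Return {sequence_id: sequence_length} for a plain FASTA file."""
    lengths = {}
    current = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Assumption: the ID is the first whitespace-separated token.
                current = line[1:].split()[0]
                lengths[current] = 0
            elif current is not None:
                lengths[current] += len(line)
    return lengths


def compare_lengths(cds_path, transcript_path):
    """Compare total bases for IDs present in both files."""
    cds = read_fasta_lengths(cds_path)
    transcripts = read_fasta_lengths(transcript_path)
    shared = set(cds) & set(transcripts)
    cds_total = sum(cds[seq_id] for seq_id in shared)
    transcript_total = sum(transcripts[seq_id] for seq_id in shared)
    return {
        "shared_ids": len(shared),
        "cds_bases": cds_total,
        "transcript_bases": transcript_total,
        "cds_fraction": cds_total / transcript_total if transcript_total else 0.0,
    }


# Example call with the okayset file names from this post:
# print(compare_lengths("okay.cds", "okay.fasta"))
```

If `cds_fraction` comes out well below 1, a large share of each transcript lies outside the coding region, and reads from those parts would have nothing to map to in the cds files.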
Thank you,
xp
How did you concatenate your okay.cds files? I have eight samples, which I assembled individually using Trinity. I then ran EvidentialGene on each assembly separately, which gave me eight okay.cds files. Now I want to concatenate them so that I can run BUSCO on the combined set.
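For what it is worth, here is a minimal sketch of one way to join several okay.cds files. It is not an EvidentialGene or BUSCO feature. It assumes plain FASTA input and prefixes every record ID with a sample name so that IDs from different assemblies cannot collide. The function name `concat_with_prefix`, the sample names and the paths in the comment are all hypothetical.

```python
def concat_with_prefix(inputs, out_path):
    """Concatenate FASTA files, prefixing each record ID with its sample name.

    inputs: list of (sample_name, fasta_path) tuples.
    Returns the number of records written.
    """
    seen_ids = set()
    n_records = 0
    with open(out_path, "w") as out:
        for sample, path in inputs:
            with open(path) as handle:
                for line in handle:
                    line = line.rstrip("\r\n")
                    if not line:
                        continue
                    if line.startswith(">"):
                        # Keep the rest of the header, only prefix the ID.
                        new_header = f"{sample}_{line[1:]}"
                        new_id = new_header.split()[0]
                        if new_id in seen_ids:
                            raise ValueError(f"duplicate sequence ID: {new_id}")
                        seen_ids.add(new_id)
                        out.write(">" + new_header + "\n")
                        n_records += 1
                    else:
                        # Rewriting each line guarantees a trailing newline,
                        # even if an input file does not end with one.
                        out.write(line + "\n")
    return n_records


# Hypothetical usage (sample names and paths are placeholders):
# concat_with_prefix(
#     [("sample1", "sample1/okayset/okay.cds"),
#      ("sample2", "sample2/okayset/okay.cds")],
#     "all_samples.okay.cds",
# )
```

Keep in mind that a plain concatenation keeps the same gene from every sample as a separate record, so the proportion of duplicated BUSCOs would be expected to rise in the combined file.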