Dear all,
I am using Oases to assemble 15 paired-end RNA-Seq datasets (5 tissues with 3 replicates each, from a species with no reference genome) to generate:
a) a transcriptome for each sample, which means 15 xx.fa files,
b) a reference transcriptome based on all 15 xx.fa files from step a).
Let the tissue types be A, B, C, D, E and the replicate IDs be 1, 2, 3. I used the following two strategies to get the reference transcriptome:
1) Use multiple k-mers (31, 41, 51, 61) to generate a transcriptome for each sample (15 transcripts.fa files), then assemble these 15 transcripts.fa files into a reference transcriptome with a single k-mer (51). (In the later step I chose a single k-mer because multiple k-mers did not work.)
2) Use a single k-mer (51) to generate a transcriptome for each sample (15 transcripts.fa files), then assemble these 15 transcripts.fa files into a reference transcriptome with a single k-mer (51).
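To make the two strategies concrete, here is a rough sketch in Python that only builds the command lines and does not run anything. The velveth/velvetg/oases options are the ones I understand from the Velvet and Oases manuals, so please check them against your installed versions. The file names (A1.fq and so on), the output directory names, the interleaved FASTQ input, and the insert length of 200 are all placeholders I made up for illustration.

```python
from typing import List

# Sample labels A1..E3: 5 tissues x 3 replicates = 15 samples.
SAMPLES = [f"{tissue}{rep}" for tissue in "ABCDE" for rep in (1, 2, 3)]


def per_sample_commands(sample: str, kmers: List[int], reads: str,
                        ins_length: int) -> List[List[str]]:
    """Build velveth/velvetg/oases commands for one sample, one run per k.

    `reads` is assumed to be an interleaved paired-end FASTQ file and
    `ins_length` is a placeholder value; both are assumptions.
    """
    cmds = []
    for k in kmers:
        out_dir = f"{sample}_k{k}"  # hypothetical directory naming
        cmds.append(["velveth", out_dir, str(k), "-shortPaired", "-fastq", reads])
        cmds.append(["velvetg", out_dir, "-read_trkg", "yes"])
        cmds.append(["oases", out_dir, "-ins_length", str(ins_length)])
    return cmds


def merge_commands(out_dir: str, merge_k: int,
                   transcript_files: List[str]) -> List[List[str]]:
    """Build the commands that merge several transcripts.fa files at one k."""
    return [
        ["velveth", out_dir, str(merge_k), "-long", *transcript_files],
        ["velvetg", out_dir, "-read_trkg", "yes", "-conserveLong", "yes"],
        ["oases", out_dir, "-merge", "yes"],
    ]


if __name__ == "__main__":
    # Strategy 1 uses four k-mers per sample; strategy 2 would pass [51].
    for cmd in per_sample_commands("A1", [31, 41, 51, 61], "A1.fq", 200):
        print(" ".join(cmd))
    # Final step (both strategies): merge the 15 per-sample files at k=51.
    per_sample_fa = [f"{s}/transcripts.fa" for s in SAMPLES]  # hypothetical paths
    for cmd in merge_commands("reference", 51, per_sample_fa):
        print(" ".join(cmd))
```

The only difference between the two strategies in this sketch is the k-mer list passed to per_sample_commands; the final merge at k=51 is the same. In strategy 1 the per-k assemblies of a sample would first need to be combined into one transcripts.fa per sample, which is not shown here.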
In our preliminary analysis, the reference transcriptome generated by strategy 1) seems much better than the one from strategy 2). I am not sure whether strategy 1) is the right way to do this. Could anyone give me some suggestions? Many thanks.
Thanks, Mark.
We chose Oases because it is faster and more computationally efficient. We also tried Trinity, but since Oases supports multiple k-mers, which is better than a single k-mer both in theory and in practice, we are continuing to use Oases.
Assembling all 15 paired-end datasets at the same time requires a lot of memory, which is why we use the current strategies.