Question

Oases Running Problem

13.2 years ago

Shaojiang Cai ▴ 100

Friends,

I am trying to run Oases for transcriptome assembly. The result is far from what I expected, so I would like to ask whether I am running it correctly. Thanks.

Here is my running command:

python scripts/oases_pipeline.py -m 25 -M 29 -o output -d " -strand_specific -shortPaired data/reads.fa" -p " -min_trans_lgth 100 -ins_length 300"

My library is strand-specific and paired-end, with 67 bp reads. The reads are shuffled (interleaved) as:

>0(left_mate_forwarded)
ACTC...
>1(right_mate_reverse_complemented)
TATA...
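For reference, the shuffled layout above (left mate immediately followed by its right mate) can be produced with a short script. This is only a minimal sketch: the helper names `parse_fasta` and `interleave` are hypothetical, it assumes both mate files list the reads in the same order, and it does not reverse-complement anything.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def interleave(left, right):
    """Return FASTA text with each left mate followed by its right mate.

    Assumes (hypothetically) that left[i] and right[i] are mates.
    """
    if len(left) != len(right):
        raise ValueError("left and right mate counts differ")
    out = []
    for (left_header, left_seq), (right_header, right_seq) in zip(left, right):
        out.append(f">{left_header}\n{left_seq}")
        out.append(f">{right_header}\n{right_seq}")
    return "\n".join(out) + "\n"
```

Reading the two mate files with `open(...).read()`, passing each through `parse_fasta`, and writing the result of `interleave` to a single file would give one input suitable for a `-shortPaired` style option.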

I got some transcripts, but they are far from the annotated transcripts and also far from the Trinity result. The longest contig from Oases is ~2500 bp (vs. ~10000 bp from Cufflinks and ~6000 bp from Trinity). The N50 value is also low. Oases reports only 20 contigs that cover the full length of a Cufflinks transcript (~4000 in total), while Trinity reports ~650.
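To make the N50 comparison concrete, here is a minimal sketch of how N50 is usually computed from a list of contig lengths: the length of the contig at which the running total, counted from the longest contig down, first reaches half of the total assembled bases. The function name `n50` is hypothetical, and the lengths in the usage line are made-up illustration values, not results from this dataset.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Illustration only: toy lengths, not real assembly output.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Computing this on the contig lengths from each assembler, with the same minimum-length cutoff, keeps the comparison between Oases and Trinity on equal footing.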

The dataset I am using is a subset of S. pombe. Does it matter?

Could somebody help me figure out whether something is wrong here? Thanks.

transcriptome assembly velvet rna-seq • 2.7k views


You didn't specify "-fasta", so if it was expecting fastq you'll only be using 50% of your reads. Did you reverse-complement the right mate? If it's from Illumina, just leave it as-is. You are only trying 3 k-values, and they look pretty low. If your reads are 100bp PE and you have enough of them, I'd try higher k-values.
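A quick way to plan a wider sweep is to list the candidate k-values first. The sketch below is an assumption-laden helper, not part of Oases or Velvet: the name `candidate_k_values` is hypothetical, it treats the upper bound as inclusive (which may not match how oases_pipeline.py interprets -M), it keeps only odd values because Velvet works with odd k, and it drops any k that is not shorter than the read length.

```python
def candidate_k_values(k_min, k_max, read_length, step=2):
    """List odd k-values from k_min to k_max (inclusive) below read_length.

    step should be even so that every value stays odd.
    """
    if step <= 0 or step % 2 != 0:
        raise ValueError("step must be a positive even number")
    if k_min % 2 == 0:
        # Bump an even starting value to the next odd one.
        k_min += 1
    return [k for k in range(k_min, k_max + 1, step) if k < read_length]


# The range from the question, with the 67 bp reads described there.
print(candidate_k_values(25, 29, 67))
# A wider, coarser sweep over the same read length.
print(candidate_k_values(21, 71, 67, step=10))
```

Note that Velvet's maximum k is fixed at compile time (the MAXKMERLENGTH build setting), so a sweep above the default limit needs a rebuilt binary.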

ADD REPLY • link 12.8 years ago by Torst ▴ 980