Hello all:
I am working on an RNA-Seq project on a non-model organism. No genome is available for this organism, but the mitochondrial genome has been sequenced and published.
I want to remove all mt sequences from my original fastq files, so I am going to align my reads (100 bp paired-end) to the mt genome. I want to use Bowtie 2, but I found that Bowtie 2 is designed for "quickly aligning large sets of short DNA sequences (reads) to large genomes".
The animal mt genome is relatively small, so I am wondering whether Bowtie 2 will still work. I want to use it because one of its output options is unmapped reads, which is exactly what I want.
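For what it's worth, here is a rough sketch of what the Bowtie 2 route could look like, written as a small Python helper that only assembles the two command lines (it does not run them). All file names, the index prefix, and the thread count are placeholders I made up. The flags themselves (`-x`, `-1`, `-2`, `-p`, `-S`, `--un-conc`) are standard Bowtie 2 options. Note that `--un-conc` keeps every pair that did not align concordantly, so pairs where only one mate hits the mt genome would still end up in the output; whether that is strict enough is a judgment call.

```python
import shlex


def bowtie2_mt_filter_commands(mt_fasta, reads_1, reads_2,
                               index_prefix="mt_index",
                               unmapped_pattern="non_mt_R%.fastq",
                               sam_out="mt_aligned.sam",
                               threads=4):
    """Build (but do not run) the index and alignment commands.

    All default names here are hypothetical placeholders.
    """
    # Step 1: index the mitochondrial reference.
    build = ["bowtie2-build", mt_fasta, index_prefix]
    # Step 2: align the pairs; --un-conc writes pairs that fail to align
    # concordantly, with '%' replaced by 1 or 2 for the two mate files.
    align = [
        "bowtie2",
        "-p", str(threads),
        "-x", index_prefix,
        "-1", reads_1,
        "-2", reads_2,
        "--un-conc", unmapped_pattern,
        "-S", sam_out,
    ]
    return build, align


build_cmd, align_cmd = bowtie2_mt_filter_commands(
    "mt_genome.fa", "reads_1.fastq", "reads_2.fastq")
print(shlex.join(build_cmd))
print(shlex.join(align_cmd))
```

The printed lines can be pasted into a shell, or the lists can be handed to `subprocess.run` once the paths are real.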
I totally missed the RNAseq part of this :(
One caveat about STAR is that you have to tweak the settings to map against small genomes. I think this is mentioned in the documentation, but I saw a thread about this on SEQanswers this week, so apparently it isn't documented well enough.
Generally, --genomeSAindexNbases needs to be scaled with the genome length, as ~ min(14, log2(ReferenceLength)/2 - 1), at the genome index generation step.
A stupid question, but is there any splicing in mitochondria? I ask since they're more or less still prokaryotes and the annotations I know don't include any junctions.
I happened to look this up and at least in some organisms the answer appears to be "yes, there's splicing".
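Coming back to the --genomeSAindexNbases rule quoted earlier in the thread, here is a quick sketch of the arithmetic in Python. The function name is made up, and rounding to the nearest integer is my assumption (the formula itself yields a fractional value, and the option takes an integer). The 16,569 bp length is just an illustrative mt genome size, not the organism from the original question.

```python
import math


def star_sa_index_nbases(reference_length):
    """Evaluate min(14, log2(ReferenceLength)/2 - 1) for a reference length in bases.

    Rounding to the nearest integer is an assumption, not part of the quoted formula.
    """
    if reference_length < 1:
        raise ValueError("reference_length must be a positive number of bases")
    scaled = math.log2(reference_length) / 2 - 1
    return min(14, round(scaled))


# Illustrative lengths: a typical animal mt genome, a 1 Mb genome, a 3 Gb genome.
for length in (16_569, 1_000_000, 3_000_000_000):
    print(length, star_sa_index_nbases(length))
```

The result would then be passed as `--genomeSAindexNbases` when running STAR with `--runMode genomeGenerate`. For a mt genome of around 16.5 kb it comes out far below the cap of 14, which is why the default does not suit such a small reference.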