Question

STAR or Bowtie for small RNA seq?

1

6.2 years ago

max_19 ▴ 170

Hi all!

I am analyzing some small RNA-seq data. To map the reads to the genome, I'm wondering whether STAR or bowtie would be a better fit for my data.

My reads are between 15 and 30 bp in length.

Many thanks for your suggestions.

RNA-Seq mapping reads sequencing • 13k views

ADD COMMENT • link updated 6.2 years ago by Bastien Hervé 6.2k • written 6.2 years ago by max_19 ▴ 170

0

How did you get reads that are only 15-30 bp long? Those will be hard to align properly.

ADD REPLY • link 6.2 years ago by Bastien Hervé 6.2k

2

This is quite expected in small RNA-seq data after trimming sequencing adaptors and filtering low-quality reads.

ADD REPLY • link 6.2 years ago by Emilio Marmol ▴ 180

0

As I said in my answer below, I missed this part when reading the post :) Monday morning pleasures

ADD REPLY • link 6.2 years ago by Bastien Hervé 6.2k

0

Or NovoAlign? Should we also consider it in the comparison? If not, why not?

ADD REPLY • link 4.9 years ago by Ömer An ▴ 270

score 3 · Answer 1 · 2019-01-27

3

6.2 years ago

mbelmadani ★ 1.4k

I don't think Bowtie(2) is splice-aware. Since this is RNA-Seq, you'd want STAR or TopHat2. HISAT2 is another option.

Personally, I really like STAR, and it does well in peer-reviewed benchmarks. There's also a parameter to share the memory between concurrent processes, so you can align multiple samples at once.

ADD COMMENT • link 6.2 years ago by mbelmadani ★ 1.4k

2

STAR and HISAT2 are splice-aware, but be careful with TopHat:

Please stop using Tophat https://t.co/Es4ohxOEyx Cole and I developed the method in *2008*. It was greatly improved in TopHat2 then HISAT & HISAT2. There is no reason to use it anymore. I have been saying this for years yet it has more citations this year than last #methodsmatter
— Lior Pachter (@lpachter) December 2, 2017

ADD REPLY • link 6.2 years ago by Bastien Hervé 6.2k

1

It does not need to be splice-aware. Small RNAs typically do not undergo splicing, and one usually aligns against an existing database such as miRBase (for microRNAs) instead of the genome. That calls for ungapped alignments tuned for very short reads, which is what bowtie is very good at (and bowtie2 is not, because it performs better at longer read lengths).
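To make "ungapped alignment" concrete, here is a minimal Python sketch: the read is slid along each reference sequence and only substitutions are counted, never insertions or deletions. This is only an illustration of the idea, not how bowtie works internally (bowtie uses an index rather than a brute-force scan). The reference names and sequences are made-up toy data, not miRBase entries, and `ungapped_hits` is a hypothetical helper.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def ungapped_hits(read, references, max_mismatches=1):
    """Return (name, start, mismatches) for every ungapped placement of
    `read` on a reference with at most `max_mismatches` substitutions."""
    hits = []
    for name, seq in references.items():
        # Slide the read along the reference; no gaps are ever opened.
        for start in range(len(seq) - len(read) + 1):
            window = seq[start:start + len(read)]
            mismatches = hamming(read, window)
            if mismatches <= max_mismatches:
                hits.append((name, start, mismatches))
    return hits


# Toy references (hypothetical sequences, for illustration only).
toy_refs = {
    "toy_A": "ACGTACGGTCATTGCAAGCTTAGC",
    "toy_B": "TTGACCGGATATCGGCTAAGGCTA",
}

exact_read = "GGTCATTGCAAGCTTA"     # exact substring of toy_A
one_mm_read = "GGTGATTGCAAGCTTA"    # same read with one substitution

print(ungapped_hits(exact_read, toy_refs))
print(ungapped_hits(one_mm_read, toy_refs))
```

With reads of 15-30 bp, allowing even one mismatch noticeably widens the set of placements, which is part of why multi-mapping is common in small RNA data.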

ADD REPLY • link 6.2 years ago by ATpoint 87k

1

Thank you. Being splice-aware is definitely preferred. I can see from the manual that STAR also outputs an SJ.out.tab file, which contains splice junctions in tab-delimited format. Does this mean that it is essentially able to identify junction-mapping small RNAs?
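For anyone who wants to inspect SJ.out.tab directly, below is a rough Python sketch of reading it. It assumes the nine-column layout described in the STAR manual (chromosome, intron start, intron end, strand, intron motif, annotated flag, uniquely mapping reads, multi-mapping reads, maximum overhang). The column names, the `read_sj_tab` helper, and the two example rows are my own, made up for illustration.

```python
import csv
import io

# Column names are my own labels for the nine SJ.out.tab columns.
SJ_COLUMNS = [
    "chrom", "intron_start", "intron_end", "strand", "motif",
    "annotated", "unique_reads", "multi_reads", "max_overhang",
]


def read_sj_tab(handle):
    """Parse SJ.out.tab-style rows from an open text handle."""
    junctions = []
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        record = dict(zip(SJ_COLUMNS, fields))
        for key in SJ_COLUMNS[1:]:
            record[key] = int(record[key])  # everything but chrom is numeric
        junctions.append(record)
    return junctions


# Two made-up rows, only to show the shape of the file.
example = (
    "chr1\t14830\t14969\t2\t2\t1\t12\t3\t10\n"
    "chr1\t15039\t15795\t2\t2\t1\t0\t5\t9\n"
)

junctions = read_sj_tab(io.StringIO(example))
# Keep junctions supported by at least one uniquely mapping read.
supported = [j for j in junctions if j["unique_reads"] > 0]
print(len(junctions), len(supported))
```

For a real run you would pass `open("SJ.out.tab")` instead of the `io.StringIO` example.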

ADD REPLY • link 6.2 years ago by max_19 ▴ 170

ATpoint · Answer 2 · 2019-01-28

3

6.2 years ago

Emilio Marmol ▴ 180

We get pretty decent alignment rates and accurate results with Bowtie and the following settings:

bowtie -n 1 -l 10 -m 100 -k 1 --best --strata
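The options above are shown without an index or input file. As a sketch of what a complete call might look like, the Python snippet below assembles the command as an argument list, with a comment on each flag as I understand it from the Bowtie 1 manual. The index basename and file names are placeholders I made up, `-S` (SAM output) is my addition, and `build_bowtie_command` is a hypothetical helper.

```python
import shlex


def build_bowtie_command(index_basename, reads_fastq, sam_out):
    """Assemble a Bowtie 1 command line using the settings quoted above."""
    return [
        "bowtie",
        "-n", "1",     # at most 1 mismatch in the seed
        "-l", "10",    # seed length of 10 bases
        "-m", "100",   # drop reads with more than 100 reportable alignments
        "-k", "1",     # report at most 1 alignment per read
        "--best",      # report the best alignments first
        "--strata",    # only report alignments from the best stratum
        "-S",          # write SAM output (not part of the original settings)
        index_basename,
        reads_fastq,
        sam_out,
    ]


# Placeholder names, for illustration only.
cmd = build_bowtie_command("mirna_index", "trimmed_reads.fastq", "aligned.sam")
print(shlex.join(cmd))
```

To actually run it you could pass `cmd` to `subprocess.run(cmd, check=True)`, assuming bowtie is on your PATH and the index has been built with `bowtie-build`.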

ADD COMMENT • link updated 6.2 years ago by ATpoint 87k • written 6.2 years ago by Emilio Marmol ▴ 180

score 3 · Answer 3 · 2019-01-28

3

6.2 years ago

Bastien Hervé 6.2k

I missed the fact that you were using small RNA-seq data. Your sequences are too short to be analyzed with classic RNA-seq tools; see also

Best/right way to quantify small RNA transcripts

But if you want to stick with STAR, here is some advice from Alexander Dobin, one of the STAR authors, on aligning miRNA

ADD COMMENT • link 6.2 years ago by Bastien Hervé 6.2k

1

Thanks so much for that link, very helpful! I ended up giving STAR a go with the parameter settings recommended in that link. Below is my final log output; I think the reads are mapping pretty well!

                  Number of input reads |    39129818
              Average input read length |    22
                            UNIQUE READS:
           Uniquely mapped reads number |    31732915
                Uniquely mapped reads % |    81.10%
                  Average mapped length |    21.73
               Number of splices: Total |    1388166
    Number of splices: Annotated (sjdb) |    1388166
               Number of splices: GT/AG |    1380537
               Number of splices: GC/AG |    5808
               Number of splices: AT/AC |    46
       Number of splices: Non-canonical |    1775
              Mismatch rate per base, % |    0.20%
                 Deletion rate per base |    0.00%
                Deletion average length |    1.00
                Insertion rate per base |    0.00%
               Insertion average length |    1.02
                     MULTI-MAPPING READS:
Number of reads mapped to multiple loci |    6018428
     % of reads mapped to multiple loci |    15.38%
Number of reads mapped to too many loci |    69
     % of reads mapped to too many loci |    0.00%
                          UNMAPPED READS:
% of reads unmapped: too many mismatches |    0.00%
         % of reads unmapped: too short |    3.08%
             % of reads unmapped: other |    0.44%
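If you want to sanity-check a log like this programmatically, here is a small Python sketch. It assumes the `label | value` layout shown above and uses a few lines copied from this log; `parse_star_log` and `percent` are hypothetical helpers, not part of STAR.

```python
def parse_star_log(text):
    """Turn 'label | value' lines into a dict; section headers are skipped."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # e.g. 'UNIQUE READS:' headers have no value column
        label, value = line.split("|", 1)
        stats[label.strip()] = value.strip()
    return stats


def percent(part, total):
    """Percentage of `part` in `total`, rounded to two decimals."""
    return round(100.0 * part / total, 2)


# Excerpt copied from the log above.
LOG_EXCERPT = """\
                  Number of input reads |    39129818
                            UNIQUE READS:
           Uniquely mapped reads number |    31732915
               Number of splices: Total |    1388166
               Number of splices: GT/AG |    1380537
               Number of splices: GC/AG |    5808
               Number of splices: AT/AC |    46
       Number of splices: Non-canonical |    1775
Number of reads mapped to multiple loci |    6018428
"""

stats = parse_star_log(LOG_EXCERPT)
total_reads = int(stats["Number of input reads"])
unique_pct = percent(int(stats["Uniquely mapped reads number"]), total_reads)
multi_pct = percent(int(stats["Number of reads mapped to multiple loci"]), total_reads)

# The per-motif splice counts should add up to the reported total.
motifs = ("GT/AG", "GC/AG", "AT/AC", "Non-canonical")
splice_sum = sum(int(stats["Number of splices: " + m]) for m in motifs)

print(unique_pct, multi_pct, splice_sum == int(stats["Number of splices: Total"]))
```

For a real run, read the text from the `Log.final.out` file STAR writes instead of the embedded excerpt.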

ADD REPLY • link 6.2 years ago by max_19 ▴ 170

0

Hello, could you show exactly which parameters you used for alignment? I am running extracellular vesicle data and I am getting only 13% uniquely mapped and 15% multi-mapped reads; the rest are unmapped. I am using STAR as well.

ADD REPLY • link 5.3 years ago by anara92 • 0

0

Hi anara92, you might want to start your own question. There could be a lot of reasons why you're getting low mapping rates, and often it has nothing to do with the parameters. You will get more precise help that way (and faster, since the question will go to the whole Biostars community, not just the people in this thread). You might also want to search the forums for "low mapping rate" and "RNA-Seq" or something like that to see if there are any hints in other questions.

For the parameters, if you still want to try them, I believe they're in that link max_19 posted (See the STAR Google Group, Alex Dobin's answer.)

ADD REPLY • link 5.3 years ago by mbelmadani ★ 1.4k