BWA or STAR for RNAseq?
1
0
6.3 years ago
agata88 ▴ 870

Hi all!

I am doing RNAseq expression analysis. To count reads, I mapped them to contigs with both BWA and STAR.

I read that STAR is very popular in RNAseq mapping, but I was curious how well BWA can handle it.

As a result, I get a higher percentage of mapped reads with BWA than with STAR (78.24% vs 65.8%), so I would like to ask: are there any reasons why I should not use BWA for mapping RNAseq data?

Many thanks for any suggestions.

Best, Agata

RNA-Seq • 17k views
ADD COMMENT
9
6.3 years ago
h.mon 35k

BWA isn't splice-aware, so it is not appropriate if you are mapping RNAseq to the genome, unless you are dealing with bacteria, which have no introns.

If you are mapping to the transcriptome, you can use either, but I've seldom seen BWA used for this purpose, only Bowtie2 and STAR.

However, your mapping rate, especially with STAR, is not very good. Are you mapping to the genome or the transcriptome? What is the quality of the reference?

edit:

I read that STAR is very popular in RNAseq assembly

You mean popular in RNAseq mapping, no?

edit 2: you probably have to tweak the STAR index build step to account for the small genome size. Set --genomeSAindexNbases to min(14, log2(GenomeLength)/2 - 1), where GenomeLength is in base pairs.
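
The formula above is easy to evaluate by hand, but a small script avoids slips. The following is a minimal Python sketch: the function names are hypothetical, the example lengths are illustrative rather than taken from this thread, and rounding the result down to an integer is an assumption, since the formula itself does not say how to round.

```python
import math
from typing import Iterable


def fasta_total_length(lines: Iterable[str]) -> int:
    """Sum the sequence lengths in FASTA-formatted lines, skipping header lines."""
    return sum(len(line.strip()) for line in lines if not line.startswith(">"))


def genome_sa_index_nbases(genome_length: int) -> int:
    """Evaluate min(14, log2(GenomeLength)/2 - 1), rounded down (an assumption)."""
    if genome_length < 1:
        raise ValueError("genome_length must be a positive number of base pairs")
    return min(14, math.floor(math.log2(genome_length) / 2 - 1))


if __name__ == "__main__":
    # Illustrative reference sizes in base pairs, not values from the thread.
    for length in (100_000, 4_600_000, 3_000_000_000):
        print(length, genome_sa_index_nbases(length))
```

For a real reference, the total length could be obtained by passing an open file handle to `fasta_total_length`, for example `with open("reference.fa") as fh: fasta_total_length(fh)`, where `reference.fa` is a placeholder file name.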

ADD COMMENT
0

Yes, mapping. I am mapping to a bacterial transcriptome. The reference includes super transcripts. How can I check the "quality of the reference"?

ADD REPLY
1

Although I am not sure, I think assembling super-transcripts 1) is not necessary, as bacteria don't have differential splicing, and 2) could lead to artifacts, such as creating chimeric genes.

ADD REPLY
1

I became confused, so I asked the author of Super Transcripts, and the answer is here: https://groups.google.com/forum/#!category-topic/oshlack-lab/supertranscript-analysis/aoSwVvo4IMQ

ADD REPLY
0

I used it to create long genes. After Trans-ABySS I have 163684 contigs. I wanted to create some kind of scaffolds and ended up with super transcripts, since there is no reference to assemble against. In the end I have ~3000 super transcripts, of which 70% have a blastp hit.

ADD REPLY
0

If you want to check the quality of your reference, you can use several tools.

For a completeness check:

BUSCO: it estimates how complete your reference is by reporting what percentage of a set of well-annotated genes it contains. It provides several databases; find the one that best matches your organism and use it.

For a contamination check:

Kraken: it checks whether your reference is contaminated with sequences from other organisms (bacteria, viruses, etc.).
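
As a rough illustration of the two checks above, the Python sketch below only assembles the command lines and does not run anything. It assumes BUSCO v4 or later and Kraken2 (rather than the original Kraken) are installed. The helper names, the file names, the `bacteria_odb10` lineage, and the database path are placeholders to adapt, not values from this thread.

```python
import shlex
from typing import List


def busco_command(fasta: str, lineage: str, out_name: str, cpus: int = 4) -> List[str]:
    """Build a BUSCO call in transcriptome mode (assumes BUSCO v4+ option names)."""
    return [
        "busco",
        "-i", fasta,           # assembled transcripts to assess
        "-l", lineage,         # lineage dataset, e.g. a bacterial one
        "-m", "transcriptome",
        "-o", out_name,        # name of the output folder
        "-c", str(cpus),
    ]


def kraken2_command(fasta: str, db_dir: str, report: str, threads: int = 4) -> List[str]:
    """Build a Kraken2 call that classifies each sequence in the reference."""
    return [
        "kraken2",
        "--db", db_dir,        # path to a prebuilt Kraken2 database (placeholder)
        "--threads", str(threads),
        "--report", report,    # per-taxon summary table
        "--output", "kraken2_assignments.txt",
        fasta,
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them in a shell.
    print(shlex.join(busco_command("supertranscripts.fa", "bacteria_odb10", "busco_out")))
    print(shlex.join(kraken2_command("supertranscripts.fa", "/path/to/kraken2_db", "kraken2_report.txt")))
```

Building the commands as lists keeps quoting explicit, and the same lists can be passed to `subprocess.run` once the paths are real.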

ADD REPLY
0

The reference includes super transcripts.

You don't have a genomic DNA reference? Since this is bacterial RNAseq you should be able to use any NGS aligner as noted by @h.mon above. Perhaps you need to tell STAR not to look for splicing and that will improve alignment results.

ADD REPLY
1

I don't have a reference; I am doing de novo RNAseq for a non-model organism. I used Bowtie2, which gave me results similar to BWA, but it's more popular for this kind of analysis. I tried STAR with the option

--alignIntronMax 1

but it didn't change anything: still 65.8%.

Thank you very much for help!
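
For reference, the two STAR settings mentioned in this thread (`--genomeSAindexNbases` at index build time and `--alignIntronMax 1` at alignment time) could be combined roughly as in the Python sketch below. It only builds the command lines. The helper names, directory and file names, the thread count, and the example index value of 10 are all placeholders, not values from the thread.

```python
import shlex
from typing import List, Sequence


def star_index_command(genome_dir: str, fasta: str, sa_index_nbases: int,
                       threads: int = 4) -> List[str]:
    """Build the STAR index-generation call for a small reference."""
    return [
        "STAR",
        "--runMode", "genomeGenerate",
        "--genomeDir", genome_dir,
        "--genomeFastaFiles", fasta,
        "--genomeSAindexNbases", str(sa_index_nbases),  # scaled down for small references
        "--runThreadN", str(threads),
    ]


def star_align_command(genome_dir: str, reads: Sequence[str], out_prefix: str,
                       threads: int = 4) -> List[str]:
    """Build the STAR alignment call with the maximum intron size set to 1."""
    return [
        "STAR",
        "--genomeDir", genome_dir,
        "--readFilesIn", *reads,    # one file for single-end, two for paired-end
        "--alignIntronMax", "1",    # the option tried in the comment above
        "--outFileNamePrefix", out_prefix,
        "--runThreadN", str(threads),
    ]


if __name__ == "__main__":
    # Print both commands for inspection; nothing is executed here.
    print(shlex.join(star_index_command("star_index", "supertranscripts.fa", 10)))
    print(shlex.join(star_align_command("star_index", ["R1.fastq", "R2.fastq"], "sample1_")))
```

The index has to be regenerated whenever `--genomeSAindexNbases` changes, so the two commands are meant to be run in that order.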

ADD REPLY
0

Have you checked your transcriptome assembly before moving on to mapping RNA reads to it?

ADD REPLY
