Bwa Aligner: How To Get Percentage Of Reads Aligned?
3
6
Entering edit mode
13.1 years ago
Doo ▴ 240

edited

Bowtie provides an score e.g. 90% telling us how many reads could be mapped.

How can I retrieve this from a BWA alignment? Is there a simple workflow available somewhere?

Thanks

rna short aligner • 22k views
ADD COMMENT
0
Entering edit mode

Are you talking about percentage of total reads aligned? Alignment identity? Alignment score is a raw score based on number of matches/mismathces/gaps. I don't think it's represented as a percentage.

ADD REPLY
0
Entering edit mode

Was talking about percentage of total reads aligned. Sorry for the confusion then.

ADD REPLY
13
Entering edit mode
13.1 years ago

<A href="&lt;a href=" http:="" samtools.sourceforge.net="" "="" rel="nofollow">http://samtools.sourceforge.net/">samtools's flagstat command can tell you basic information like number of reads aligned from looking at the flags in the sam/bam output. Use:

samtools flagstat bwaOutput.bam
ADD COMMENT
1
Entering edit mode
13.1 years ago

$ bwa aln $ref $input | bwa samse $ref - $input | samtools view -S -b -o $output - $ samtools index $output $ samtools flagstat $output

ADD COMMENT
0
Entering edit mode

'samtools import' doesn't seem to be in their latest documentation. I think I used to use it too, but 'view' is now the prefered command for converting .sam to.bam

You need to make a .bam from bwa's .sam, and you run flagstat on that. You don't have to have sorted it.

ADD REPLY
0
Entering edit mode

Thanks @swbarnes2. Commands are from script developed circa summer 2010 and clearly need to be updated.

ADD REPLY
0
Entering edit mode

updated as suggested.

ADD REPLY
0
Entering edit mode

It is slightly dumb that flagstat won't work on .sam files, but really, there's pretty much no reason not to pipe from bwa sampe/samse straight into samtools view to convert on the fly to .bam format. .sam format just takes up so much space, and there's very little that you can do on .sam files that you can't do on.bam files.

ADD REPLY
0
Entering edit mode
13.1 years ago
Kanne ▴ 450

This is the long way 'round, I know, but I usually extract the unmapped reads using a perl script which compares the alignment file to the input read file. I then count the number of lines (wc -l) on the input read file and unmapped read file (of course, dividing by 4 for fastq files).

The script I use is from this seqanswers post: http://seqanswers.com/forums/showthread.php?t=6847

Hope you find this useful!

ADD COMMENT

Login before adding your answer.

Traffic: 2448 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6