Question

Newbie question: how to create BWA index from multiple .fasta files


7.0 years ago

phonybone • 0

I would like to use BWA MEM to align short reads against the entire hg19 human genome. To do that, I assume that I must create a BWA index. I understand how to create an index from a single file, but how does one create an index from multiple files? Is it best practice to just concatenate all files together? Or is there a better way?

Thanks in advance.

sequencing bwa index • 5.7k views

updated 7.0 years ago by swbarnes2 14k • written 7.0 years ago by phonybone • 0


An easy way may be to get a sequence/annotation/index bundle from the iGenomes site.

7.0 years ago by GenoMax 148k

score 1 · Answer 1 · 2017-12-19

I'm assuming you currently have each chromosome as a separate fasta file. If you don't have a scientific or logistical reason to avoid concatenating the files, then the most straightforward approach would be to concatenate those fasta files into a single one, and then compute your index. Otherwise you'll need to merge alignment data downstream, and you could miss out on things like discordant alignments and large structural abnormalities such as chromosomal translocations.
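A minimal sketch of the concatenate-then-index approach, assuming the per-chromosome files sit together in one directory, use a .fa extension, and each end with a newline. The function names (concat_fasta, count_sequences) and the paths in the usage comment are placeholders for illustration, not part of bwa.

```shell
# Concatenate every *.fa file in a directory into one multi-FASTA file.
# Usage: concat_fasta <dir> <output.fa>
# Write the output outside <dir> so it is never picked up by the glob.
concat_fasta() {
    dir=$1
    out=$2
    # The glob expands in the shell's sorted order (chr1, chr10, chr11, ...),
    # and the sequences appear in the combined file in that order.
    cat "$dir"/*.fa > "$out"
}

# Count the sequences (header lines starting with '>') in a FASTA file.
# Usage: count_sequences <file.fa>
count_sequences() {
    grep -c '^>' "$1"
}

# Hypothetical usage (hg19_chroms/ and hg19.fa are placeholder names):
#   concat_fasta hg19_chroms hg19.fa
#   count_sequences hg19.fa     # should match the number of input sequences
#   bwa index hg19.fa
```

Run this way, bwa index writes its index files next to the FASTA using the FASTA name as the prefix (hg19.fa.amb, hg19.fa.ann, hg19.fa.bwt, hg19.fa.pac, hg19.fa.sa); the -p option sets a different prefix.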

score 1 · Answer 2 · 2017-12-20


7.0 years ago

swbarnes2 14k

Instead of making a catted file like Dan suggests, you might be able to get away with catting them and piping that straight into bwa. Something like:

cat *.fa | bwa index -p catted -


Piping is awesome. But wouldn't that preclude running subsequent alignments using that index, unless the complete catted fasta file was saved somewhere?

This would work, I think:

cat *.fa | tee catted.fasta | bwa index -p catted -

7.0 years ago by Dan D 7.4k


Yes, piping this way does not leave a catted fasta on disk, but I think you can still use the index without it.

7.0 years ago by swbarnes2 14k
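On the point about using the index without the FASTA: bwa mem takes the index prefix as its first argument and loads the files that bwa index wrote for that prefix (.amb, .ann, .bwt, .pac, .sa). A small sketch for checking that a prefix is complete before aligning; the function name has_bwa_index and the file names in the usage comment are hypothetical.

```shell
# Return success only if all five files written by `bwa index` exist
# for the given prefix. Existence is all this checks, not validity.
# Usage: has_bwa_index <prefix>
has_bwa_index() {
    prefix=$1
    for ext in amb ann bwt pac sa; do
        if [ ! -f "$prefix.$ext" ]; then
            echo "missing index file: $prefix.$ext" >&2
            return 1
        fi
    done
    return 0
}

# Hypothetical usage (reads.fq and aln.sam are placeholder names):
#   has_bwa_index catted && bwa mem catted reads.fq > aln.sam
```

Other tools downstream may still ask for the reference FASTA itself, so keeping a copy of the concatenated file can be worthwhile even if the aligner does not need it.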