Question

BWA generating a sam file without alignment information

0

Entering edit mode

2.9 years ago

kelsey.e.witt • 0

I am trying to map some raw sequence reads to a reference genome and a reference mitochondrial genome. The reads are paired-end and have already been trimmed and filtered so that the two paired files contain only matched reads (ie mate pairs). We are using the same fastq files as input and the exact same commands for mapping, but the frustrating part is that the sam file with the whole genome reads looks just fine, while the sam file with the mitochondrial genome reads isn't storing any information about the reference file or the position of the reads. Both have been indexed using the "bwa index" command (using no flags). We have tried downloading different mitochondrial genome references but the outcome is the same. Any insight would be appreciated!

Here are the commands for the mitochondrial and whole genome mapping:

bwa mem bos_mito.fasta B10_trimmed.pair1.truncated B10_trimmed.pair2.truncated > B10_mito_bwa.sam
bwa mem bos_genome.fa B10_trimmed.pair1.truncated B10_trimmed.pair2.truncated > B10_genome_bwa.sam

This is an example line for the sam file using the whole genome reference:

A00835:368:HVTNKDSX2:1:2573:13114:26678 99      chr1    182     40      55S19M59S       =       182     19      TCCACTTCGTCGCGCTGCGCGCGCGACGGGCGAGCCGCGCGCATGACCACGCGCACGACCCCCACGCTGCCCACGCGCACTCGGCCCACGCGCAGGAAGCTCCTGCGCACGCCGGAGACGGCCCGGGCGCCCT       FFFFFF:FF:FFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:F:FFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFFF       NM:i:0  MD:Z:19 MC:Z:55S19M59S  AS:i:19 XS:i:0

And the mito reference:

A00835:368:HVTNKDSX2:1:1101:1036:1000   77      *       0       0       *       *       0       0       GGATTGAGCGCCGCCAGTCGCGCCGGGACATCCANNNNNNNCCTGAAACTTTCTCGATGGTTTCGCCGCCGCGGAGGTAGGCGACGGCGTACAACGCCTCGCCAACTGCGAAGAGGCCGACCGCGAGCACCACGACGTCGATGCCGTCGAG     FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FF#######FFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF     AS:i:0  XS:i:0

Any insights as to what we can do to make the mitochondrial mapping work properly would be appreciated!

sam bwa • 1.0k views

ADD COMMENT • link updated 22 months ago by Ram 44k • written 2.9 years ago by kelsey.e.witt • 0

1

Entering edit mode

This means that your reads are not aligning to mitochondrial reference. Do you expect them to align for sure?

ADD REPLY • link 2.9 years ago by GenoMax 148k

1

Entering edit mode

Not really an answer, but if you have WGS reads you should align to all chromosomes including the mito at once.

Also, use samtools stats OR flatstag OR idxstats, then multiqc to assess if your reads have really been aligned to the mitochondrion.

When just testing and getting used to the tools, you can just align single ended data to see if you get more alignments (PE more specific of course, but just for bug checking).

Maybe also get a bigger (?) public dataset for your organism and test vs your genome assembly as well (you can get data with sra-explorer https://sra-explorer.info/ very easily.

Lastly, with bwa mem, map using the -t option for a big speedup, eg

bwa mem -t 8

ADD REPLY • link 2.9 years ago by colindaven 7.0k

0

Entering edit mode

are you saying that none of the reads match? run and post the output of

samtools flagstat or samtools idxstats

for us to actually see the numbers

ADD REPLY • link 2.9 years ago by Istvan Albert 102k