Determining which human reference genome a BAM is aligned against
9.7 years ago
timodonnell ▴ 80

I'm looking at the DEL.bam file from the DELLY tests:

https://github.com/tobiasrausch/delly/tree/master/test

Its header is just two lines, and does not appear to give any indication of the reference genome it was aligned against:

@SQ     SN:chr16        LN:10000000
@PG     ID:bwa  PN:bwa  VN:0.6.1-r104

I am trying to figure out which reference genome it is aligned against. Anyone know of a way to do this?

next-gen samtools bam • 2.1k views

Assuming it's the human genome, open it in IGV against both hg19 and hg18 and see which one it matches. Against the wrong reference, the reads would show a lot of mismatches and look ugly.

9.7 years ago

Off the top of my head:

  • Option 0: Trivial and best: ask the person who generated the file!
  • Option 1: Align the reads in the BAM file to the most recent versions of the human genome using bwa 0.6.1 (I think bwa can take a BAM file as input). Then see which version gives alignment positions identical to those in your BAM file. Hopefully the result will be quite unambiguous.
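The comparison step in Option 1 could be sketched roughly as follows. This assumes both the original and the re-aligned reads are available as SAM text (for example from `samtools view`), and that only primary alignments matter. The function names `parse_positions` and `position_agreement` are made up for illustration.

```python
def parse_positions(sam_lines):
    """Map (read name, mate number) -> (contig, 1-based position).

    Unmapped, secondary and supplementary records are skipped.
    """
    positions = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        # 0x40 = first in pair, 0x80 = second in pair, otherwise unpaired
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        positions[(qname, mate)] = (rname, pos)
    return positions


def position_agreement(original, realigned):
    """Fraction of reads present in both sets that map to the same place."""
    shared = original.keys() & realigned.keys()
    if not shared:
        return 0.0
    same = sum(1 for key in shared if original[key] == realigned[key])
    return same / len(shared)
```

Running this once per candidate assembly and picking the one with the highest agreement is the idea. Contig naming differences (`chr16` versus `16`) would need to be normalised first.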

It seems the header you posted has been edited, since it shows only chr16 with a length of exactly 10000000 bp.
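When the header is intact, a quick first check is to compare the `@SQ` contig lengths with those of each candidate reference. Below is a minimal sketch, assuming you have the header as text and a `.fai` index (as written by `samtools faidx`) for each candidate. The helper names are hypothetical.

```python
def header_contigs(header_text):
    """Return {contig name: length} from the @SQ lines of a SAM header."""
    contigs = {}
    for line in header_text.splitlines():
        fields = line.split()  # tolerate tabs or runs of spaces
        if not fields or fields[0] != "@SQ":
            continue
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def fai_contigs(fai_text):
    """Return {contig name: length} from the first two columns of a .fai index."""
    contigs = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            contigs[name] = int(length)
    return contigs


def compare_to_reference(bam_contigs, ref_contigs):
    """Sort the BAM's contigs into matching, length-differing, and absent."""
    report = {"match": [], "length_differs": [], "absent": []}
    for name, length in bam_contigs.items():
        if name not in ref_contigs:
            report["absent"].append(name)
        elif ref_contigs[name] == length:
            report["match"].append(name)
        else:
            report["length_differs"].append(name)
    return report
```

For the header posted above, chr16 with LN:10000000 would land in `length_differs` for any full-length human assembly, which fits the idea that the header or the reference was trimmed.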


Appreciate the ideas. I was hoping to find a tool that looks at the contig names and CIGAR strings (and perhaps the MD tags as well) of the reads in my BAM and figures out which of the standard human reference genomes are consistent with the alignments. But maybe such a tool doesn't exist?
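To make the idea concrete, here is a rough sketch of the per-read check such a tool might do. It is not an existing tool. It assumes each read carries an MD tag, rebuilds the reference bases the read was aligned to from SEQ, CIGAR and MD, and compares them with a candidate reference sequence. Spliced alignments (CIGAR `N`) are not handled, and the function names are invented for illustration.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# An MD tag is a run of: match counts, ^deleted bases, or single mismatched ref bases.
MD_RE = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def aligned_read_bases(seq, cigar):
    """Read bases in M/=/X operations, i.e. with insertions and soft clips dropped."""
    out, i = [], 0
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in "M=X":
            out.append(seq[i:i + n])
            i += n
        elif op in "IS":
            i += n  # present in SEQ but not aligned to a reference base
        # D, H and P consume no bases from SEQ
    return "".join(out)


def reference_from_md(seq, cigar, md):
    """Rebuild the stretch of reference covered by one alignment."""
    aligned = aligned_read_bases(seq, cigar)
    ref, i = [], 0
    for match_len, deleted, mismatch in MD_RE.findall(md):
        if match_len:
            n = int(match_len)
            ref.append(aligned[i:i + n])  # read equals reference here
            i += n
        elif deleted:
            ref.append(deleted)  # reference bases missing from the read
        else:
            ref.append(mismatch)  # reference base at a mismatch
            i += 1
    return "".join(ref)


def consistent_with(candidate_contig, pos, seq, cigar, md):
    """True if the rebuilt reference equals the candidate at 1-based pos."""
    rebuilt = reference_from_md(seq, cigar, md).upper()
    start = pos - 1
    return candidate_contig[start:start + len(rebuilt)].upper() == rebuilt
```

Counting the fraction of reads for which `consistent_with` holds against each candidate assembly would then give a score per reference. Contig names and lengths from the header can narrow the candidates first.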
