Comparing Alignment Files (CRAM)
19 months ago
joe_genome ▴ 50

Hello all,

I checked several forums, and the general advice seems to be to use samtools or Picard tools for comparing alignment files. I want to compare the aligned output files produced by two different alignment algorithms, and I have some general questions:

  1. Is it fine to compare the files in CRAM format, or is it preferable to convert the CRAM files back to BAM first?
  2. Is samtools the better choice here? I was considering just using samtools flagstat.
  3. What other metrics should be taken into consideration?

Thanks for the input :)

ngs sequencing genome • 915 views

What is the use case for comparing alignment files? What criteria do you want to "compare" the files on?

Many NGS aligners work non-deterministically (i.e. they may produce slightly different results on each run). So unless your aligner has a deterministic mode, the alignment files will not be identical and thus can't be compared byte for byte. (Note: this is my understanding as a non-computer scientist and may be incorrect.)
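If byte-level comparison is off the table, one alternative is to compare where each read was placed. Below is a minimal sketch, assuming both files have been decoded to SAM text (samtools view can do this for BAM or CRAM; CRAM may need the reference available). The function names and the toy records are made up for illustration, and it only looks at primary alignments.

```python
def primary_alignments(sam_lines):
    """Map (read name, mate number) -> (reference, position, strand), or None if unmapped."""
    placements = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, rname, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        if flag & 0x4:
            placements[(name, mate)] = None  # read is unmapped
        else:
            strand = "-" if flag & 0x10 else "+"
            placements[(name, mate)] = (rname, pos, strand)
    return placements


def concordance(sam_a, sam_b):
    """Return (reads placed identically, reads present in both files)."""
    a = primary_alignments(sam_a)
    b = primary_alignments(sam_b)
    shared = a.keys() & b.keys()
    same = sum(1 for key in shared if a[key] == b[key])
    return same, len(shared)


# Toy records (hypothetical): r2 is placed differently by the second aligner.
run_a = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
run_b = [
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t250\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
same, shared = concordance(run_a, run_b)
```

Holding every placement in a dictionary is fine for a subsample, but a whole-genome file would probably need name-sorted input and a streaming comparison instead.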


GenoMax, I was thinking of assessing "functional equivalence" by comparing general alignment statistics for the alignment (BAM/CRAM) files, including the mapping rate, insert size distribution, read quality distribution, and GC bias.
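To make two of those statistics concrete, here is a rough sketch that computes a mapping rate and a median insert size from SAM text records. It assumes paired-end data where TLEN (column 9) is populated, and it counts each properly paired fragment once by keeping only positive TLEN values. The function name and the toy records are hypothetical, and dedicated tools will handle edge cases this ignores.

```python
from statistics import median


def summarize(sam_lines):
    """Mapping rate over primary records and median TLEN of properly paired reads."""
    total = mapped = 0
    inserts = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag, tlen = int(fields[1]), int(fields[8])
        if flag & 0x900:
            continue  # ignore secondary and supplementary records
        total += 1
        if not flag & 0x4:
            mapped += 1
        # 0x2 = properly paired; positive TLEN picks one mate per pair
        if flag & 0x2 and tlen > 0:
            inserts.append(tlen)
    return {
        "mapping_rate": mapped / total if total else 0.0,
        "median_insert": median(inserts) if inserts else None,
    }


# Toy records (hypothetical): one proper pair, one unmapped read, one single mapped read.
records = [
    "p1\t99\tchr1\t100\t60\t4M\t=\t396\t300\tACGT\tIIII",
    "p1\t147\tchr1\t396\t60\t4M\t=\t100\t-300\tACGT\tIIII",
    "u1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "s1\t0\tchr2\t500\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
stats = summarize(records)
```

Running the same function on the output of each aligner gives two small dictionaries that can be compared side by side.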


If the comparisons are at an aggregate level like you describe, then that should be reasonable. flagstat can indeed be a simple/rapid test. CollectInsertSizeMetrics (Picard) or bamPEFragmentSize (deepTools) could be used for the insert size distribution.
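As a sketch of how two flagstat reports could be compared, the snippet below parses the default text output into counts and lists the categories that differ. The sample reports are invented and shortened, the exact set of lines varies between samtools versions, and only the QC-passed count (the first number on each line) is kept. The function names are my own.

```python
import re

# Each flagstat line looks like "<passed> + <failed> <description>"
LINE = re.compile(r"^(\d+) \+ (\d+) (.+)$")


def parse_flagstat(text):
    """Return {category: QC-passed count} from flagstat text output."""
    counts = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue
        # Drop a trailing "(95.00% : N/A)" style percentage from the label
        label = re.sub(r"\s*\([^()]*:[^()]*\)$", "", match.group(3))
        counts[label] = int(match.group(1))
    return counts


def diff_flagstat(text_a, text_b):
    """Return {category: (count_a, count_b)} for categories whose counts differ."""
    a, b = parse_flagstat(text_a), parse_flagstat(text_b)
    return {key: (a[key], b[key]) for key in a if key in b and a[key] != b[key]}


# Invented, shortened reports for two aligners
report_a = """1000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
950 + 0 mapped (95.00% : N/A)
"""
report_b = """1000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
940 + 0 mapped (94.00% : N/A)
"""
changed = diff_flagstat(report_a, report_b)
```

With real data, the two reports would come from running samtools flagstat on each alignment file and saving the output to text.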

The question is whether any differences would be meaningful, since most aligners should produce reasonable results (unless one was not appropriate, e.g. using a non-splice-aware aligner with spliced data).
