Ngs Reads Alignment
12.9 years ago
Leszek 4.2k

Do you know of any initiatives for NGS alignments compression?

BAM format offers compression, but all aligned sequences and their qualities are still stored. Do you know of any reference-based compression? I think people at ENA are working on this. Have a look at CRAM.
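To make "reference-based" concrete, here is a minimal sketch of the idea: store only the alignment position, the read length, and the bases that differ from the reference. It is not how CRAM or any real format is implemented; the function names are hypothetical, and it assumes 0-based positions and reads aligned without indels or clipping.

```python
def encode_read(reference, pos, read):
    """Encode a read as (pos, length, mismatches) relative to the reference.

    Assumes the read aligns at 0-based `pos` with substitutions only
    (no indels, no clipping), which is a simplification for illustration.
    """
    window = reference[pos:pos + len(read)]
    if len(window) != len(read):
        raise ValueError("read extends past the end of the reference")
    # Keep only the offsets where the read disagrees with the reference.
    mismatches = [(i, b) for i, (r, b) in enumerate(zip(window, read)) if r != b]
    return pos, len(read), mismatches


def decode_read(reference, record):
    """Rebuild the original read from the reference and an encoded record."""
    pos, length, mismatches = record
    bases = list(reference[pos:pos + length])
    for offset, base in mismatches:
        bases[offset] = base
    return "".join(bases)


reference = "ACGTACGTACGTACGT"
read = "TACGAACG"  # differs from the reference at one position
record = encode_read(reference, 3, read)
restored = decode_read(reference, record)
```

A read that matches the reference perfectly costs only a position and a length, which is where the savings would come from. The catch is that decoding needs exactly the same reference sequence.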

What is your opinion about keeping qualities? Maybe using some quality thresholds is reasonable? Or storing qualities only for mismatches (and maybe for the 3 bases on either side)?
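A rough sketch of the "qualities only around mismatches" idea, to show what would be kept and what would be lost. The function name, the `flank` parameter, and the `!` placeholder (Phred 0 in Phred+33 encoding) are assumptions made for illustration; the sketch also assumes the read and the reference window have the same length (no indels). This is lossy: the masked qualities cannot be recovered.

```python
def mask_qualities(read, ref_window, quals, flank=3, placeholder="!"):
    """Keep quality characters only within `flank` bases of a mismatch.

    All other positions are replaced by `placeholder`, so long runs of
    identical characters remain and compress well. Lossy by design.
    """
    if not (len(read) == len(ref_window) == len(quals)):
        raise ValueError("read, reference window and qualities must have equal length")
    keep = [False] * len(read)
    for i, (base, ref_base) in enumerate(zip(read, ref_window)):
        if base != ref_base:
            # Mark the mismatch and its neighbours on both sides.
            for j in range(max(0, i - flank), min(len(read), i + flank + 1)):
                keep[j] = True
    return "".join(q if k else placeholder for q, k in zip(quals, keep))


read = "TACGAACGTT"
ref_window = "TACGTACGTT"  # mismatch at index 4
quals = "IIIIHGFEDC"
masked = mask_qualities(read, ref_window, quals)
```

Here only the qualities at indices 1 to 7 survive. Whether downstream tools (variant callers, recalibration) tolerate the placeholder values is a separate question.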

And what about sequence headers? Do we need to keep them at all in the alignment? Storing paired-end information should be enough, in my opinion.

I'm really interested in your opinions:)

next-gen sequencing alignment bam snp • 3.2k views

If it were just about saving space, you could delete the FASTQ input data once all sequences and qualities are stored in a BAM file; the FASTQ can then be regenerated from the BAM file. Discarding data is always a compromise: in this case, discarding qualities and sequences compromises re-analysis. By the way, your question is about compression, but also about discarding data, which is not the same thing as compression.
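To illustrate why the FASTQ is recoverable, here is a minimal sketch that turns one SAM-format alignment line into a FASTQ record. In practice you would use an existing tool such as `samtools fastq` rather than this; the function name is hypothetical, and the sketch ignores hard clipping, secondary and supplementary alignments, and mate suffixes. The one detail it does handle is that reads mapped to the reverse strand (flag bit 0x10) are stored reverse-complemented, with their qualities reversed.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_line_to_fastq(line):
    """Convert a single SAM alignment line to a FASTQ record (sketch only)."""
    fields = line.rstrip("\n").split("\t")
    # SAM columns: 0 = QNAME, 1 = FLAG, 9 = SEQ, 10 = QUAL.
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
    if flag & 0x10:
        # Reverse-strand read: restore the original orientation.
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]
    return f"@{name}\n{seq}\n+\n{qual}\n"


sam_line = "read1\t16\tchr1\t100\t60\t4M\t*\t0\t0\tAACG\tIIHG"
fastq_record = sam_line_to_fastq(sam_line)
```

This round trip only works if the BAM keeps every read (including unmapped ones) with full sequences and qualities, which is exactly what a lossy scheme would give up.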


Yes, I'm curious what your opinions are about lossless compression vs. compression that discards some data (like sequence headers, some quals, etc.).

12.9 years ago

There's an interesting discussion here: https://plus.google.com/107526144078068918726/posts/ZCBc8DH3yKK

12.9 years ago

Here is a paper describing a scheme for comparative compression of genomes from the same species. You might want to figure out first why you want to compress your data. If you can generate some of these metrics such as quality scores on the fly, then try compressing data after the last step that is computationally intensive.

12.5 years ago
fac2003 • 0

Goby 2.0 is a major milestone for the Goby project, bringing state-of-the-art NGS alignment compression as well as very robust SAM/BAM import and export. See the new tutorial ‘What’s new in Goby 2.0’ for more information.

We created a summary table to compare features of Goby 1.x, 2.0, BAM, CRAM and FASTQ. Click here to see the full table.
