Storing Fastq As Unaligned Bam
13.1 years ago
Abhi ★ 1.6k

Hey Guys

Just wondering if anyone out there is now storing raw FASTQ data as unaligned BAM files. We are reaching a stage where any space we can save would be beneficial.

Are there any downsides to storing FASTQ as BAM? I see some discussion about this on SEQanswers: http://seqanswers.com/forums/showthread.php?t=14941

Also, are there any tools people already use that convert FASTQ to BAM and vice versa? I know there are a few that can do BAM to FASTQ, like Picard, but I'm not sure whether FASTQ to BAM is available.

Thanks!

Abhi

bam fastq • 10k views

This doesn't read like a StackExchange question to me. If you want a discussion, why not continue on SEQanswers?


This seems relevant to me; handling large NGS data files is a growing bioinformatics issue.

13.1 years ago

I think a bgzipped FASTQ file will always be smaller than a BAM file, as the BAM file also contains the positions of the alignments.

See also:

http://bioinformatics.oxfordjournals.org/content/early/2011/01/19/bioinformatics.btr014.abstract

Compression of genomic sequences in FASTQ format

and

http://genome.cshlp.org/content/early/2011/01/18/gr.114819.110

Efficient storage of high throughput sequencing data using reference-based compression
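The size question is easy to check on your own data rather than take on trust. Below is a minimal sketch of that measurement, under stated assumptions: Python's standard `gzip` module stands in for bgzip (BGZF is a gzip-compatible block format, so the sizes will differ somewhat), the FASTQ record is made up for illustration, and `percent_saved` is a hypothetical helper name.

```python
import gzip


def percent_saved(original_bytes: int, stored_bytes: int) -> float:
    """Return the storage saving as a percentage of the original size."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return 100.0 * (1.0 - stored_bytes / original_bytes)


# Toy FASTQ content (made up for illustration, not real sequencing data).
record = b"@read1\nACGTACGTACGTACGTACGT\n+\nIIIIIIIIIIIIIIIIIIII\n"
fastq = record * 1000

# Plain gzip is used here as a stand-in for bgzip.
compressed = gzip.compress(fastq)

saving = percent_saved(len(fastq), len(compressed))
print(f"plain: {len(fastq)} B, gzip: {len(compressed)} B, saved: {saving:.1f}%")
```

Repeating one record compresses far better than real reads do, so the printed figure only demonstrates the calculation. For a real comparison, pass the on-disk sizes of the plain FASTQ, the compressed FASTQ, and the unaligned BAM (for example from `os.path.getsize`) to the same function.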

13.1 years ago
toni ★ 2.2k

Hi Abhi,

Yes, we do this in our team. You can use Picard's 'FastqToSam' utility.

Compared to two FASTQ files (plain, not gzipped as suggested by Pierre), an unaligned BAM file saves 60% to 65% of storage space. It is also practical because you can store useful information (sample, library, run, any useful comments...) in the header if you want to.
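As a rough illustration of the round trip, here is a sketch that only builds the Picard command lines and does not run them. It assumes Picard's classic `KEY=VALUE` argument style (newer releases also accept a dash-prefixed form, so check the tool's `--help` for your version). The jar path, file names, sample and library names, and both helper function names are hypothetical.

```python
import shlex


def fastq_to_sam_cmd(fastq1, output_bam, sample_name,
                     fastq2=None, library_name=None, picard_jar="picard.jar"):
    """Build a Picard FastqToSam command (FASTQ to unaligned BAM)."""
    cmd = ["java", "-jar", picard_jar, "FastqToSam",
           f"FASTQ={fastq1}",
           f"OUTPUT={output_bam}",
           f"SAMPLE_NAME={sample_name}"]
    if fastq2 is not None:
        # Second file of a paired-end run.
        cmd.append(f"FASTQ2={fastq2}")
    if library_name is not None:
        # Written to the read group in the BAM header.
        cmd.append(f"LIBRARY_NAME={library_name}")
    return cmd


def sam_to_fastq_cmd(input_bam, fastq1, fastq2=None, picard_jar="picard.jar"):
    """Build a Picard SamToFastq command (BAM back to FASTQ)."""
    cmd = ["java", "-jar", picard_jar, "SamToFastq",
           f"INPUT={input_bam}",
           f"FASTQ={fastq1}"]
    if fastq2 is not None:
        cmd.append(f"SECOND_END_FASTQ={fastq2}")
    return cmd


# Hypothetical file and sample names, for illustration only.
cmd = fastq_to_sam_cmd("sample_R1.fastq", "sample_unaligned.bam", "sample01",
                       fastq2="sample_R2.fastq", library_name="lib01")
back = sam_to_fastq_cmd("sample_unaligned.bam", "out_R1.fastq", "out_R2.fastq")

print(shlex.join(cmd))
print(shlex.join(back))
```

To actually run a command, pass the list to `subprocess.run(cmd, check=True)`. Keeping the arguments as a list avoids shell quoting problems with unusual file names.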
