bamToFastq on Sequel subreads.bam
3.8 years ago

Hi,

I got PacBio Sequel data sequenced in CLR mode. I used bedtools bamtofastq on the *subreads.bam files to extract the subreads in FASTQ format. Instead of subreads I got CLR though. Using PacBio's BAM2fastx tools I was able to extract the subreads. I was under the assumption that the subreads.bam files contain subreads and not CLR. Am I wrong or is something fishy either with bamtofastq or the data?

Thanks, Chris

PacBio bam subreads • 2.0k views

I have the same question. Is bedtools bamtofastq appropriate for converting a PacBio BAM file to FASTQ format?

2.1 years ago
Billy Rowell ▴ 330

There are a few points to address here:

1) The primary data type output by a CLR mode sequencing run is the subreads.bam file. For all intents and purposes, CLR reads are subreads.

2) PacBio subreads do not have meaningful base quality scores. Every base quality is set to the ASCII character !, the lowest value on the scale (Phred 0). Since these scores carry no information, there is little point in exporting them to FASTQ. I'd recommend FASTA instead.

3) I'd recommend bam2fasta (as suggested by @christian.dreischer) or samtools fasta to extract the subreads in FASTA format.
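
To illustrate point 2, here is a minimal Python sketch that checks whether the quality strings in a FASTQ export are only the placeholder character ! and, if so, drops them to produce FASTA. The helper names and the two example records (read names and sequences) are made up for illustration and do not come from any real run. On real data I would still use bam2fasta or samtools fasta as recommended above.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical two-record FASTQ, shaped like an export from a subreads.bam.
# The read names and sequences are invented for this example.
EXAMPLE_FASTQ = """@movie/1/0_8
ACGTACGT
+
!!!!!!!!
@movie/1/50_56
GGCATT
+
!!!!!!
"""


def parse_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from simple 4-line FASTQ records.

    Assumes unwrapped records; all lines are held in memory, which is
    fine for a sketch but not for a full Sequel run.
    """
    rows = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(rows) - 3, 4):
        header, seq, _plus, qual = rows[i:i + 4]
        yield header[1:], seq, qual  # strip the leading '@'


def has_placeholder_qualities(qual: str) -> bool:
    """True if every quality character is '!' (Phred 0 in Phred+33)."""
    return len(qual) > 0 and set(qual) == {"!"}


def to_fasta(records: Iterable[Tuple[str, str, str]]) -> Iterator[str]:
    """Drop the quality string and emit one FASTA entry per record."""
    for name, seq, _qual in records:
        yield f">{name}\n{seq}"


records = list(parse_fastq(EXAMPLE_FASTQ.splitlines()))
if all(has_placeholder_qualities(q) for _, _, q in records):
    # Qualities carry no information, so FASTA loses nothing.
    fasta_text = "\n".join(to_fasta(records))
    print(fasta_text)
```

If the check passes for your own export, the FASTQ holds no more information than the FASTA would.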


Dear William, thanks a lot for this clarification.
