PacBio Fastq file



7.6 years ago

José ▴ 10

Hello,

Last week I received PacBio genome data from a grass, and I have a few questions about it.

- The provider gave me a single file containing only the filtered subreads (filter: single pass, adapters removed) in FASTQ format. Is that enough, or should they give me more data, perhaps the .h5 files?
- Every base in the subread FASTQ has an exclamation mark (the lowest possible value) as its quality score. I don't know why. Is that OK? I ran FastQC and all the data have the same value.

The provider tells me that the data have 85% accuracy. That is fine for PacBio data, but I can't measure it myself.

Thanks,

José

genome sequencing next-gen • 5.5k views

ADD COMMENT • link 7.6 years ago by José ▴ 10


What is it that you want to do with this data?

Grab a copy of the original (*.h5) data files. They are needed for some software and analyses. Also ask your provider to run the RS_ReadsOfInsert workflow, which will give you a consensus sequence for those subreads.

ADD REPLY • link 7.6 years ago by GenoMax 147k


As GenoMax said, you'll want to dig deeper to make sure you are working with a consensus sequence built from the subreads. There are plenty of resources for working from .bax.h5 files:

Pacbio: extract fastq from h5 file based on quality filtering
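On the quality-score question: in the standard Phred+33 FASTQ encoding, "!" is ASCII 33 and decodes to Q0, while 85% accuracy would correspond to roughly Q8. So a file in which every base is "!" probably carries placeholder qualities rather than measured ones, though that is an assumption worth confirming with the provider. Below is a minimal Python sketch for checking this yourself. It assumes a plain four-line-per-record FASTQ, and the function names and example reads are hypothetical, not part of any PacBio tool.

```python
import math
from collections import Counter


def quality_char_counts(fastq_lines):
    """Count quality characters, assuming 4 lines per FASTQ record."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 3:  # the fourth line of each record is the quality string
            counts.update(line.rstrip("\n"))
    return counts


def phred_from_char(ch, offset=33):
    """Decode one quality character (Phred+33 encoding assumed)."""
    return ord(ch) - offset


def phred_from_accuracy(accuracy):
    """Phred score implied by a per-base accuracy, Q = -10 * log10(1 - accuracy)."""
    return -10 * math.log10(1 - accuracy)


if __name__ == "__main__":
    # Two made-up records standing in for a real subread file.
    example = [
        "@read1\n", "ACGT\n", "+\n", "!!!!\n",
        "@read2\n", "GGCA\n", "+\n", "!!!!\n",
    ]
    for ch, n in quality_char_counts(example).items():
        print(repr(ch), "Q%d" % phred_from_char(ch), n)
    print("Q implied by 85%% accuracy: %.1f" % phred_from_accuracy(0.85))
```

To run it on real data, pass an open file handle instead of the example list, for instance `quality_char_counts(open("subreads.fastq"))`, where the file name is only a placeholder.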

Brent Wilson, PhD | Project Scientist | Cofactor Genomics 4044 Clayton Ave. | St. Louis, MO 63110 | tel. 314.531.4647 Catch the latest from Cofactor on our blog.

ADD REPLY • link 7.6 years ago by brent_wilson ▴ 140
