Question

How to map PacBio CCS FASTQ reads


3.1 years ago

g744695539 • 0

I have a PacBio CCS FASTQ file like this:

I want to map the reads to a genome. This is my command and its output (image not included).

How can I solve this? Is this FASTQ file correct? Thanks.

minimap2 Pacbio • 1.6k views

updated 3.1 years ago by Billy Rowell • written 3.1 years ago by g744695539

score 1 · Answer 1 · 2021-10-19

It might pay to recreate the CCS reads, because the qualities don't look right. Are the quality strings the same length for every read, even though the reads themselves are different lengths? (Regenerating the reads is what I would do.)
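If it helps, here is a minimal Python sketch of that check. It assumes plain 4-line FASTQ records (no wrapped sequence lines), and the function names and the tiny example records are made up for illustration, not taken from your data.

```python
from itertools import islice

def fastq_records(lines):
    """Yield (name, sequence, quality) tuples from 4-line FASTQ records."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, 4))
        if len(chunk) < 4:
            break  # end of input (or a truncated final record)
        header, seq, _plus, qual = (s.rstrip("\n") for s in chunk)
        yield header[1:], seq, qual  # drop the leading '@'

def length_mismatches(lines):
    """Return names of reads whose quality length differs from the sequence length."""
    return [name for name, seq, qual in fastq_records(lines)
            if len(seq) != len(qual)]

# Hypothetical records: read2 has 8 bases but only 4 quality values.
example = [
    "@read1\n", "ACGT\n", "+\n", "IIII\n",
    "@read2\n", "ACGTACGT\n", "+\n", "IIII\n",
]
print(length_mismatches(example))  # ['read2']
```

On a real file you could pass an open file handle instead of the list, e.g. `length_mismatches(open("reads.fastq"))`, where `reads.fastq` is a placeholder name. If nearly every read is reported, the file was probably written incorrectly upstream.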

Try running FastQC and MultiQC on the data.

If you want to keep using these reads and don't care about the qualities, you can convert them to FASTA using e.g. seqtk (https://github.com/lh3/seqtk) and map them using e.g. blasr or minimap2, as you've done.
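With seqtk, the README lists `seqtk seq -a in.fq.gz > out.fa` for the FASTQ to FASTA conversion. If the tool has trouble with the broken quality lines, a rough Python equivalent is sketched below. It assumes 4-line records, and the function name is just for illustration.

```python
def fastq_to_fasta(lines):
    """Convert 4-line FASTQ records to FASTA lines, discarding the qualities."""
    out = []
    for i, line in enumerate(lines):
        line = line.rstrip("\n")
        if i % 4 == 0:
            out.append(">" + line[1:])  # '@name' becomes '>name'
        elif i % 4 == 1:
            out.append(line)            # sequence line is kept as-is
        # lines 2 ('+') and 3 (quality) of each record are skipped
    return out

# Hypothetical record whose quality string is too short; it still converts.
print(fastq_to_fasta(["@r1\n", "ACGTACGT\n", "+\n", "IIII\n"]))  # ['>r1', 'ACGTACGT']
```

Because only the header and sequence lines are read, a quality string of the wrong length doesn't matter here, as long as each record still occupies exactly four lines.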

score 0 · Answer 2 · 2021-10-19

Something is definitely odd about the FASTQ file. The length of the quality strings doesn't match the length of the sequence strings. The quality string length is constant even though the sequence lengths vary. The read names are also a little suspicious. Typically, CCS read names have the format moviename_date_time/zmw. These names were produced by some non-standard workflow.
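To get a quick count of how many read names deviate from the usual layout, something like the Python sketch below could work. The regular expression is only my assumption about what a typical name looks like (a movie name such as `m64011_190830_220126`, then `/`, a ZMW number, and an optional suffix such as `/ccs`); the example names are invented, so adjust the pattern to whatever your instrument and software version actually produce.

```python
import re

# Assumed layout: <movie>/<zmw>[/suffix], where the movie name looks like
# m<instrument>_<6-digit date>_<6-digit time>. This is an illustrative
# pattern, not an official specification.
NAME_RE = re.compile(r"^m[^/_]+_\d{6}_\d{6}[^/]*/\d+(/.*)?$")

def nonstandard_names(names):
    """Return the read names that do not match the assumed CCS naming layout."""
    return [n for n in names if not NAME_RE.match(n)]

# Invented example names.
names = ["m64011_190830_220126/1/ccs", "read_1", "m64011_190830_220126"]
print(nonstandard_names(names))  # ['read_1', 'm64011_190830_220126']
```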