7.9 years ago
vahapel
Hi All,
We received “.bax.h5” and “.fastq” files from a PacBio sequencing service, and I also converted the “.bax.h5” files to “.fasta” and “.fastq” using the pls2fasta tool. However, the read counts differ between the service-provided “.fastq” files and the ones I converted, even after I adjusted the read filtering with the -trimByRegion and -minReadScore options. Is there a tool that reports the Phred score values, or the cut-off value that was applied, for a fastq file generated from PacBio data?
Many thanks in advance for all help !
I am not sure whether you are asking for a tool to check the quality of PacBio fastq files, or how to reproduce with your own filtering the results you received from your sequence provider.
It is very likely that your sequence provider used SMRTportal (v.2.x or 3.x) to produce the filtered fastq files. The easiest thing would be to ask them which filter parameters they use for producing these files.
Your fastq files may still have the SMRTbell adapters. You could scan/trim them using removesmartbell.sh from BBMap like this:
removesmartbell.sh in=<input> out=<output> split=t
Hi genomax,
Thank you for your answer, I think the problem is solved. First, I looked at the sequence length distribution and the RQ value (the read quality reported in the fastq header) of the service-provided ".fastq" files, and noticed that the minimum sequence length was 35 and the quality values ranged from 75 to 80. After converting “.bax.h5” to “.fastq” with the SMRT Portal script "bash5tools.py" (with the options --minReadScore 0.75 --minLength 35), I get exactly the same read numbers as in the service-provided ".fastq" files.
Thank you again !
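For anyone wanting to do the same comparison, here is a minimal Python sketch of the kind of check described above: it counts reads and reports the length range and per-read mean Phred range of a fastq file, so two files can be compared side by side. It is not part of SMRT Portal or BBMap. The function names `read_fastq` and `summarize` are made up for illustration, and it assumes four-line fastq records (no wrapped sequences) with Phred+33 quality encoding. It does not parse the RQ value from the header, since the header layout may differ between tools.

```python
from statistics import mean


def read_fastq(handle):
    """Yield (header, sequence, quality) tuples from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            break  # end of file (or a blank line) stops the iteration
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\r\n")
        yield header, seq, qual


def summarize(handle, offset=33):
    """Return read count, length range, and range of per-read mean Phred scores.

    offset=33 assumes Phred+33 (Sanger) quality encoding.
    """
    lengths = []
    mean_quals = []
    for _, seq, qual in read_fastq(handle):
        lengths.append(len(seq))
        if qual:
            mean_quals.append(mean(ord(c) - offset for c in qual))
    if not lengths:
        return {"reads": 0}
    return {
        "reads": len(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "min_mean_phred": min(mean_quals) if mean_quals else None,
        "max_mean_phred": max(mean_quals) if mean_quals else None,
    }


# Example usage (the file name is a placeholder):
# with open("reads.fastq") as fh:
#     print(summarize(fh))
```

Running this on both the service-provided file and the converted file should show whether the read counts and minimum lengths agree, which hints at the length cut-off that was applied.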