How to clean PacBio reads
2.7 years ago
Matteo Ungaro ▴ 100

Hi all,

I just got PacBio reads for several human individuals. Some of the files seem to match what their name says (e.g. HiFi), whereas others seem to be CLR. My lab colleagues told me that, if that is the case, I will need to manually filter the HiFi sequences out of those CLR reads.

Has anyone come across this issue before? I was wondering whether there is a simple way to do it. Sorry for asking, but I'm quite new to the field. Thanks in advance.

PacBio HiFi CLR • 2.1k views
2.7 years ago
Billy Rowell ▴ 330

It sounds like you have a reads.bam, which is a format that includes one representative sequence for every productive ZMW, regardless of whether this ZMW is of HiFi quality. You can read more about the reads.bam format in the documentation for ccs here. The short version is that you can use the simple tool extracthifi to filter reads.bam down to only the HiFi reads.

extracthifi input.reads.bam output.hifi_reads.bam
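
For anyone curious what that filter amounts to: HiFi reads are conventionally those with a predicted accuracy of at least Q20, and ccs records the predicted accuracy in the `rq` tag of each BAM record. Below is a minimal Python sketch of the same idea working on SAM text lines (for example, the output of `samtools view -h`). The function names and the 0.99 default are illustrative assumptions, not extracthifi's actual implementation.

```python
from typing import Iterable, Iterator, Optional

HIFI_MIN_RQ = 0.99  # assumed cutoff: Q20 predicted read accuracy


def read_quality(sam_line: str) -> Optional[float]:
    """Return the value of the rq:f: tag on a SAM record, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    for tag in fields[11:]:  # optional tags start at column 12
        if tag.startswith("rq:f:"):
            return float(tag[5:])
    return None


def keep_hifi(sam_lines: Iterable[str], min_rq: float = HIFI_MIN_RQ) -> Iterator[str]:
    """Pass header lines through; keep only records with rq >= min_rq."""
    for line in sam_lines:
        if line.startswith("@"):  # SAM header line
            yield line
            continue
        rq = read_quality(line)
        if rq is not None and rq >= min_rq:
            yield line
```

In practice extracthifi does this in one step on the BAM itself, so the sketch is only meant to show what is being selected.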

If you have any more questions, the best way to get an answer is to email support@pacb.com.


Thanks a lot for the clarification! That's exactly the case; in fact, I have access to both the reads.bam and the fastq.gz.

However, not being aware of this procedure, I downloaded the fastq.gz file. Is there a workaround for this format, or will I need to download the BAM file?

I'm asking because the next step will be running hifiasm with long-range Hi-C data to fully phase these sequences, and I'm not sure how (or whether) the tool accepts BAM files.

Thanks again,

Matteo


It's really hard to say without looking at your files.

If you reach out to support@pacb.com, they can try to help answer that question.
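
In the meantime, one rough way to inspect the fastq.gz yourself is to estimate each read's accuracy from its base qualities and count how many reads would clear a Q20-like threshold. This is only a sketch under two assumptions that nothing in this thread confirms: that the base qualities in the FASTQ are meaningful, and that an accuracy computed from the mean per-base error probability tracks the `rq` value stored in reads.bam. The function names and the 0.99 default are hypothetical.

```python
import gzip
from typing import Iterator, Tuple


def fastq_records(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from a gzipped 4-line FASTQ."""
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:  # end of file
                break
            seq = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line
            qual = handle.readline().rstrip("\n")
            yield header, seq, qual


def estimated_accuracy(qual: str, offset: int = 33) -> float:
    """1 minus the mean per-base error probability implied by Phred+33 scores."""
    if not qual:
        return 0.0
    errors = [10 ** (-(ord(c) - offset) / 10) for c in qual]
    return 1.0 - sum(errors) / len(errors)


def summarize(path: str, min_accuracy: float = 0.99) -> Tuple[int, int]:
    """Return (reads at or above min_accuracy, total reads)."""
    passing = total = 0
    for _, _, qual in fastq_records(path):
        total += 1
        if estimated_accuracy(qual) >= min_accuracy:
            passing += 1
    return passing, total
```

If a large share of reads falls below the threshold, the file probably holds more than HiFi reads, and filtering reads.bam with extracthifi is the safer route. As far as I know hifiasm takes FASTQ or FASTA input, so a filtered BAM would need converting first (for example with `samtools fastq`).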
