Question

Which PacBio output files should I use for downstream analysis in R?

1

3.2 years ago

eimanpharmacist ▴ 20

I just received PacBio sequences of the 16S gene from the core facility. After downloading, I have 5 files for each sample, with the following suffixes:

.ccs.bam
longest.bam
scraps.bam
subreads.bam
whitelist

From readings, I believe that I should start with the ccs.bam files (demultiplexed aligned sequences) and convert them into FASTQ files for downstream analysis in R using the DADA2 package. Is that correct?

I installed bam2fastq on my Linux machine, and I am writing this post to ask for support, including links and materials on the protocol for what to do next.

Thanks!

R PacBio • 2.7k views

ADD COMMENT • link updated 20 months ago by Ram 44k • written 3.2 years ago by eimanpharmacist ▴ 20

1

I strongly recommend reading up on the PacBio file formats or asking your core facility. To get help with further analysis, I suggest including details about what you have tried so far, what you have, and what you want to obtain from your data. Such analyses are not a trivial task.

ADD REPLY • link 3.2 years ago by Buffo ★ 2.4k

1

I would use samtools to convert the ccs.bam to fastq:

$ samtools fastq reads.ccs.bam >reads.ccs.fastq

and then feed that fastq file into DADA2 like the example here

ADD REPLY • link 3.2 years ago by gconcepcion ▴ 410

0

I have another question: I am converting the ccs.bam files into FASTQ files one by one, and I am wondering whether there is a way to do this in batch, since I have lots of sequences.

ADD REPLY • link 3.2 years ago by eimanpharmacist ▴ 20

0

"From readings, I believe"

There is the mistake. Beliefs are for the church, not for analysis. Joking aside, ask the core facility what the files are and how exactly they were generated. Ideally they have someone experienced in this analysis whom you can catch for a Zoom call to explain how to get started. As for doing this in batch, google loops in bash (e.g. the for loop) and spend some quality time on Unix basics.
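For the batch part, a minimal sketch of such a for loop might look like the following. It simply wraps the `samtools fastq` call suggested earlier in the thread. The function name `convert_ccs_bams`, the default output directory `fastq`, and the assumption that every input file ends in `.ccs.bam` are illustrative choices, not something confirmed by the core facility's naming scheme, so adjust them to your data.

```shell
# Batch-convert every *.ccs.bam file in a directory to FASTQ.
# Assumptions (hypothetical, not from the thread): input files end in
# ".ccs.bam", samtools is on PATH, and the function name is made up.

convert_ccs_bams() {
    local in_dir="${1:-.}"       # directory holding the *.ccs.bam files
    local out_dir="${2:-fastq}"  # directory that receives the FASTQ files
    local bam sample

    mkdir -p "$out_dir" || return 1

    for bam in "$in_dir"/*.ccs.bam; do
        # If no file matches, the pattern is left unexpanded; skip it.
        [ -e "$bam" ] || continue

        # Strip the directory and the ".ccs.bam" suffix to get the sample name.
        sample="$(basename "$bam" .ccs.bam)"

        echo "Converting $bam -> $out_dir/$sample.ccs.fastq" >&2
        samtools fastq "$bam" > "$out_dir/$sample.ccs.fastq" || return 1
    done
}

# Example call (paths are placeholders):
#   convert_ccs_bams /path/to/bam_files fastq
```

Writing one FASTQ per sample, named after the BAM file, keeps the samples separate, which is what a per-sample workflow such as DADA2 would normally expect as input.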

ADD REPLY • link 3.2 years ago by ATpoint 85k