Hi folks,
I received PacBio sequences produced by a Sequel system and wish to use them to scaffold contigs assembled from Illumina data.
I found that the sequence IDs in the subreads file, which was extracted from the subreads.bam file, look like this:
"
m54196_171108_070652/6357972/7419_7633 .... ....
m54196_171108_070652/6357972/14702_18160
m54196_171108_070652/6357972/60591_64120
m54196_171108_070652/6357972/64172_66716
"
I guess these sequences are fragments of the same molecule, sequenced in the same ZMW hole. Is this correct? If so, should I join them together and use them for genome scaffolding?
Kind Regards,
Elzed
Thank you very much, Istvan. I agree with you now that I have read up on and started to understand the PacBio sequencing terminology: http://files.pacb.com/software/smrtanalysis/2.2.0/doc/smrtportal/help/!SSL!/Webhelp/Portal_PacBio_Glossary.htm
I will go for CCS, even though, judging by the sequence IDs, not many molecules were sequenced multiple times.
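
In case it helps anyone checking the same thing, here is a minimal sketch in Python for counting how many subreads each ZMW produced. It assumes the IDs follow the movie/ZMW/start_end pattern shown above, where start and end are coordinates within the polymerase read. The function names (parse_subread_id, subreads_per_zmw) are my own and not part of any PacBio tool, and the example list simply reuses the four IDs from the question.

```python
from collections import defaultdict


def parse_subread_id(read_id):
    """Split an ID of the form 'movie/zmw/start_end' into its parts.

    Assumption: anything after the first whitespace is not part of the ID.
    """
    name = read_id.split()[0]
    movie, zmw, span = name.split("/")
    start, end = span.split("_")
    return movie, int(zmw), int(start), int(end)


def subreads_per_zmw(read_ids):
    """Group subread coordinates by (movie, zmw), sorted by start position."""
    groups = defaultdict(list)
    for read_id in read_ids:
        movie, zmw, start, end = parse_subread_id(read_id)
        groups[(movie, zmw)].append((start, end))
    return {key: sorted(spans) for key, spans in groups.items()}


# The four IDs from the question, all from the same ZMW.
ids = [
    "m54196_171108_070652/6357972/7419_7633",
    "m54196_171108_070652/6357972/14702_18160",
    "m54196_171108_070652/6357972/60591_64120",
    "m54196_171108_070652/6357972/64172_66716",
]

groups = subreads_per_zmw(ids)
for (movie, zmw), spans in groups.items():
    # One line per ZMW: movie name, hole number, subread count, coordinates
    print(movie, zmw, len(spans), spans)
```

Running this over all IDs in the subreads file (for example, the header lines of a FASTA export) gives a rough idea of how many ZMWs have more than one subread, which is what matters for CCS.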