I have a PacBio library that I have converted from BAM to fastq.gz files. Now that I want to use the reads for correction, trimming and assembly, I cannot figure out whether I should use both the subreads and scraps files or only the former. My rationale for leaving out the latter is the description provided in the SMRT Link User Guide (from the PacBio website), which says: "Subreads that contain information such as double-adapter inserts or single-molecule artifacts are not used in secondary analysis, and are excluded from this file and placed in scraps.bam."
So if I don't use the scraps file in an assembler (e.g. Canu) or a PacBio correction tool (e.g. LoRDEC), am I doing it correctly, or should I use both the subreads and scraps files? Please let me know if I am doing it wrong.
Thanks in advance.
Thank you for the explanation and the link.