Dear Members, I'm looking to download third-generation sequencing reads longer than 1000 bp. It would be helpful if anybody could point me to suitable links. Also, I would like to know what the size of K should be when assembling reads from third-generation sequencing technologies. I guess that the K value should be large. Is my guess correct?
You don't normally use kmer-based (or purely kmer-based) assembly for single-molecule sequencing reads; the error rate is too high. Instead, you use all-to-all alignment and consensus.
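To put a rough number on "the error rate is too high": if errors hit each base independently at rate e, the chance that a k-mer contains no error at all is (1 - e)^k. The sketch below assumes that simple independence model, and the 1% and 15% per-base error rates are illustrative assumptions, not measured figures for any particular instrument.

```python
def error_free_kmer_fraction(k: int, error_rate: float) -> float:
    """Probability that a k-mer contains no errors.

    Assumes each base is wrong independently with probability `error_rate`
    (a simplification; real error profiles are not uniform).
    """
    if k < 1:
        raise ValueError("k must be >= 1")
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be in [0, 1]")
    return (1.0 - error_rate) ** k


if __name__ == "__main__":
    # Illustrative (assumed) error rates: 1% and 15% per base.
    for error_rate in (0.01, 0.15):
        for k in (15, 31, 127):
            frac = error_free_kmer_fraction(k, error_rate)
            print(f"error_rate={error_rate:.2f} k={k:>3} error-free fraction={frac:.3g}")
```

Under these assumptions, well under 1% of 31-mers come through intact at a 15% error rate, so two overlapping reads share very few exact k-mers, and larger K only makes it worse.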
Dear Brian, Does "all-to-all alignment and consensus" mean the Overlap-Layout-Consensus (OLC) approach?
Dear Brian, I have read that k-mers have many applications in bioinformatics analysis (https://en.wikipedia.org/wiki/K-mer), so my question about k-mer length is not limited to the assembly problem. Generally, for other bioinformatics applications, what should the k-mer length be for the longer reads generated by third-generation sequencing machines such as nanopore and PacBio? Can we go for a k-mer length above 520?
As I said, kmers are unsuitable for long single-molecule read assembly. Other approaches like OLC or string graphs are used. You're certainly welcome to try k=520 with long reads and see what happens. But typically people use overlap-based assemblers like Falcon (string graph) or Celera (OLC).
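As a rough illustration of the overlap step in OLC, the sketch below compares every read against every other read and reports the longest suffix-prefix overlap that stays within a mismatch budget. It is a toy under loud assumptions: it handles substitutions only (long-read errors include many insertions and deletions, which need a real aligner), the reads are made up, and the names `best_overlap`, `all_vs_all` and `max_mismatch_rate` are hypothetical. It is not how Falcon or Celera work internally.

```python
from itertools import permutations


def best_overlap(a: str, b: str, min_len: int = 5, max_mismatch_rate: float = 0.2) -> int:
    """Length of the longest suffix of `a` that matches a prefix of `b`
    within the allowed mismatch rate (substitutions only); 0 if none."""
    for length in range(min(len(a), len(b)), min_len - 1, -1):
        suffix, prefix = a[-length:], b[:length]
        mismatches = sum(x != y for x, y in zip(suffix, prefix))
        if mismatches <= max_mismatch_rate * length:
            return length
    return 0


def all_vs_all(reads: dict[str, str], **kwargs) -> dict[tuple[str, str], int]:
    """Compare every ordered pair of reads and keep the nonzero overlaps."""
    overlaps = {}
    for (name_a, seq_a), (name_b, seq_b) in permutations(reads.items(), 2):
        length = best_overlap(seq_a, seq_b, **kwargs)
        if length:
            overlaps[(name_a, name_b)] = length
    return overlaps


if __name__ == "__main__":
    # Made-up toy reads; r2 starts where the last six bases of r1 begin.
    reads = {"r1": "ACGTTGCATGCC", "r2": "CATGCCGATAGC"}
    for (name_a, name_b), length in all_vs_all(reads).items():
        print(f"{name_a} -> {name_b}: overlap of {length} bases")
```

Real overlappers typically avoid this brute-force pairwise comparison and align with gaps, but the idea is the same: overlaps are found by error-tolerant alignment instead of exact kmer matches, and layout and consensus are built from them.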
Thanks for your answer. I'll get back to you if I have other questions related to this.