I am working on a genome assembly. Earlier, we assembled the genome using the TELL-Seq linked-read technology, but the result was unsatisfactory due to insufficient coverage. The assembler was Universal Sequencing Technology's Turing assembler, accessed through the TELL-Link pipeline.
Recently, we re-sequenced the stored sample on a NovaSeq X Plus. These reads are not linked reads.
As an experiment, we combined the read files from the two sequencing runs: the R1 file from the linked-read run was combined with the R1 file from the recent run, and likewise for R2. I hope this can be termed a hybrid assembly. For the assembly, we used CLC Genomics Workbench (version 23.0.5) with its fast method (the fast algorithmic approach). Unfortunately, it failed with an error report.
[Screenshot: error report produced by CLC Genomics Workbench]
It appears that the data are not supported? I was quite doubtful about this hybrid approach from the beginning, but I was instructed to go ahead with the assembly using the merged R1 and R2 files. My apprehension came from the fact that, when the linked reads alone were assembled with the TELL-Link pipeline, an index file containing all the unique barcodes (44 million) was passed to the assembly command in addition to the read files. Before starting this merged-read assembly, I had asked Universal Sequencing Technology about the prospects of a successful assembly with the merged reads, and they replied that CLC should do the job, or else they have an algorithm to perform the task. From what I have read, CLC Genomics Workbench does support hybrid assembly, but there "hybrid" means assembly using "long reads" followed by a polishing step using "short reads".
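For reference, the merging step described above can be sketched roughly as follows. This is only an illustration, not the exact commands that were run: it assumes that "combined" means plain concatenation, that every input file ends with a newline, and that each record spans exactly four lines. The function names are hypothetical. Note that the TELL-Seq barcode (index) file is not used at all here, so any linkage information is dropped.

```python
import gzip
import shutil
from pathlib import Path


def open_maybe_gzip(path, mode="rt"):
    """Open a FASTQ file in text mode, handling .gz compression transparently."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, mode)
    return open(path, mode)


def count_reads(path):
    """Count FASTQ records, assuming the usual four lines per record."""
    with open_maybe_gzip(path) as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4


def merge_fastq(inputs, output):
    """Append the records of each input file, in order, to one output file."""
    with open_maybe_gzip(output, "wt") as out:
        for path in inputs:
            with open_maybe_gzip(path) as handle:
                shutil.copyfileobj(handle, out)


def merge_pairs(r1_inputs, r2_inputs, r1_out, r2_out):
    """Merge R1 with R1 and R2 with R2 so that mates stay in the same order."""
    if len(r1_inputs) != len(r2_inputs):
        raise ValueError("need the same number of R1 and R2 input files")
    # Each R1/R2 pair must hold the same number of reads, otherwise the
    # merged files would fall out of sync.
    for r1, r2 in zip(r1_inputs, r2_inputs):
        if count_reads(r1) != count_reads(r2):
            raise ValueError(f"{r1} and {r2} have different read counts")
    merge_fastq(r1_inputs, r1_out)
    merge_fastq(r2_inputs, r2_out)
    return count_reads(r1_out)
```

With placeholder file names, a call could look like `merge_pairs(["tellseq_R1.fastq.gz", "novaseq_R1.fastq.gz"], ["tellseq_R2.fastq.gz", "novaseq_R2.fastq.gz"], "merged_R1.fastq.gz", "merged_R2.fastq.gz")`. The read-count check is there because an R1/R2 mismatch in either run would leave the merged pairs out of step.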
Hi Vijith,
I am going to respond with a few general comments. As it stands we cannot help you with your request.
For requests about commercial software products like CLC Bio etc., please refer to the vendor's support channel. We cannot serve as an unpaid help desk for a commercial closed-source product (at least not unless they pay us :).
Please do not post screenshots (especially not ones taken with your phone camera); you can almost always copy-paste text. In your first screenshot all the relevant information is hidden. Even the CLC support hotline will not like that.
Michael, I am extremely sorry about this post. In reality, I wasn't expecting a technical solution for CLC Genomics. I was hoping to find some other way of tackling the problem, such as other pipelines or tools. My title probably did not convey what I was actually looking for. My sincere apologies, too, for posting unclear images/screenshots.
Hi, you can edit your question to better convey what you are looking for. If you are looking for an alternative way to assemble TELL-Seq data, state this in the title. Then there is also no need for these screenshots.
Hope this helps, Michael
If you are completely ignoring the special barcodes from TELL-Seq, why not simply try to assemble the data in CLC as if these were two technical sequencing replicates? How long are your TELL-Seq reads as shown in the post?
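One quick way to answer the read-length question is to tabulate sequence lengths straight from the FASTQ file. The following is a minimal sketch, assuming standard four-line FASTQ records; the function name and the `max_reads` sampling limit are made up for illustration.

```python
import gzip
from collections import Counter
from pathlib import Path


def read_length_counts(path, max_reads=100_000):
    """Tally sequence lengths over the first `max_reads` FASTQ records."""
    opener = gzip.open if Path(path).suffix == ".gz" else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no // 4 >= max_reads:
                break
            # The second line of every four-line record holds the bases.
            if line_no % 4 == 1:
                counts[len(line.rstrip("\r\n"))] += 1
    return counts
```

Running this on the TELL-Seq R1 and R2 files and on the NovaSeq files separately would show whether the two data sets have comparable read lengths before treating them as replicates.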
Hi, welcome. Some things: Please use a meaningful title, not the question as title. Use meaningful tags so experts can find your question. No images please; copy the content, paste it, and embed it via the code option (the 10101 button). The same goes for code and error messages. Right now your question is barely readable.