I have a pair of read files, each a mixed pool of linked reads (TELL-Seq) and standard Illumina short reads. Can anyone suggest an assembler that can handle this?
13 months ago
Vijith ▴ 90

We are working on a genome assembly. Earlier, we performed an assembly using the TELL-Seq linked-read technology, but the result was unsatisfactory because of insufficient coverage. The assembler was Universal Sequencing Technology's Turing assembler, accessed through the TELL-Link pipeline.

Recently, we re-sequenced the stored sample on a NovaSeq X Plus. These new reads are not linked reads.

As an experiment, we concatenated the read files from the two sequencing runs: R1 from the linked-read run was combined with R1 from the recent run, and likewise for R2. I believe this could be termed a hybrid assembly. For the assembly itself, we used CLC Genomics Workbench (version 23.0.5) with its fast method (the fast algorithmic approach). Unfortunately, it failed with an error report.
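For anyone wanting to reproduce the pooling step described above, here is a minimal sketch, assuming plain four-line FASTQ records and that each run's R1 and R2 files list mates in the same order. The function names (`pool_pairs`, `count_reads`, and so on) and the file names in the usage note are hypothetical; this is not taken from TELL-Link or CLC. It checks that R1 and R2 hold the same number of reads before and after pooling, because a mismatch would silently break the pairing. It does not carry over any TELL-Seq barcode information.

```python
import gzip
import shutil
from pathlib import Path


def open_maybe_gzip(path, mode="rt"):
    """Open a FASTQ file in text mode, handling .gz compression by extension."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, mode)
    return open(path, mode)


def count_reads(path):
    """Count records, assuming the common four-lines-per-record FASTQ layout."""
    with open_maybe_gzip(path) as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def pool_fastq(inputs, output):
    """Append the input FASTQ files, in the given order, into one output file."""
    with open_maybe_gzip(output, "wt") as out:
        for path in inputs:
            with open_maybe_gzip(path) as handle:
                shutil.copyfileobj(handle, out)


def pool_pairs(r1_inputs, r2_inputs, r1_out, r2_out):
    """Pool R1 and R2 files in the same order and check the mates still line up."""
    if len(r1_inputs) != len(r2_inputs):
        raise ValueError("need the same number of R1 and R2 files")
    for r1, r2 in zip(r1_inputs, r2_inputs):
        n1, n2 = count_reads(r1), count_reads(r2)
        if n1 != n2:
            raise ValueError(f"{r1} has {n1} reads but {r2} has {n2}")
    pool_fastq(r1_inputs, r1_out)
    pool_fastq(r2_inputs, r2_out)
    total_r1, total_r2 = count_reads(r1_out), count_reads(r2_out)
    if total_r1 != total_r2:
        raise ValueError("pooled R1 and R2 read counts differ")
    return total_r1
```

A call such as `pool_pairs(["tell_R1.fastq.gz", "nova_R1.fastq.gz"], ["tell_R2.fastq.gz", "nova_R2.fastq.gz"], "pool_R1.fastq.gz", "pool_R2.fastq.gz")` would return the number of pooled read pairs. Keeping the two runs in the same order for R1 and R2 is what preserves the pairing.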

Error report produced by clc Genomics

It appears that the data are not supported? I was doubtful about this hybrid approach from the beginning, but I was instructed to go ahead with the assembly using the merged R1 and R2 files. My concern was that previously, when the linked reads alone were assembled with the TELL-Link pipeline, an index file containing all the unique barcodes (44 million) was supplied to the assembly command in addition to the read files.

Before starting this merged-read assembly, I asked Universal Sequencing Technology about the prospects of a successful assembly with the merged reads. They replied that CLC should be able to do the job, and that otherwise they have an algorithm to perform the task. From what I have read, however, CLC Genomics does offer hybrid assembly, but there "hybrid" means assembling "long reads" and then polishing with "short reads".

NGS clc-genomics illumina WGS • 1.3k views

Hi Vijith,

I am going to respond with a few general comments. As it stands we cannot help you with your request.

  • For requests about commercial software products such as CLC Bio, please refer to the vendor's support channel. We cannot serve as an unpaid help desk for a commercial closed-source product (at least not unless they pay us :).

  • Please do not post screenshots (especially not ones taken with your phone camera); you can almost always copy and paste the text. In your first screenshot, all the relevant information is hidden. Even the CLC support hotline will not like that.


Michael, I am extremely sorry about this post. In reality, I wasn't expecting a technical solution for CLC Genomics. I was hoping to learn of other ways to tackle the problem, that is, whether there are other pipelines or tools. My title probably did not convey what I was actually looking for. My sincere apologies also for posting unclear images/screenshots.


Hi, you can edit your question to better convey what you are looking for. If you are looking for an alternative way to assemble TELL-Seq data, state this in the title. Then there is also no need for these screenshots.

Hope this helps, Michael


If you are completely ignoring the special barcodes from TELL-Seq, why not simply try to assemble the data in CLC as if these were two technical sequencing replicates? How long are your TELL-Seq reads as shown in the post?
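The read-length question can be answered from the FASTQ file itself rather than from a screenshot. A minimal sketch, assuming four-line FASTQ records; the function name `read_length_counts` and the `max_reads` sampling cap are hypothetical choices for illustration:

```python
import gzip
from collections import Counter


def read_length_counts(path, max_reads=100_000):
    """Tabulate sequence lengths for the first max_reads records of a FASTQ file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i // 4 >= max_reads:
                break  # stop once the sample cap is reached
            if i % 4 == 1:  # the second line of each record is the sequence
                counts[len(line.rstrip("\r\n"))] += 1
    return counts
```

Printing `sorted(read_length_counts(path).items())` for each R1 and R2 file gives a length table that can be pasted into the post as text.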


Hi, welcome. Some things: Please use a meaningful title, not the whole question as the title. Use meaningful tags so experts can find your question. No images, please: copy the content, paste it, and embed it via the code option (10101 button). The same goes for code and error messages. Right now your question is barely readable.
