Question

How to deeply sequence long inserts


23 months ago

Ryan • 0

I have a DNA library encoding scFv antibody genes, each consisting of a VH gene (~380 bp) + a linker (54 bp) + a VL gene (~380 bp), and the library contains about 1M unique antibodies. We've performed several iterative selections on the library, so the final sample has an expected diversity of about 500-5K unique antibodies. We want to sequence every round of selection, starting with the unselected diverse library, and use the enrichment of unique sequences across rounds to inform future experiments.

Previously, our libraries consisted of VH genes only and 2x250 bp NovaSeq runs worked really nicely, giving us the coverage and depth that we needed especially in the early rounds of selection when diversity is still high. However, this new library contains inserts of about 800-900 bp and I don't know how to go about sequencing it.

Read depth is very important for calculating fold-enrichment during selection. The linker sequence between the VH and VL sequences should be invariant, so we thought about sequencing the VH and VL domains separately, but I'm not sure how we would then work out which VH domain pairs with which VL domain, and that pairing is biologically essential.
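For readers following along, the fold-enrichment calculation can be sketched roughly as below. This is a minimal illustration, assuming you already have per-round read counts for each unique sequence; the function name `fold_enrichment`, the pseudocount of 1, and the toy counts are all hypothetical and not taken from any particular pipeline.

```python
import math

def fold_enrichment(counts_before, counts_after, pseudocount=1.0):
    """Return {clone: log2 fold-change in frequency} between two rounds.

    Counts are converted to frequencies so that rounds sequenced to
    different depths remain comparable. The pseudocount (an assumption of
    this sketch) avoids division by zero for clones missing in one round.
    """
    clones = set(counts_before) | set(counts_after)
    total_before = sum(counts_before.values()) + pseudocount * len(clones)
    total_after = sum(counts_after.values()) + pseudocount * len(clones)
    result = {}
    for clone in clones:
        f_before = (counts_before.get(clone, 0) + pseudocount) / total_before
        f_after = (counts_after.get(clone, 0) + pseudocount) / total_after
        result[clone] = math.log2(f_after / f_before)
    return result

# Toy read counts for three hypothetical clones in two consecutive rounds.
round0 = {"cloneA": 10, "cloneB": 10, "cloneC": 980}
round1 = {"cloneA": 400, "cloneB": 20, "cloneC": 580}

enrichment = fold_enrichment(round0, round1)
for clone, log2_fc in sorted(enrichment.items(), key=lambda kv: -kv[1]):
    print(f"{clone}\t{log2_fc:+.2f}")
```

The sketch also shows why depth matters in the early rounds: a clone seen only a handful of times in the diverse library has a very noisy starting frequency, so its fold-change is dominated by the pseudocount rather than by real signal.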

Does anyone have any suggestions to balance coverage with read depth? Thank you in advance!!

Long-read phage NGS • 1.6k views

ADD COMMENT • link updated 19 months ago by Ram 45k • written 23 months ago by Ryan • 0


"Previously, our libraries consisted of VH genes only and 2x300 bp NovaSeq runs worked really nicely,"

I had no idea the platform was capable of 2x300. I will recommend JGI starts doing that.

ADD REPLY • link 23 months ago by Brian Bushnell 20k


Sorry - total lapse of cerebral function. 2x250 on the NovaSeq. 2x300 on the MiSeq. We're not doing anything fancy there!

ADD REPLY • link 23 months ago by Ryan • 0


Merging paired reads is a good idea. Then you get nice, long reads... and actually, as long as you have enough coverage, you can just merge all of them.

"However, this new library contains inserts of about 800-900 bp and I don't know how to go about sequencing it."

BBMerge and Tadpole do allow you to extend reads (via the extendleft and extendright flags) which often allows them to overlap, so you can merge distant read pairs. But I am wondering about your post:

"Does anyone have any suggestions to balance coverage with read depth? Thank you in advance!!"

It does not seem to be related to your initial question.
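To make the overlap issue concrete, here is a small Python sketch of naive overlap-based pair merging. It is not how BBMerge or Tadpole work internally; the function names, the 20 bp minimum overlap, and the 5% mismatch tolerance are assumptions chosen for illustration. It ignores quality scores and the read-through case where the insert is shorter than a read.

```python
import random

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def merge_pair(read1, read2, min_overlap=20, max_mismatch_frac=0.05):
    """Merge a read pair if the 3' end of read1 overlaps the start of
    reverse-complemented read2; return None when no overlap is found."""
    r2 = reverse_complement(read2)
    # Try the longest candidate overlap first, then shorter ones.
    for overlap in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        mismatches = sum(a != b for a, b in zip(read1[-overlap:], r2[:overlap]))
        if mismatches <= max_mismatch_frac * overlap:
            return read1 + r2[overlap:]
    return None

def simulate_pair(fragment, read_len):
    """Error-free paired-end reads taken from both ends of a fragment."""
    return fragment[:read_len], reverse_complement(fragment[-read_len:])

rng = random.Random(0)
fragment_short = "".join(rng.choice("ACGT") for _ in range(400))  # VH-only-like insert
fragment_long = "".join(rng.choice("ACGT") for _ in range(850))   # scFv-like insert

merged_short = merge_pair(*simulate_pair(fragment_short, 250))
merged_long = merge_pair(*simulate_pair(fragment_long, 250))

print("400 bp insert merged back to the fragment:", merged_short == fragment_short)
print("850 bp insert failed to merge:", merged_long is None)
```

With 2x250 reads on an 800-900 bp insert, roughly 300-400 bp in the middle is never sequenced, so there is nothing to overlap. Bridging that gap by extension relies on other reads in the dataset covering the middle of the insert.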

ADD REPLY • link 23 months ago by Brian Bushnell 20k

score 0 · Answer 1 · 2023-11-28


23 months ago

sure ▴ 110

Linked-read sequencing: This approach, offered by platforms like 10x Genomics, reconstructs long sequences without physically sequencing long fragments. Short reads are generated from longer DNA molecules that are barcoded so that the reads can be computationally reassembled into long sequences. It allows for phasing and structural variant detection that would be difficult with traditional short-read sequencing. The method still involves an initial fragmentation step.

Other solutions have their own trade-offs: ONT has relatively higher error rates, while PacBio HiFi gives good-quality long reads but at significant cost.

ADD COMMENT • link 23 months ago by sure ▴ 110


Linked-read kits have not been offered by 10x Genomics for the last 3-4 years, so that is not an option.

Illumina now offers a long read technology that is similar to the 10x one mentioned above: https://sapac.illumina.com/science/technology/next-generation-sequencing/long-read-sequencing.html

AFAIK it is only supported for human samples for now, and it may be pricey for this application.

ADD REPLY • link 23 months ago by GenoMax 154k