How to deeply sequence long inserts
12 months ago
Ryan • 0

I have a DNA library encoding scFv antibody genes that consist of a VH gene (~380 bp) + peptide linker (54 bp) + VL gene (~380 bp), and the library contains about 1M unique antibodies. We've performed some iterative selections on the library such that the final sample has an expected diversity of about 500-5K unique antibodies. However, we want to sequence each round of selection, starting with the unselected diverse library, and use the enrichment of unique sequences across the rounds of selection to inform some future experiments.

Previously, our libraries consisted of VH genes only and 2x250 bp NovaSeq runs worked really nicely, giving us the coverage and depth that we needed especially in the early rounds of selection when diversity is still high. However, this new library contains inserts of about 800-900 bp and I don't know how to go about sequencing it.
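To make the size problem concrete, here is a minimal sketch of the paired-end arithmetic. It assumes the reads start at the two ends of the insert and ignores primers and adapters; `paired_end_gap` is a hypothetical helper name, not part of any sequencing toolkit.

```python
def paired_end_gap(insert_len: int, read_len: int) -> int:
    """Return the number of insert bases covered by neither read.

    A negative value means the two reads overlap by that many bases.
    Assumes both reads start at the insert ends (no primers/adapters).
    """
    return insert_len - 2 * read_len


if __name__ == "__main__":
    # ~380 bp is the VH-only case; 800-900 bp is the scFv case from the post.
    for insert_len in (380, 800, 900):
        for read_len in (250, 300):
            gap = paired_end_gap(insert_len, read_len)
            status = f"{-gap} bp overlap" if gap < 0 else f"{gap} bp uncovered"
            print(f"{insert_len} bp insert, 2x{read_len}: {status}")
```

Under these assumptions the VH-only inserts overlap comfortably, while the 800-900 bp inserts leave a few hundred bases in the middle unsequenced with either read length.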

The read depth is super important for calculating the fold-enrichment during selection. The linker sequence between the VH and VL sequences should be invariant, and we thought about sequencing the VH and VL domains separately, but I'm not sure how to work out which VH domain goes with which VL domain afterwards, and that pairing is biologically essential.
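For reference, the kind of fold-enrichment calculation meant here could look roughly like the sketch below. It is only an illustration: the function name `fold_enrichment`, the use of a pseudocount, and the log2 scale are assumptions, not a description of an established pipeline.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def fold_enrichment(before: Iterable[str], after: Iterable[str],
                    pseudocount: float = 1.0) -> Dict[str, float]:
    """Map each clone to log2(frequency after / frequency before).

    `before` and `after` hold one clone identifier per read (for example
    a full-length scFv sequence). The pseudocount (an assumption of this
    sketch) keeps clones seen in only one round from dividing by zero.
    """
    c_before, c_after = Counter(before), Counter(after)
    clones = set(c_before) | set(c_after)
    # Normalise by total reads so rounds with different depth are comparable.
    n_before = sum(c_before.values()) + pseudocount * len(clones)
    n_after = sum(c_after.values()) + pseudocount * len(clones)
    result = {}
    for clone in clones:
        f_before = (c_before[clone] + pseudocount) / n_before
        f_after = (c_after[clone] + pseudocount) / n_after
        result[clone] = math.log2(f_after / f_before)
    return result


if __name__ == "__main__":
    round0 = ["cloneA"] * 1 + ["cloneB"] * 9   # toy data, not real counts
    round1 = ["cloneA"] * 9 + ["cloneB"] * 1
    for clone, lfc in sorted(fold_enrichment(round0, round1).items()):
        print(clone, round(lfc, 2))
```

This is why depth matters in the early rounds: with ~1M unique clones, most clones have very few reads, and the pseudocount then dominates the ratio.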

Does anyone have any suggestions to balance coverage with read depth? Thank you in advance!!

Long-read phage NGS • 887 views

"Previously, our libraries consisted of VH genes only and 2x300 bp NovaSeq runs worked really nicely,"

I had no idea the platform was capable of 2x300. I will recommend that JGI start doing that.


Sorry - total lapse of cerebral function. 2x250 on the NovaSeq. 2x300 on the MiSeq. We're not doing anything fancy there!


Merging paired reads is a good idea. You then get nice, long reads, and as long as you have enough coverage, you can merge all of them.
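As a rough illustration of what merging means, here is a toy overlap merger. It is not how BBMerge works internally (real mergers use base qualities and more careful scoring); `merge_pair` and its parameters are hypothetical names for this sketch only. Note that it can only succeed when the two reads actually overlap, which is the case for ~380 bp inserts but not for 800-900 bp inserts with 2x250 reads.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(r1: str, r2: str, min_overlap: int = 10,
               max_mismatch_frac: float = 0.1) -> Optional[str]:
    """Merge R1 with the reverse complement of R2 at the longest acceptable overlap.

    Returns the merged sequence, or None when no overlap of at least
    `min_overlap` bases passes the mismatch threshold.
    """
    r2rc = reverse_complement(r2)
    # Try the longest candidate overlap first, then shrink.
    for olen in range(min(len(r1), len(r2rc)), min_overlap - 1, -1):
        tail, head = r1[-olen:], r2rc[:olen]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            # Keep R1 in full and append the non-overlapping part of R2.
            return r1 + r2rc[olen:]
    return None


if __name__ == "__main__":
    fragment = "CGTTCGGTCTGCCTGACCGTAGGATTACGCTAGCATCGTA"  # toy 40 bp insert
    read1 = fragment[:25]
    read2 = reverse_complement(fragment[15:])
    print(merge_pair(read1, read2, max_mismatch_frac=0.0) == fragment)
```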

"However, this new library contains inserts of about 800-900 bp and I don't know how to go about sequencing it."

Tadpole can extend reads (via its extendleft and extendright flags), and BBMerge has comparable extension options of its own. Extending often lets the reads overlap, so you can merge distant read pairs. But I am wondering about this part of your post:

"Does anyone have any suggestions to balance coverage with read depth? Thank you in advance!!"

It does not seem to be related to your initial question.

12 months ago
sure ▴ 110

Linked-Read Sequencing: This approach, offered by platforms like 10x Genomics, provides a way to reconstruct long sequences without physically sequencing long fragments. Short reads are generated from longer DNA molecules that are barcoded in such a way that the reads can be computationally reassembled into long sequences. It allows for phasing and structural variant detection that would be difficult with traditional short-read sequencing. The method still involves an initial fragmentation step.
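The core idea can be sketched in a few lines: reads that carry the same barcode are assumed to come from the same original molecule, so grouping by barcode brings, for example, a VH read and a VL read from one scFv back together. The `(barcode, sequence)` input format and the function name `group_by_barcode` are assumptions for illustration; real linked-read pipelines do considerably more (barcode error correction, assembly within each group).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_barcode(reads: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect read sequences that share a molecule barcode.

    `reads` yields (barcode, sequence) tuples; the result maps each
    barcode to its sequences in input order.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)
    return dict(groups)


if __name__ == "__main__":
    # Toy reads with made-up barcodes and placeholder sequences.
    toy_reads = [
        ("BC01", "VH_read_1"),
        ("BC02", "VH_read_2"),
        ("BC01", "VL_read_1"),
        ("BC02", "VL_read_2"),
    ]
    for barcode, seqs in group_by_barcode(toy_reads).items():
        print(barcode, seqs)
```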

Other solutions such as ONT have relatively higher error rates, while PacBio HiFi gives good-quality long reads but at significant cost.


10x Genomics has not offered linked-read kits for the last 3-4 years, so that is not an option.

Illumina now offers a long read technology that is similar to the 10x one mentioned above: https://sapac.illumina.com/science/technology/next-generation-sequencing/long-read-sequencing.html

AFAIK it is only supported for human samples for now and may be pricey for this application.
