How Can I Increase The Length Of Existing Genome Contigs Using Paired-End Reads?
11.8 years ago
aswad87 • 0

Hypothetical scenario:

I am interested in a genomic region ~100 Kbp long. I have identified (via BLAST of a similar sequence) approximately half of it in the form of contigs from a draft assembly. Some of these contigs overlap, but they are mostly discontiguous with respect to the whole sequence.

Does anyone know of a way to use the paired-end reads to 'fill in the gaps', using the known contigs as a starting point? The idea would be to avoid reassembling the whole genome. The only reference I have is a very distantly related sequence, so it cannot be relied upon for mapping.

Thank you in advance!

genomics assembly contigs scaffolding • 3.5k views
11.8 years ago

This sounds like a basic scaffolding approach. Map the reads using the contigs as your target and count how many read pairs have their two mates mapped to different contigs. Contigs linked by many such pairs are likely neighbours, and the orientation of the mates hints at their relative order and orientation. This only works if the gap between two contigs is smaller than the insert size of the library.
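
To make the counting step concrete, here is a minimal sketch in plain Python that tallies read pairs whose mates map to different contigs, reading alignments in SAM format. It assumes you have already mapped the reads to the contigs with an aligner of your choice. The function name `count_contig_links`, the `min_mapq` threshold of 20, and the example records are all made up for illustration; this is not what any of the scaffolding tools do internally.

```python
from collections import Counter

def count_contig_links(sam_lines, min_mapq=20):
    """Count read pairs whose two mates map to different contigs.

    sam_lines: iterable of SAM text lines (header lines are skipped).
    Returns a Counter keyed by a sorted (contig, contig) tuple.
    """
    links = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # SAM header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag = int(fields[1])
        rname, mapq, rnext = fields[2], int(fields[4]), fields[6]
        # Keep only paired reads where both mates are mapped, and drop
        # secondary (0x100) and supplementary (0x800) alignments.
        if not flag & 0x1 or flag & (0x4 | 0x8 | 0x100 | 0x800):
            continue
        # Use only the first mate (0x40) so each pair is counted once.
        if not flag & 0x40:
            continue
        # RNEXT of "=" means the mate is on the same contig.
        if rnext in ("=", "*") or rnext == rname:
            continue
        if mapq < min_mapq:
            continue
        links[tuple(sorted((rname, rnext)))] += 1
    return links

# Hypothetical records: two pairs bridge contigA and contigB,
# one pair maps entirely within contigA.
example = [
    "@SQ\tSN:contigA\tLN:5000",
    "r1\t65\tcontigA\t4900\t60\t100M\tcontigB\t50\t0\t*\t*",
    "r1\t129\tcontigB\t50\t60\t100M\tcontigA\t4900\t0\t*\t*",
    "r2\t65\tcontigA\t4800\t60\t100M\tcontigB\t120\t0\t*\t*",
    "r3\t67\tcontigA\t100\t60\t100M\t=\t300\t300\t*\t*",
]
links = count_contig_links(example)
print(dict(links))  # {('contigA', 'contigB'): 2}
```

Contig pairs supported by only one or two read pairs are often repeats or mismapped reads, so in practice you would want a minimum link count before trusting a join.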

There is probably a tool that can do this for you. Now that I have written that, you could also try assembling your paired-end reads on their own and then, once that is done, assembling the old contigs together with the new ones.

11.8 years ago

We use ATLAS_GapFill for this task (as do the Broad and others).

11.8 years ago
Leszek 4.2k

I have used SSPACE with success. It accepts multiple libraries (paired-end, mate-pair) and gave me quite good results. It also fills the gaps. For gap closing you can also try GapCloser from the SOAPdenovo package.

11.8 years ago
aswad87 • 0

Thanks for your answers! I will try all the suggestions!
