One Sided Contig Extension
7.7 years ago
jcnigg • 0

I am trying to determine the genome sequence of a virus whose genome consists of multiple RNA segments. All of the segments should possess the same short sequence (5-10 nt) at the 5' end and a different short sequence at the 3' end. I know the full sequence of several of the segments, including the conserved motifs at the ends, but for at least three segments I have no information beyond the motifs that should be at their ends. I am attempting to build putative contigs for the unknown segments from a paired-end Illumina library, using the motif sequence as a starting point. I filtered my library to retain only reads that begin with the 5' motif using grep:

grep '^motif' -B 1 -A 2 --no-group-separator in.fastq > out.fastq
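One caveat with the grep approach is that it tests every line of the file, so a quality line that happens to start with the motif letters would also pull in a record. A minimal Python sketch of the same filter that parses whole four-line FASTQ records and tests only the sequence line is below. It assumes an uncompressed FASTQ with exactly four lines per record; the function names `read_fastq` and `filter_by_5prime_motif` are hypothetical and not part of any library.

```python
def read_fastq(lines):
    """Yield (header, sequence, plus, quality) tuples from an iterable of lines.

    Assumes a plain FASTQ with exactly four lines per record.
    """
    block = []
    for line in lines:
        block.append(line.rstrip("\n"))
        if len(block) == 4:
            yield tuple(block)
            block = []


def filter_by_5prime_motif(lines, motif):
    """Yield only the records whose sequence line starts with the motif."""
    motif = motif.upper()
    for header, seq, plus, qual in read_fastq(lines):
        # Only the sequence line is tested, never the header or quality line.
        if seq.upper().startswith(motif):
            yield header, seq, plus, qual
```

To use it on files, open `in.fastq` for reading, pass the file handle as `lines`, and write each yielded record back out as four lines. Like the grep command, this only looks at one file of the pair, so mates would need to be recovered separately by read name if the assembler expects paired input.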

I would like to use these reads as seeds for contig extension from my original unfiltered library using something like PRICE, but I want extension to only occur from one end of the seed (i.e. build the contig out from the 3' end of the seed, but retain the 5' end of the seed as the 5' end of the contig). Is there a way to accomplish this?

Assembly next-gen • 1.6k views

This might not do exactly what you want, but you could give Mapsembler2 or Kollector a try.


Thank you for your suggestions. These tools, along with PRICE, seem to do basically what I need. As you mentioned, they won't perform exactly as I want (assembly will extend out from both ends of the seed). It should be possible to run any of these programs and then keep only the resulting contigs that have my desired motif at the 5' end. My concern here would be misassembly beyond the 5' end, but I'll give it a shot.
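The post-filtering step could look something like the sketch below. It checks each contig in both orientations, since the assembler may report either strand, and trims away anything assembled upstream of the motif so that the motif becomes the 5' end of the contig. This is only an illustration under stated assumptions: the motif is written in DNA letters (T rather than U), `anchor_at_5prime_motif` and its `max_offset` parameter are hypothetical names, and a motif of 5-10 nt will also occur internally by chance, so hits should be checked by eye or against read coverage.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def anchor_at_5prime_motif(contig, motif, max_offset=200):
    """Orient and trim a contig so that it starts with the motif.

    The motif must start within the first `max_offset` bases of the contig
    in one of the two orientations; otherwise None is returned.
    max_offset is an arbitrary illustrative default, not a recommended value.
    """
    motif = motif.upper()
    forward = contig.upper()
    for candidate in (forward, reverse_complement(forward)):
        # str.find with an end bound limits where the motif may start.
        pos = candidate.find(motif, 0, max_offset + len(motif))
        if pos != -1:
            # Drop any sequence assembled upstream of the motif.
            return candidate[pos:]
    return None
```

Setting `max_offset` to roughly the amount of upstream extension the assembler is expected to produce keeps chance internal matches deep inside a contig from being mistaken for the segment terminus.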
