Hello,
I have a set of contigs that I would like to re-assemble using an Overlap-Layout-Consensus (OLC)-type approach. Let's say the minimum contig size in the set is 1000 bp. If I understand correctly, the following commands should: 1) find all overlaps of 100 to 999 bp and store that overlap information in the file INPUT-1.dot; 2) clean up the overlap graph, ignoring the contigs in the file REPEATS.fa, and write the assembled unambiguous paths to the file INPUT-1.path and the new graph to INPUT-2.dot; then 3) merge unambiguously overlapping contigs using the information computed in the first two steps and write the merged sequences to INPUT-2.fa.
AdjList -v -k1000 -m100 INPUT-1.fa > INPUT-1.dot
abyss-filtergraph -v -k1000 --dot --no-SS --assemble -g INPUT-2.dot -i REPEATS.fa INPUT-1.dot INPUT-1.fa > INPUT-1.path
MergeContigs -v -k1000 -o INPUT-2.fa --merged INPUT-1.fa INPUT-2.dot INPUT-1.path
I know this is probably quite a crude approach, but would it achieve what I am trying to do? Any other thoughts or suggestions are appreciated. I am happy to clarify anything that is unclear.
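To make step 1 concrete, here is a minimal Python sketch of what I mean by "overlaps of 100 to 999 bp": the longest exact match between the end of one contig and the start of another, with a length in the range [m, k). This is only an illustration of my understanding, not ABySS code. The function names and the toy contigs are made up, it uses a brute-force comparison, and it ignores reverse complements.

```python
def suffix_prefix_overlap(a, b, min_len, k):
    """Return the length of the longest exact overlap between a suffix of `a`
    and a prefix of `b`, restricted to lengths in [min_len, k).
    Returns 0 if there is no such overlap. Assumes min_len >= 1."""
    longest = min(len(a), len(b), k - 1)
    for n in range(longest, min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def overlap_edges(contigs, min_len, k):
    """Yield (source, target, overlap_length) for every ordered pair of
    distinct contigs that overlap. Brute force, forward strand only."""
    for u, seq_u in contigs.items():
        for v, seq_v in contigs.items():
            if u == v:
                continue
            n = suffix_prefix_overlap(seq_u, seq_v, min_len, k)
            if n:
                yield (u, v, n)


# Toy data (hypothetical), scaled down: min_len=3 and k=6 stand in for
# -m100 and -k1000 in the commands above.
contigs = {"c1": "ACGTACGGA", "c2": "CGGATTTC", "c3": "GGGGGGGG"}
edges = list(overlap_edges(contigs, min_len=3, k=6))
print(edges)
```

With the toy contigs, the only edge found is c1 -> c2, where the last 4 bases of c1 (CGGA) match the first 4 bases of c2.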
Thanks,
Dan
Thanks Ben! Quick follow-up question: how are transitive edges and shims defined? Just wondering whether or not I should aim to filter those out of the overlap graph prior to layout and consensus.