Question

Overlap File for RACON Genome Polisher

19 months ago

andorjkiss ▴ 50

This seems like a dumb question, but I have no idea how to generate the overlap file that RACON requires. According to the documentation (https://github.com/isovic/racon):

Racon takes as input only three files: contigs in FASTA/FASTQ format, reads in FASTA/FASTQ format and overlaps/alignments between the reads and the contigs in MHAP/PAF/SAM format. Output is a set of polished contigs in FASTA format printed to stdout. All input files can be compressed with gzip (which will have impact on parsing time).

I have PacBio CLR data and Illumina PE data. I would like to use the Illumina data to polish the PacBio assembly.

How do I generate the "overlap" file? If I use MHAP, I get an output file in <*.dat> format. How do I convert this into MHAP format?

minimap2 overlap racon MHAP • 1.1k views

Racon can be used as a polishing tool after assembly, with either Illumina data or data produced by third-generation sequencing. The type of input data is detected automatically.

Disclaimer: I have never used RACON as a genome polishing tool. For this task, you should use tools specifically designed for genome polishing (e.g. NextPolish).

However, if you really want to use RACON as a genome polishing tool, I think you should provide the following input files:

<sequences> = Illumina reads
<overlaps> = SAM file of the Illumina reads aligned to the PacBio assembly
<target sequences> = PacBio assembly file
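
For what it's worth, here is a rough, untested sketch of how those three inputs might be produced and passed along. It assumes minimap2 (which is in the question's tags) is used to make the SAM file with its short-read preset (-ax sr), and that racon is called with its positional arguments in the order listed above. All file names are hypothetical placeholders. Since Racon takes a single reads file, I am also assuming the paired-end mates have already been combined into one FASTQ with distinct read names. By default the script only prints the commands (dry run) instead of running them:

```python
import shlex
import subprocess


def minimap2_cmd(assembly, reads, threads=4):
    """Build a minimap2 command that aligns short reads to the assembly as SAM."""
    # -a: write SAM; -x sr: short-read preset; -t: number of threads
    return ["minimap2", "-a", "-x", "sr", "-t", str(threads),
            str(assembly), str(reads)]


def racon_cmd(reads, overlaps, assembly, threads=4):
    """Build a racon command: <sequences> <overlaps> <target sequences>."""
    return ["racon", "-t", str(threads),
            str(reads), str(overlaps), str(assembly)]


def polish(assembly, reads, sam_out, polished_out, threads=4, dry_run=True):
    """Align reads to the assembly, then polish; both tools write to stdout."""
    steps = [
        (minimap2_cmd(assembly, reads, threads), sam_out),
        (racon_cmd(reads, sam_out, assembly, threads), polished_out),
    ]
    for cmd, out_path in steps:
        if dry_run:
            # Only show the equivalent shell command line.
            print(f"{shlex.join(cmd)} > {shlex.quote(str(out_path))}")
        else:
            # Redirect the tool's stdout into the output file.
            with open(out_path, "w") as handle:
                subprocess.run(cmd, stdout=handle, check=True)
    return steps


if __name__ == "__main__":
    # Hypothetical file names; replace with your own.
    polish("pacbio_assembly.fasta", "illumina_reads.fastq",
           "illumina_vs_assembly.sam", "polished.fasta")
```

Pass dry_run=False to actually run the two tools (both need to be on your PATH). Any other short-read aligner that writes SAM should work just as well for the middle step.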

19 months ago by andres.firrincieli 3.8k