Hi, I am very new to genomics and am currently running MaSuRCA. I just wanted to know whether any polishing steps (Pilon or Racon) are recommended after assembly. Thank you in advance for your help.
Sonia
The MaSuRCA author says that Pilon polishing is not necessary and can even harm your assembly if the parameters are not chosen carefully.
The post is here: http://masurca.blogspot.com/2018/11/masurca-329-release.html
He uses a hybrid Illumina-Nanopore assembly as an example though, so I am not certain how much that applies in your case.
That's right, but the MaSuRCA author himself still says that "there is no "polishing" required for MaSuRCA assemblies".
So I don't know if he got confused with the post he refers to or if he just considers that example relevant to MaSuRCA as well as Canu (the fact that MaSuRCA actually uses Canu as its main assembler makes me think of the second option).
To be honest, I am in the same situation as the original poster (I just used MaSuRCA to assemble Illumina + Nanopore reads) and am not sure whether I should polish the assembly or not. What worries me most is this line: "while it can improve consensus statistics overall, it can worsen the assembly in some regions". The way I interpret it, one cannot rely on assembly statistics (e.g. N50) to know whether the assembly is actually better after polishing than before. So yeah, if anyone has some advice on the matter, I am also very curious.
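To illustrate why N50 says little about polishing, here is a minimal Python sketch. The contig lengths are made up for illustration and do not come from any real assembly. N50 is computed from contig lengths only, so base substitutions leave it untouched and small indel corrections barely move it, whether the polishing helped or hurt.

```python
def n50(lengths):
    """Return the N50: the length of the contig at which the running total
    of lengths (sorted longest first) reaches half of the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


# Hypothetical contig lengths before and after a polishing round that
# only inserted or deleted a handful of bases per contig.
before = [500_000, 300_000, 200_000, 100_000]
after = [500_012, 299_995, 200_003, 100_000]

print("N50 before:", n50(before))
print("N50 after: ", n50(after))
```

Both values are almost identical here, which is exactly why a length-based statistic cannot tell you whether the consensus got better or worse.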
I find it notoriously difficult to assess the quality of an assembly when limited to summary statistics like N50 or similar. If you have no specific question and just want a primary assembly, I'd try a first round of polishing and see what it changes. Or you could use variant calling to see what the differences are before polishing. I had a customer who was specifically interested in high-precision assemblies of subtelomeric regions, so I made sure that whatever I did improved the assembly there, and I tested a few settings and algorithms (it was a while ago and involved MaSuRCA and eventually Pilon, but I don't remember the details).
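As a rough way to "see what it changes", here is a minimal Python sketch that compares the assembly before and after polishing, contig by contig. The function names, the toy sequences and the `suffix` parameter are my own assumptions for illustration. As far as I recall, Pilon appends `_pilon` to contig names, so the sketch strips that suffix before matching; adjust it to your polisher's output.

```python
import io


def read_fasta(handle):
    """Parse FASTA text from an open handle into {name: sequence}."""
    records, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            # Keep only the first word of the header as the contig name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records


def summarize_changes(before, after, suffix="_pilon"):
    """Return {contig: (old_len, new_len, changed)} for contigs in both sets.

    `suffix` is an assumed naming convention of the polisher's output.
    """
    renamed = {}
    for name, seq in after.items():
        if suffix and name.endswith(suffix):
            name = name[: -len(suffix)]
        renamed[name] = seq
    report = {}
    for name, old in before.items():
        new = renamed.get(name)
        if new is not None:
            report[name] = (len(old), len(new), old != new)
    return report


# Toy data standing in for the unpolished and polished FASTA files.
before = read_fasta(io.StringIO(">ctg1\nACGTACGT\n>ctg2\nGGGTTT\n"))
after = read_fasta(io.StringIO(">ctg1_pilon\nACGTTACGT\n>ctg2_pilon\nGGGTTT\n"))

report = summarize_changes(before, after)
for contig, (old_len, new_len, changed) in report.items():
    print(contig, old_len, new_len, "changed" if changed else "unchanged")
```

With real files you would pass `open("assembly.fasta")` instead of the `io.StringIO` objects. This only tells you which contigs were touched and by how much their length moved, not whether the changes are correct; for that you still need to look at the regions you care about.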
I assume you are assembling some sort of long reads together with n short-read libraries. Depending on your coverage, polishing might not be necessary, but I would run it at least once in any case to see how much gets polished. Either of the two programs is good; I have only worked with Pilon so far.
I have paired-end, mate-pair and PacBio reads. I will go ahead with Pilon polishing then. Thanks.