Dear all,
Do you recommend merging paired-end Illumina reads for de novo transcriptome assembly? Thank you for any advice and for sharing your experiences.
It all depends on how many reads overlap. If the majority of the read pairs overlap, then there is little to be gained from treating them as paired-end. In fact, it should be counterproductive to do so, as the system has to deal with more, and redundant, data.
Logic dictates that providing more information to the system ought to make it perform better. In this case the extra information is that the reads are overlapping and an external tool solved that problem.
In practice, though, given the way algorithms are built, tuned and released, you may end up with unexpected performance from one option versus the other, depending on the tool and version. Hence, as Manvendra Singh suggests, I think it is best to be cautious and evaluate both methods.
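As a rough way to gauge the "how many reads overlap" question above, here is a minimal sketch. It assumes you already have an estimate of fragment (insert) lengths for a sample of pairs, for example from aligning a subset of reads; the function name, the `min_overlap` default and the example numbers are all made up for illustration and are not taken from any merging tool.

```python
def fraction_overlapping(fragment_lengths, read_length, min_overlap=10):
    """Return the share of pairs whose two mates overlap by at least min_overlap bases.

    Assumption: both mates have the same length (read_length), so the mates of a
    fragment of length F share about 2 * read_length - F bases.
    """
    if not fragment_lengths:
        return 0.0
    overlapping = sum(
        1 for f in fragment_lengths if 2 * read_length - f >= min_overlap
    )
    return overlapping / len(fragment_lengths)


# Hypothetical fragment lengths for five read pairs sequenced as 2 x 150 bp.
example_fragments = [180, 220, 250, 300, 450]
print(fraction_overlapping(example_fragments, read_length=150))
```

If this fraction is high, merging is more likely to be worth testing; if it is low, most pairs cannot be merged anyway.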
Try using MeFiT (https://github.com/nisheth/MeFiT). We have found it to work best for merging overlapping reads. Another option is FLASH (http://ccb.jhu.edu/software/FLASH/).
This really depends on the assembler (as well as the merging program!). I have found that merging reads improves Ray assemblies, and makes Soap assemblies dramatically worse.
Yes, I got better results after joining the reads, but some bioinformaticians would probably disagree. So it is better to run the de novo assembly both ways, then check the average contig length, the total number of contigs and the total length of the assembly. Then you can argue which one worked better for you.
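The three numbers mentioned above can be computed with something like the following. This is only a sketch that assumes each assembly is a plain FASTA file of contigs; the function names and the toy records are invented for the example.

```python
def contig_lengths(fasta_lines):
    """Return the length of every record in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(lengths):
    """Summarise contig count, total assembly length and mean contig length."""
    total = sum(lengths)
    count = len(lengths)
    mean = total / count if count else 0.0
    return {"contigs": count, "total_length": total, "mean_length": mean}


def stats_for_file(path):
    """Convenience wrapper: compute the same summary for a FASTA file on disk."""
    with open(path) as handle:
        return assembly_stats(contig_lengths(handle))


# Toy in-memory example with two contigs.
toy = [">contig1", "ACGTAC", "GT", ">contig2", "GGCC"]
print(assembly_stats(contig_lengths(toy)))
```

Run `stats_for_file` on the assembly from merged reads and on the one from unmerged pairs, then compare the two summaries side by side.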
hth
Thank you for the comment. That would probably be the best solution.