The DNA in my experiment was fragmented by sonication with a Covaris LE220, and the expected fragment length is 450 bp, which is the value I planned to use as the average insert size in the SOAP configuration file. When I calculated the average insert size for my specific region of interest, following the instructions at http://www.cbs.dtu.dk/courses/27626/Exercises/denovo_exercise.php (using Linux and RStudio), I obtained only 322.72. I think that is a big difference. Which value should I use to get a proper de novo assembly? I would also like to know what effect the average insert size has on the de novo assembly of a region of interest. Thank you.
Assemblers need the average insert size only so that the paired-end information can be used during assembly. Depending on the configuration a tool requires, you usually have to provide either the average insert size with its standard deviation or an insert size range. More info about insert size can be found here: Insert Size And Fragment Size ?
Thank you, Sej Modha. I am still wondering which value to use: 450 (from the method information) or 322.72 (from the Linux + RStudio calculation). Does the difference between the two values affect the quality of the assembly?
I'd use the calculated average insert size.
Thank you for your kind information
As Sej commented, you should use the estimated insert size. However, if you are following that tutorial, I would use "avg_ins=450" for the initial assembly, then estimate the insert size and use that estimate for the "serious" assembly.
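If it helps, here is a rough Python sketch of the "estimate insert size" step. It assumes you have mapped the read pairs back to the initial assembly and have the alignments as SAM text; the function names and the `max_insert` cutoff are made up for illustration and are not part of the tutorial or of SOAPdenovo. It reads the template length (TLEN, SAM column 9) of properly paired reads and reports the mean and standard deviation.

```python
import statistics


def insert_sizes_from_sam(lines, max_insert=2000):
    """Collect template lengths (SAM column 9, TLEN) from properly paired reads.

    max_insert is an arbitrary cutoff (an assumption) to drop extreme outliers.
    """
    sizes = []
    for line in lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory columns
            continue
        flag = int(fields[1])
        tlen = int(fields[8])
        # 0x2 = read mapped in a proper pair. TLEN is positive for one mate
        # and negative for the other, so keeping tlen > 0 counts each pair once.
        if flag & 0x2 and 0 < tlen <= max_insert:
            sizes.append(tlen)
    return sizes


def summarize(sizes):
    """Return (mean, sample standard deviation) of the insert sizes."""
    mean = statistics.mean(sizes)
    sd = statistics.stdev(sizes) if len(sizes) > 1 else 0.0
    return mean, sd


if __name__ == "__main__":
    # Tiny made-up records, only to show the expected input format.
    demo = [
        "@HD\tVN:1.6",
        "r1\t99\tctg1\t100\t60\t100M\t=\t300\t300\tACGT\tIIII",
        "r1\t147\tctg1\t300\t60\t100M\t=\t100\t-300\tACGT\tIIII",
        "r2\t99\tctg1\t500\t60\t100M\t=\t740\t340\tACGT\tIIII",
    ]
    mean, sd = summarize(insert_sizes_from_sam(demo))
    print(f"mean insert size: {mean:.2f}, sd: {sd:.2f}")
```

For a real file you would pass an open file handle instead of the demo list, e.g. `insert_sizes_from_sam(open("mapped.sam"))`, where `mapped.sam` is a placeholder name. The rounded mean is then what would go into `avg_ins` for the final run.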
Hi, h.mon. Following the tutorial, I first used 450 as the initial insert size and went through the procedure in Linux and R. In the end I obtained an insert size of 322.72. Do you mean I can now use 322.72 for the final assembly? Thank you.