Please find below the insert size histogram metrics, generated with Picard.
METRICS CLASS picard.analysis.InsertSizeMetrics

| Metric | Value |
|---|---|
| MEDIAN_INSERT_SIZE | 279 |
| MODE_INSERT_SIZE | 273 |
| MEDIAN_ABSOLUTE_DEVIATION | 36 |
| MIN_INSERT_SIZE | 20 |
| MAX_INSERT_SIZE | 41943682 |
| MEAN_INSERT_SIZE | 283.265191 |
| STANDARD_DEVIATION | 56.023413 |
| READ_PAIRS | 197649736 |
| PAIR_ORIENTATION | FR |
| WIDTH_OF_10_PERCENT | 15 |
| WIDTH_OF_20_PERCENT | 27 |
| WIDTH_OF_30_PERCENT | 41 |
| WIDTH_OF_40_PERCENT | 57 |
| WIDTH_OF_50_PERCENT | 73 |
| WIDTH_OF_60_PERCENT | 91 |
| WIDTH_OF_70_PERCENT | 113 |
| WIDTH_OF_80_PERCENT | 139 |
| WIDTH_OF_90_PERCENT | 183 |
| WIDTH_OF_95_PERCENT | 223 |
| WIDTH_OF_99_PERCENT | 329 |
| SAMPLE | |
| LIBRARY | |
| READ_GROUP | |
In addition, Velvet estimated an insert size of 273 with a standard deviation of 53 during contig assembly.
I used the median insert size (279) and an error of 0.5 to scaffold the contigs with SSPACE.
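The "distance on target: 273 +/-136.5" line in the SSPACE log is consistent with the accepted window being the insert size plus or minus the insert size times the error. A minimal sketch of that arithmetic, assuming this is how the window is derived (the function name `distance_window` is hypothetical, not part of SSPACE):

```python
def distance_window(insert_size, error):
    """Return (min, max) accepted pair distance for an insert size and error fraction.

    Assumption: the window is insert_size +/- insert_size * error.
    """
    half_width = insert_size * error
    return insert_size - half_width, insert_size + half_width


# Window matching the "273 +/-136.5" line in the log
print(distance_window(273, 0.5))
# Window for the Picard median insert size
print(distance_window(279, 0.5))
```

With an error of 0.5, the half-width is half of the insert size, so the window for 273 runs from 136.5 to 409.5 bp.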
Total inserted pairs = 220823893
LIBRARY reads STATS:
MAPPING READS TO CONTIGS:
Number of single reads found on contigs = 61676131
Number of read-pairs used for pairing contigs / total pairs = 23693047 / 23853682
READ PAIRS STATS:
Assembled pairs: 23693047 (47386094 sequences)
Satisfied in distance/logic within contigs (i.e. -> <-, distance on target: 273 +/-136.5): 9029076
Unsatisfied in distance within contigs (i.e. distance out-of-bounds): 129266
Unsatisfied pairing logic within contigs (i.e. illogical pairing ->->, <-<- or <-->): 35528

---

Satisfied in distance/logic within a given contig pair (pre-scaffold): 1342274
Unsatisfied in distance within a given contig pair (i.e. calculated distances out-of-bounds): 13156903

---

Total satisfied: 10371350 unsatisfied: 13321697
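As a sanity check, the read-pair categories in the log can be tallied to confirm that they add up to the number of assembled pairs and to see where the unsatisfied pairs come from. This is only a sketch over the numbers pasted above; the variable names are mine, not SSPACE's:

```python
# Counts copied from the READ PAIRS STATS section of the log
assembled = 23693047
satisfied_within = 9029076
unsat_distance_within = 129266
unsat_logic_within = 35528
satisfied_between = 1342274
unsat_distance_between = 13156903

total_satisfied = satisfied_within + satisfied_between
total_unsatisfied = (
    unsat_distance_within + unsat_logic_within + unsat_distance_between
)

# The categories should reproduce the totals reported in the log
assert total_satisfied == 10371350
assert total_unsatisfied == 13321697
assert total_satisfied + total_unsatisfied == assembled

print(f"satisfied: {total_satisfied / assembled:.1%}")
print(f"out-of-bounds between contigs: {unsat_distance_between / assembled:.1%}")
```

Nearly all of the unsatisfied pairs (13156903 of 13321697) fall in the "within a given contig pair" category rather than within single contigs.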
Estimated insert size statistics (based on 9029076 pairs):
Mean insert size = 261
Median insert size = 260
Inserted contig file;
Total number of contigs = 57206
Sum (bp) = 348942280
Total number of N's = 0
Sum (bp) no N's = 348942280
GC Content = 42.69%
Max contig size = 123790
Min contig size = 500
Average contig size = 6099
N25 = 28352
N50 = 15957
N75 = 7400
**After scaffolding:**
Total number of scaffolds = 39590
Sum (bp) = 348541909
Total number of N's = 31358
Sum (bp) no N's = 348510551
GC Content = 42.92%
Max scaffold size = 244223
Min scaffold size = 500
Average scaffold size = 8803
N25 = 55732
N50 = 30443
N75 = 13267
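To put the before and after numbers side by side, here is a small sketch that computes the change in sequence count, N50 and maximum size from the two summaries. The dictionary keys and the helper `fold_change` are hypothetical names chosen for illustration:

```python
# Values copied from the contig and scaffold summaries in the SSPACE output
before = {"sequences": 57206, "sum_bp": 348942280, "n50": 15957, "max": 123790}
after = {"sequences": 39590, "sum_bp": 348541909, "n50": 30443, "max": 244223}


def fold_change(key):
    """Ratio of the post-scaffolding value to the pre-scaffolding value."""
    return after[key] / before[key]


# Net reduction in the number of sequences after scaffolding
fewer = before["sequences"] - after["sequences"]
print(f"fewer sequences: {fewer} ({fewer / before['sequences']:.1%})")
print(f"N50 fold change: {fold_change('n50'):.2f}")
print(f"max size fold change: {fold_change('max'):.2f}")
```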
My question is: can I increase the error value so that more read pairs are included in scaffolding? As I understand it, only 9029076 of the 220823893 pairs are being used. How far can I raise the error value without causing misassemblies?
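One way to relate the error value to the observed distribution is to convert the Picard `WIDTH_OF_*_PERCENT` values into the equivalent error fraction. The sketch below rests on two assumptions: that each Picard width is the size of a window centred on the median that contains that percentage of pairs, and that SSPACE accepts distances within insert size plus or minus insert size times error. The helper `error_for_width` is a hypothetical name.

```python
median = 279  # MEDIAN_INSERT_SIZE from Picard
widths = {90: 183, 95: 223, 99: 329}  # WIDTH_OF_*_PERCENT from Picard


def error_for_width(width, insert_size):
    """Error fraction whose window (insert_size +/- insert_size * error) spans `width`.

    Assumption: the window is symmetric around insert_size.
    """
    return (width / 2) / insert_size


for pct, width in widths.items():
    err = error_for_width(width, median)
    print(f"{pct}% of pairs: +/-{width / 2} bp -> error ~ {err:.2f}")
```

Under these assumptions, an error of 0.5 already spans more than the 95% width and somewhat less than the 99% width of the Picard distribution.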