Is there a way to get the average insert size from an SRA or FASTQ file? Sorry for the originally uninformative question. I downloaded an SRA file from NCBI and used the included sratoolkit to split it into two FASTQ files. I am trying to do a de novo assembly using these paired-end, strand-specific reads. However, a required parameter is the average insert size. Does anyone know how to obtain this from an SRA file or FASTQ?
Please describe your question so people can help you. I think I understand what you're asking, but without more information it is difficult to answer.
Edited, thanks.
You will need to align the reads (both mates of each pair). Then you can find the insert lengths by parsing the SAM/BAM file.
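For the parsing step, here is a minimal sketch in Python, assuming the alignments are available as plain-text SAM and that the aligner fills in column 9 (TLEN, the observed template length in the SAM specification). The function names `insert_sizes` and `summarize` are made up for illustration. Only reads flagged as properly paired (flag bit 0x2) with a positive TLEN are counted, so each pair contributes once.

```python
import statistics


def insert_sizes(sam_lines):
    """Yield TLEN (SAM column 9) once per properly paired fragment."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        tlen = int(fields[8])
        # 0x2 = "properly paired"; the leftmost mate carries the positive TLEN,
        # so keeping tlen > 0 counts each pair exactly once.
        if flag & 0x2 and tlen > 0:
            yield tlen


def summarize(sam_lines):
    """Return (mean, median) of the insert sizes, or None if no pairs qualify."""
    sizes = list(insert_sizes(sam_lines))
    if not sizes:
        return None
    return statistics.mean(sizes), statistics.median(sizes)
```

To use it, open the SAM file as text and pass the file handle to `summarize`. The median is usually less affected than the mean by a few very large TLEN values. For a BAM file, convert it to SAM text first, or adapt the loop to whatever BAM reader you use.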
Align the reads to a reference genome? This seems counterintuitive, considering the whole point of a de novo assembly is to not need a reference.
Good point. Sorry. I need to read more carefully. I don't know the answer. I look forward to seeing the best solution.