I want to sequence a plant genome with an estimated size of 600 Mb. My question is how to decide on:
- number and type of libraries (paired end and mate pair) on different platforms (PacBio, Illumina)
- amount of data required
- read length
- insert size (mate pair)
- coverage
An example of these choices is given below.
The idea is to perform a hybrid assembly, as the plant genome is complex and contains repeats. I went through several papers on genome assembly of related plant species; however, I don't know how to decide how many different types of libraries should be prepared.
The objective of this post is to get a general understanding of how to make these decisions. I know that this largely depends on cost as well; however, I am interested in the technical part (the calculations).
While I have no experience with this, I believe that for a genome of that size you can get a decent assembly by combining long reads from Nanopore or PacBio with high-quality reads from Illumina.
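
For the calculation part, the usual starting point is the relation coverage = (number of reads × read length) / genome size. Below is a minimal Python sketch of that arithmetic for a 600 Mb genome. The target depths (50x short reads, 30x long reads) and the 2 × 150 bp read length are placeholder assumptions for illustration only, not recommendations; the function names are made up for this example, and the numbers should be replaced with whatever the chosen assembler and related-species papers suggest.

```python
import math

GENOME_SIZE_BP = 600_000_000  # estimated genome size from the question (600 Mb)


def required_bases(genome_size_bp, coverage):
    """Total sequenced bases needed to reach a given average coverage."""
    return genome_size_bp * coverage


def reads_needed(genome_size_bp, coverage, bases_per_read):
    """Number of reads (or read pairs) needed, rounded up.

    bases_per_read is the total bases contributed by one read,
    e.g. 2 * 150 for one 150 bp paired-end read pair.
    """
    return math.ceil(genome_size_bp * coverage / bases_per_read)


def achieved_coverage(genome_size_bp, n_reads, bases_per_read):
    """Average coverage obtained from n_reads reads of a given length."""
    return n_reads * bases_per_read / genome_size_bp


# Assumed targets (illustrative only, not taken from the question):
SHORT_READ_COVERAGE = 50   # short paired-end reads
LONG_READ_COVERAGE = 30    # long reads
PAIR_BASES = 2 * 150       # one 2 x 150 bp read pair

short_read_pairs = reads_needed(GENOME_SIZE_BP, SHORT_READ_COVERAGE, PAIR_BASES)
long_read_bases = required_bases(GENOME_SIZE_BP, LONG_READ_COVERAGE)

print(f"Short-read pairs needed: {short_read_pairs:,}")
print(f"Long-read data needed:   {long_read_bases / 1e9:.1f} Gb")
```

The same functions can be run in reverse: given the yield quoted for a sequencing run, `achieved_coverage` shows what depth that run would give for this genome, which makes it easier to compare library plans.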