Thank you for looking at my question. I am trying to solve this homework question.
Consider the problem of sequencing a genome by random reads. If G is the length of the
entire sequence, L is the length of a read and n is the number of reads, then coverage
is defined as nL/G. Now, if we want 50% of the original long sequence to be covered by
at least one fragment, how much coverage do we need?
I read about the Lander-Waterman model (http://www.genetics.wustl.edu/bio5488/lecture_notes_2005/Lander.htm) to understand the concept, but I didn't quite get how to solve this problem. My idea was to treat the given 50% as a probability, take y = 1 (the count in the Poisson distribution), and solve for lambda (that is, the coverage). But I don't think I am on the right track. I chose y = 1 because the question says 50% of the original long sequence should be covered by at least one fragment, which means that those bases are sequenced at least once.
I may be wrong. Maybe I am not clear about the way to solve this.
Experts, can you guide me, please?
Thank you.
I think you are having problems because the question is ill-posed. Coverage by random reads is probabilistic: the reads could, by chance, all stack up at one position. So I would rephrase the question either as "we want 50% of the original long sequence to be covered by at least one fragment with probability P; how much coverage do we need?" or as "what coverage is required so that the expected number of positions covered at least once is 50% of the genome length?"
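To make the second (expected-value) reading concrete, here is a rough sketch. It assumes the usual Lander-Waterman approximation that the number of reads covering a given base is Poisson with mean c = nL/G, so a base is missed with probability e^(-c) and the expected covered fraction is 1 - e^(-c). Note this uses "at least one read", that is 1 - P(Y = 0), rather than P(Y = 1). Setting the covered fraction to 0.5 gives c = ln 2, about 0.69. The function names, the circular-genome simplification and the numbers in the usage note are my own assumptions for illustration, not part of the homework.

```python
import math
import random


def required_coverage(fraction_covered):
    """Coverage c = nL/G at which the expected covered fraction,
    1 - exp(-c) under the Poisson approximation, equals fraction_covered."""
    if not 0.0 <= fraction_covered < 1.0:
        raise ValueError("fraction_covered must be in [0, 1)")
    return -math.log(1.0 - fraction_covered)


def simulate_covered_fraction(G, L, n, seed=0):
    """Drop n reads of length L (L <= G) uniformly at random and return
    the fraction of positions covered by at least one read.

    Assumption: the genome is treated as circular so that reads near the
    end wrap around, which avoids edge effects in this toy check.
    """
    rng = random.Random(seed)
    diff = [0] * (G + 1)  # difference array of read starts and ends
    for _ in range(n):
        start = rng.randrange(G)
        end = start + L
        diff[start] += 1
        if end <= G:
            diff[end] -= 1
        else:
            # Read wraps around: cover [start, G) and [0, end - G).
            diff[G] -= 1
            diff[0] += 1
            diff[end - G] -= 1
    covered = 0
    depth = 0
    for i in range(G):
        depth += diff[i]  # running read depth at position i
        if depth > 0:
            covered += 1
    return covered / G
```

For example, with G = 1,000,000 and L = 100, required_coverage(0.5) suggests n = round(c * G / L) = 6931 reads, and simulate_covered_fraction(1_000_000, 100, 6931) should come out close to 0.5. It will not be exactly 0.5 on any single run, which is the point about the original question leaving the probability unspecified.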
Also posted at http://stackoverflow.com/questions/8424854/finding-genome-coverage-using-random-reads.
And also posted at http://seqanswers.com/forums/showthread.php?t=16096.
Yeah. I got it. Thank you.
lol... I also posted it at http://www.daniweb.com/software-development/computer-science/threads/399022/1710513#post1710513