Hi everyone
Recently I read a paper (Genetica (2009) 137:159-164) in which the authors generated pseudo-CDSs (in other words, a negative dataset) from intergenic sequences. They used an in-house script to randomly extract sequences from intergenic regions. These pseudo-CDSs match the genuine CDSs in number and have a similar length distribution.
I am wondering if someone could share this kind of Perl script with me. I would appreciate your help. Any comments (e.g. on available tools) would also be helpful. Thank you very much!
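To make the request concrete, here is a rough sketch of the kind of sampling described above, written in Python rather than Perl. It is not the paper's in-house script. It assumes the intergenic regions are already available as plain strings and the genuine CDS lengths as a list of integers. The function name `sample_pseudo_cds` and the toy sequences are made up for illustration. For each genuine CDS length it draws one random fragment of exactly that length from a randomly chosen intergenic region that is long enough, so the output has the same number of sequences and the same lengths as the genuine set.

```python
import random


def sample_pseudo_cds(intergenic, cds_lengths, seed=None):
    """Draw one random intergenic fragment per genuine CDS length.

    intergenic  : list of intergenic sequences (strings); assumed input format
    cds_lengths : list of genuine CDS lengths (ints)
    seed        : optional seed so the sampling can be reproduced
    """
    rng = random.Random(seed)
    pseudo = []
    for length in cds_lengths:
        # Only regions long enough to hold a fragment of this length qualify.
        candidates = [seq for seq in intergenic if len(seq) >= length]
        if not candidates:
            raise ValueError(f"no intergenic region of length >= {length}")
        region = rng.choice(candidates)
        # randint is inclusive on both ends, so the fragment always fits.
        start = rng.randint(0, len(region) - length)
        pseudo.append(region[start:start + length])
    return pseudo


# Toy data, invented for this example only.
intergenic_regions = [
    "ACGTTGCAAGGCTTAACCGGTTAGCTA",
    "TTGACCGGTAAACGT",
    "GGCATTACGATCGATCGGATCGATTTAGC",
]
cds_lengths = [9, 12, 15, 21]

pseudo_cds = sample_pseudo_cds(intergenic_regions, cds_lengths, seed=1)
for fragment in pseudo_cds:
    print(len(fragment), fragment)
```

This sketch picks regions uniformly rather than weighting them by length, allows fragments to overlap, and ignores strand and reading frame, so any of those would need to be handled separately if they matter for the negative dataset.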
Hi Pgibas, your answer is really helpful. Thanks a lot!