How many base pairs would I need to confidently map an arbitrary DNA sequence to a location in the human genome?
Mathematically, it seems like 16 bases should do the trick, since 4^16 ≈ 4.3 billion, which is more than the roughly 3 billion possible mapping positions in the human genome. Is this borne out in the real world? I know that repetitive sequences and the like might complicate this analysis.
The old rule of thumb, from when I used to design PCR primers by hand, was that 18-22 bases is a good length: a sequence that long has a reasonable chance of being unique, and it also gives an appropriate Tm for PCR.
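
To put rough numbers on both the 16-base estimate and the 18-22 base rule, here is a minimal back-of-the-envelope sketch. It assumes an idealized genome of about 3.1 billion bases in which every base is independent and equally likely, which the real human genome is not (repeats and skewed base composition make short sequences far less unique than this suggests). The genome size constant and both function names are illustrative assumptions, not taken from any library.

```python
# Assumed approximate haploid human genome length in base pairs.
GENOME_SIZE = 3.1e9


def min_unique_length(genome_size):
    """Smallest k for which the number of possible k-mers (4**k)
    exceeds the number of positions in the genome."""
    k = 1
    while 4 ** k <= genome_size:
        k += 1
    return k


def expected_chance_hits(k, genome_size, both_strands=True):
    """Expected number of places a given k-mer occurs purely by chance
    in a uniformly random genome (idealized model, ignores repeats)."""
    positions = genome_size * (2 if both_strands else 1)
    return positions / 4 ** k


print("minimum k with 4**k > genome size:", min_unique_length(GENOME_SIZE))
for k in (16, 18, 20, 22):
    print(f"k={k}: expected chance hits = {expected_chance_hits(k, GENOME_SIZE):.2e}")
```

Under this model, 16 bases sits right at the threshold: once both strands are counted, a random 16-mer is still expected to turn up about once by chance. Each extra base divides that expectation by four, which is consistent with 18-22 bases being the practical range. In a real genome, repetitive regions can stay ambiguous at much longer lengths, so treat these figures as a lower bound.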