I've got many 96-well plates of samples (a library of vectors with various insert sequences, one per well), and they know that some of the wells are cross-contaminated. They are hoping to be able to distinguish which wells have < 1% contamination. They have Sanger end reads of each well, covering the junction between the vector and the unique insert. Does anyone know of software that could reliably detect a 1% contaminating signal in a Sanger trace? I don't need to deconvolute the contaminant, but I would like to be able to say "These wells have < 1% contamination, these other ones are contaminated".
My guess is no: Sanger won't be any good at finding < 5% contamination, and some other technology must be brought to bear on this (maybe Illumina sequencing with overlapping pools). But I wanted to see if the existing Sanger data could be used, and writing my own parser for 4-channel trace data sounds awful.
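For anyone wanting to see what the trace-level check would involve, here is a rough sketch, not a validated tool. It assumes the four channel intensities and the called peak positions have already been pulled out of the trace file (with Biopython, `SeqIO.read(path, "abi")` exposes the raw ABIF fields under `record.annotations["abif_raw"]`, where I believe the channels are `DATA9` to `DATA12` and the peak locations are `PLOC2`). The function names are made up for illustration. The ratio it computes is only a crude proxy for a mixed template, and baseline noise in the trace may well swamp anything near 1%.

```python
from statistics import median


def secondary_peak_ratios(channels, peak_positions):
    """Return secondary/primary intensity ratios at each called peak.

    channels: dict mapping a base label to its list of trace intensities
              (hypothetical layout; all four lists must have equal length).
    peak_positions: trace indexes where the basecaller placed a peak.
    """
    ratios = []
    for pos in peak_positions:
        # Sort the four channel heights at this peak, highest first.
        heights = sorted((trace[pos] for trace in channels.values()), reverse=True)
        primary, secondary = heights[0], heights[1]
        if primary <= 0:
            continue  # no usable signal at this position
        ratios.append(max(secondary, 0) / primary)
    return ratios


def summarize_well(channels, peak_positions):
    """Median secondary/primary ratio for one read, or None if no peaks."""
    ratios = secondary_peak_ratios(channels, peak_positions)
    return median(ratios) if ratios else None
```

Restricting `peak_positions` to the bases just past the vector/insert junction would be the natural choice here, since that is where a contaminating insert would differ from the expected one.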
Edit:
I don't think that's as easy as it sounds, and I don't think that's the problem anyway.
The problem is that well A01 is supposed to be vector with insert sequence 1, and well B01 is supposed to be vector with insert sequence 2, etc., but I think poor plate handling has caused some wells to be cross-contaminated, so that some wells now contain two different kinds of vector with insert, and this is a big problem for downstream applications.
I need to be able to say ">99% of the insert sequence in well A01 is what it's supposed to be, so you don't have to spend labor and $ to reprep that one. B01 is more contaminated than that, so you should repick and reprep that one."
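To make that decision rule concrete, here is a minimal sketch of the triage I have in mind. It assumes per-well read counts by insert are available from some deeper sequencing run (the existing Sanger data does not provide this), and the function names, the data layout, and the `insert_1`/`insert_2` labels are all hypothetical.

```python
def expected_fraction(read_counts, expected_insert):
    """Fraction of a well's reads that match the insert it should contain."""
    total = sum(read_counts.values())
    if total == 0:
        return None  # no reads, so no call can be made
    return read_counts.get(expected_insert, 0) / total


def triage_wells(plate_counts, expected, min_fraction=0.99):
    """Split wells into those to keep and those to repick and reprep.

    plate_counts: dict of well -> {insert id: read count} (assumed layout).
    expected: dict of well -> insert id that well is supposed to contain.
    """
    keep, reprep = [], []
    for well, counts in sorted(plate_counts.items()):
        frac = expected_fraction(counts, expected[well])
        if frac is not None and frac > min_fraction:
            keep.append(well)
        else:
            reprep.append(well)  # contaminated, or too little data to tell
    return keep, reprep
```

A 1% cutoff only means something if each well has enough reads that a handful of stray ones doesn't decide the call.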
Maybe I am not understanding the question properly, but couldn't you just take the sequence produced from the 4-channel trace and mask out the vector?