I'm working on a workflow tool to help with PCR assay sequence testing and I'd like to understand a bit more about how these three sequences are generated.
My tool takes assay sequences and a query term, runs the query against the NCBI nucleotide database, and compares the assay sequences with the results. It then outputs an HTML report highlighting mismatches, including a consensus computed from the query data with Biopython's motifs consensus method.
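To make the comparison step concrete, here is a minimal standard-library sketch of a per-column majority consensus and a mismatch listing. It is a simplified stand-in, not Biopython's motifs consensus method; the function names and the sequences are made up for illustration, and it assumes the hits are already aligned to the same length as the primer.

```python
from collections import Counter


def majority_consensus(aligned_seqs):
    """Return the most common base at each column of equal-length sequences."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_seqs):
        # most_common(1) returns the most frequent base in this column;
        # on a tie, the base seen first in the column wins.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


def mismatches(primer, target):
    """List (position, primer_base, target_base) where two equal-length strings differ."""
    return [(i, p, t) for i, (p, t) in enumerate(zip(primer, target)) if p != t]


hits = ["ACGTAC", "ACGTTC", "ACGAAC"]  # made-up aligned sequences
consensus = majority_consensus(hits)
print(consensus)
print(mismatches("ACGTTC", consensus))  # hypothetical primer vs. consensus
```

Positions are 0-based, so a report would add 1 before showing them to a user.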
Is the consensus useful in generating the assay sequences? Are the primers typically unchanging or rarely changing sections of the organism's DNA?
I'm just a hobby programmer working on a project for a friend, so feel free to point me somewhere else if this forum is inappropriate for this kind of simple question.
Thanks!
Thanks for your answer. I'm going to have to look up a lot of these terms, haha.
My test case (to see that the workflow tool performs as required) is human RSV. I'm guessing my friend wants to use it for infectious diseases as he mentioned RSV and some bacteria as well.
How do those differ from testing for a genetic disease? Does it matter if it's virus or bacteria, or is it related to the size/rate of mutation of the organism? Other factors?
Microorganisms evolve much faster than eukaryotes etc., so your choice of primers is even more critical for them.
That said, there are (at least in bacteria) some relatively static sections such as ribosomal sequences. This is the basis of tools like 16S profiling and MultiLocus Sequence Typing (MLST). If you want to read up, MLST would be a good place to start.
Choice of primers will also matter for technical reasons, e.g. primer pairs and probes ideally need to have similar binding temperatures (a.k.a. melting temperatures).
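As a rough illustration of the melting temperature point, the sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a crude estimate for short oligos; dedicated tools use nearest-neighbour thermodynamics instead. The function names and the oligo sequences are invented for the example.

```python
def wallace_tm(seq):
    """Crude melting temperature estimate in deg C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tm_spread(oligos):
    """Return per-oligo Tm estimates and the gap between highest and lowest."""
    tms = {name: wallace_tm(s) for name, s in oligos.items()}
    return tms, max(tms.values()) - min(tms.values())


# Made-up sequences, not a real assay.
oligos = {"forward": "ACGTACGTAC", "reverse": "GGCCAATTGG"}
tms, spread = tm_spread(oligos)
print(tms, spread)
```

A large spread between the forward and reverse primer would be a hint that the pair is poorly matched, though a real check should rely on a proper thermodynamic model.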
Is this something you're building for your own interest or for a practical purpose? Not to rain on your parade but there is a lot of important biological subtlety to these types of assays/workflows and mature tools do already exist.
I have a friend whose work includes testing assays for microorganisms. He currently has to do these steps by hand:
He wanted a way to automate all three steps into one tool, so I'm doing that for him using his feedback and test case (he provided the query term and primer/probe sequences).
What my tool does is the following:
I'm just using Python to string together the steps of his work using existing tools. Because I am not a subject matter expert, it's just a fun project for me that I hope will be a useful tool and save some time in my friend's workflow.
Rain on my parade in this case is perhaps encouraged since I know so little about the subject.
Thanks!
I would suggest taking a look at tools like Primer3 (there is a Biopython module for this, I believe), which is built specifically for designing and optimising primers from query sequences and will guard against many of the biological pitfalls.
It may even allow you to do the web search itself, but if not, Entrez is the right way to go.
As a note for your friend, I would also be surprised if there aren't already published primer sets for clinically significant pathogens like RSV, so he too may be reinventing the wheel, but perhaps he's ahead of me here.
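For the Entrez route, a minimal sketch of a search against the NCBI E-utilities `esearch` endpoint using only the standard library might look like the following. The helper names and the query term are hypothetical, and Biopython's `Bio.Entrez` module wraps the same service if you would rather not build URLs by hand. NCBI also asks heavy users to identify themselves and limit request rates, so check their usage guidelines before automating this.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, db="nucleotide", retmax=20):
    """Build an esearch URL that asks for matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return EUTILS + "?" + urlencode(params)


def search_ids(term, **kwargs):
    """Run the search (needs network access) and return the list of record IDs."""
    with urlopen(esearch_url(term, **kwargs)) as resp:
        return json.load(resp)["esearchresult"]["idlist"]


# Hypothetical query term; only the URL is built here, no request is sent.
url = esearch_url("Human respiratory syncytial virus[Organism]")
print(url)
```

The returned IDs can then be passed to the `efetch` endpoint to download the sequences themselves.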
Thanks for the suggestions. RSV has just been the test case to make sure the script works as expected. He hasn't asked me to help him with designing the primers or probe, so I haven't looked into it at all. Really, it's just reducing the amount of manual busy work he has to do, which I guess includes checking whether primers are still valid for a given pathogen. Really appreciate all the replies here at Biostar!
Edit: I believe the primers/probe he gave me for testing are published ones, where testing means, 'does the script produce the expected results', not trying to create new primers or probes or anything.