Is there a tool that gives me back all possible sequences of a consensus sequence?
A simple example:
Input:
ACTCAYT
Output:
ACTCACT ACTCATT
deiupac - a quick and dirty script, but it will do what you want. That said, I have to agree with h.mon: I don't really see why you would want to recreate every possible option.
# it requires
git clone https://github.com/BioInf-Wuerzburg/perl5lib-Fasta.git
# just clone it and make it available, e.g. by putting it into PERL5LIB
export PERL5LIB=/path/to/perl5lib-Fasta/lib:$PERL5LIB;

>s1
CCTGAGGTCC
>s2
CCrGAGGTCC
>s3
CCrGAGbTCC

# converted back to
>s1.1
CCTGAGGTCC
>s2.1
CCaGAGGTCC
>s2.2
CCgGAGGTCC
>s3.1
CCaGAGcTCC
>s3.2
CCaGAGgTCC
>s3.3
CCaGAGtTCC
>s3.4
CCgGAGcTCC
>s3.5
CCgGAGgTCC
>s3.6
CCgGAGtTCC
I have a sample containing three amplified genes. For each gene, two different degenerate primers (forward and reverse) were used. Now I want to use BLAST to separate the reads of the three different genes.
The degenerate primer sequences are written as consensus sequences, and BLAST does not accept consensus sequences.
No tool that I am aware of, but a simple Perl or Python script could do this for you.
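edit: however, how do you phase the sequence when it has more than one ambiguous base? With n two-base ambiguity codes there are 2^n possibilities (more if three- or four-base codes such as B or N occur), e.g.: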
ACWCAYT
There are four possibilities:
ACACACT
ACTCATT
ACACATT
ACTCACT
Which ones to choose?
Well, I would like to create a FASTA file with all possible sequences. I do not want to choose one; I need all of them. And writing a script that can create them all does not seem easy to me.
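For what it's worth, a minimal Python sketch of such a script could look like the following. It is not the deiupac script mentioned in this thread, just an illustration: it assumes the standard IUPAC nucleotide ambiguity codes, and the names IUPAC, expand and expand_fasta are made up for this example. Characters outside the table (gaps, for instance) raise a KeyError rather than being guessed at.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(seq):
    """Yield every unambiguous sequence encoded by a consensus sequence."""
    # One string of candidate bases per position, e.g. "Y" -> "CT".
    choices = [IUPAC[base] for base in seq.upper()]
    # The Cartesian product over all positions enumerates every combination.
    for combo in product(*choices):
        yield "".join(combo)


def expand_fasta(name, seq):
    """Return the expansions as FASTA text with records name.1, name.2, ..."""
    lines = []
    for i, expanded in enumerate(expand(seq), start=1):
        lines.append(f">{name}.{i}")
        lines.append(expanded)
    return "\n".join(lines)


if __name__ == "__main__":
    print(" ".join(expand("ACTCAYT")))  # prints: ACTCACT ACTCATT
    print(expand_fasta("primer_fwd", "ACWCAYT"))
```

Because expand is a generator, the sequences are produced one at a time, so they can be written to a file without holding all of them in memory.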
What are the number of sequences, the length of the sequences, and the number of ambiguous bases per sequence? File sizes and sequence counts will grow very quickly and can easily blow up in your face.
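To get a feel for that growth before generating anything, one could count the expansions first. The sketch below assumes the standard IUPAC codes; DEGENERACY and count_expansions are hypothetical names for this example, and math.prod needs Python 3.8 or newer. The count is simply the product, over all positions, of the number of bases each code can stand for.

```python
from math import prod

# Number of bases each IUPAC nucleotide code can represent.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1, "U": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def count_expansions(seq):
    """Return how many unambiguous sequences a consensus sequence encodes."""
    # Multiply the per-position degeneracies; unknown characters raise KeyError.
    return prod(DEGENERACY[base] for base in seq.upper())


if __name__ == "__main__":
    for consensus in ("ACTCAYT", "ACWCAYT", "CCrGAGbTCC", "NNNNNNNNNN"):
        print(consensus, count_expansions(consensus))
```

A primer with ten N positions already expands to 4^10 = 1,048,576 sequences, so it is worth checking the count before writing a FASTA file.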