3.9 years ago
2021yearsold
My aim was to retrieve the genes that contain a consensus motif from a file of gene sequences in FASTA format. I solved that with the SeqKit tool.
For example, if my motif looks like this, I can write:
seqkit grep -srip 'G[TA][ATC]AGCA[TAC]' input.fasta > output1.fasta
Some of my motifs contain several ambiguous bases, and I am not sure which region of each sequence they match. So my question is:
- How can I get the target sequence that matches the consensus motif?
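One way to see the matched region itself is to report the coordinates and the matched text for every hit. SeqKit has a `locate` subcommand intended for this; the block below is only a minimal sketch of the same idea with plain awk, so it runs without SeqKit. The toy sequences and the names `fasta`, `motif` and `hits` are invented for illustration, and the sketch assumes unwrapped FASTA (one sequence per line), forward strand only, and non-overlapping matches.

```shell
# Hypothetical toy input: gene1 carries two motif instances, gene2 none.
fasta=$(mktemp)
printf '>gene1\nccGTAAGCATccGATAGCACtt\n>gene2\naaaaaaaa\n' > "$fasta"

motif='G[TA][ATC]AGCA[TAC]'

# For each sequence line, print: ID, start, end (1-based), matched text.
hits=$(awk -v re="$motif" '
  /^>/ { id = substr($1, 2); next }       # remember the record ID
  {
    seq = toupper($0)                      # ignore case, like -i
    offset = 0                             # bases already consumed
    while (match(seq, re)) {
      start = offset + RSTART
      end = start + RLENGTH - 1
      printf "%s\t%d\t%d\t%s\n", id, start, end, substr(seq, RSTART, RLENGTH)
      offset += RSTART + RLENGTH - 1       # continue after this match
      seq = substr(seq, RSTART + RLENGTH)
    }
  }' "$fasta")

printf '%s\n' "$hits"
rm -f "$fasta"
```

For the toy input this prints two rows for gene1 (positions 3-10 with GTAAGCAT and 13-20 with GATAGCAC) and nothing for gene2, which shows exactly which bases each ambiguous position resolved to.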
Edit:
What does -srip do in the above command? I ask because when I use
grep -o "G[TA][ATC]AGCA[TAC]" input.fasta > output2.fasta
not all of the FASTA records in output1.fasta have a corresponding motif match in output2.fasta.
seqkit grep -srip xxx
is equivalent to
seqkit grep -s -r -i -p xxx
Thanks @shenwei356. The answer to my problem was in the -s option: when matching by sequence, SeqKit searches both strands. I now add -P to search only the positive strand.
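The mismatch between the two outputs can be reproduced with standard tools alone. The sketch below is only an illustration under stated assumptions: the two toy sequences are invented, the FASTA is unwrapped (one sequence per line), and `rev` and `tr` are available. gene1 carries the motif on the forward strand, gene2 only on the reverse strand, so a plain forward grep reports gene1 alone while a both-strand search finds both.

```shell
# Hypothetical toy input: gene2 holds the reverse complement of GATAGCAC.
fasta=$(mktemp)
printf '>gene1\nccGTAAGCATcc\n>gene2\nttGTGCTATCtt\n' > "$fasta"

motif='G[TA][ATC]AGCA[TAC]'

# Forward strand only: this is all that plain grep can see.
forward_hits=$(grep -v '^>' "$fasta" | grep -oi "$motif")

# Reverse strand: reverse each sequence line, complement it, search again.
reverse_hits=$(grep -v '^>' "$fasta" | rev | tr 'ACGTacgt' 'TGCAtgca' | grep -oi "$motif")

echo "forward: $forward_hits"   # forward: GTAAGCAT
echo "reverse: $reverse_hits"   # reverse: GATAGCAC
rm -f "$fasta"
```

The `-i` on grep mirrors SeqKit's case-insensitive matching; without it, lowercase sequences would be a second source of missing hits.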