3.6 years ago
Swarna Kanchan
Dear all,
I have a MOTIF.TXT file that contains multiple motifs (one motif per line, > 5000 motifs), for example:
$ cat MOTIF.TXT
TCGFHAHH
GHHFDSJH
AND I HAVE A SEQUNCES.FASTA FILE (> 10000 SEQUENCES) IN WHICH THE MOTIFS (MARKED IN BOLD ITALICS) MIGHT BE PRESENT AT ANY POSITION IN A SEQUENCE:
$ cat SEQUNCES.FASTA
>1
CCC***TCGFHAHH***
>2
CC***TCGFHAHH***GG
>3
TTT***GHHFDSJH***CC
NOW I WANT TO WRITE THE FREQUENCY OF MOTIF 1 (TCGFHAHH), THEN ALL THE FASTA SEQUENCES (INCLUDING HEADERS), IN A NEW FILE, AND SIMILARLY FOR MOTIF 2 IN THE SAME FILE, AND SO ON.
PLEASE HELP AND SUGGEST HOW TO DO IT.
No need to SHOUT.
Use seqkit grep and/or seqkit locate (LINK).
Thank you.
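If seqkit is not at hand, the same report can be sketched in plain Python. This is only a rough sketch under several assumptions that the question does not settle: "frequency" is taken to mean the number of sequences containing the motif at least once, matching is exact and case-sensitive, the sequences listed under each motif are only those that contain it, and the real files do not contain the bold/italic markers shown in the example. The function names and the tab-separated "motif, count" line are made up for illustration.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines.

    Sequences wrapped over several lines are joined into one string.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def build_report(motifs, records):
    """Return report lines: for each motif, 'motif<TAB>count', then its hits.

    Assumption: count = number of records whose sequence contains the motif.
    """
    report = []
    for motif in motifs:
        hits = [(h, s) for h, s in records if motif in s]
        report.append(f"{motif}\t{len(hits)}")
        for header, seq in hits:
            report.append(header)
            report.append(seq)
    return report


def write_report(motif_path, fasta_path, out_path):
    """Read the motif list and FASTA file, then write the combined report."""
    with open(motif_path) as fh:
        motifs = [line.strip() for line in fh if line.strip()]
    with open(fasta_path) as fh:
        records = list(read_fasta(fh))
    with open(out_path, "w") as out:
        out.write("\n".join(build_report(motifs, records)) + "\n")
```

Calling write_report("MOTIF.TXT", "SEQUNCES.FASTA", "motif_report.txt") would then produce one file with a block per motif (the output file name is just a placeholder). With roughly 5000 motifs and 10000 sequences this does about 50 million substring checks, which is slow but workable; for larger inputs seqkit is likely the better choice.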