I want to use the Mac OS X terminal to clip a repetitive sequence from the beginning of each sequence in a FASTA file. For example, I would like to make the following file:
>seq1
CCCCAAAACCCCATGATCATGGATC
>seq2
CCCCAAAACCCCATGGCATCATTCA
>seq3
CCCCAAAACCCCATGTTGCTACTAG
become:
>seq1
ATGATCATGGATC
>seq2
ATGGCATCATTCA
>seq3
ATGTTGCTACTAG
by clipping off the CCCCAAAACCCC at the beginning of each sequence. Is there a way I can do this in the OSX terminal?
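Why don't you use something like the FASTX-Toolkit? Its fastx_trimmer tool can remove a fixed number of bases from the start of every sequence. http://hannonlab.cshl.edu/fastx_toolkit/index.html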
That seems pretty handy. Thank you for the tip. I'll try it out.
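If installing a toolkit is more than you need, a plain sed one-liner may also work for this particular case. The following is only a sketch under a few assumptions: each sequence sits on a single line (no wrapped FASTA), the prefix is exactly CCCCAAAACCCC, and the file names input.fasta and clipped.fasta are made up for illustration. The first part just recreates the example file from the question so the command can be tried as-is.

```shell
# Recreate the example from the question (input.fasta is a hypothetical name).
cat > input.fasta <<'EOF'
>seq1
CCCCAAAACCCCATGATCATGGATC
>seq2
CCCCAAAACCCCATGGCATCATTCA
>seq3
CCCCAAAACCCCATGTTGCTACTAG
EOF

# For every line that is NOT a header (does not start with ">"),
# delete CCCCAAAACCCC if it appears at the very start of the line.
# Header lines and sequences without the prefix pass through unchanged.
sed '/^>/!s/^CCCCAAAACCCC//' input.fasta > clipped.fasta

# Show the result.
cat clipped.fasta
```

Because the pattern is anchored with ^, only a copy of the prefix at the start of a line is removed; the same bases occurring later in a sequence are left alone. If the sequences are wrapped across several lines, this approach would strip the prefix from any continuation line that happens to start with it, so the file would need to be unwrapped first.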