I have a file whose contents follow a specific pattern. I would like to split it into multiple files at each pattern match, and each output file should be named after the text that follows the pattern. Example:
P1_1r6r
NRVSTVQQLTKRFSLGMLQGRGPLKLFMALVAFLRFLTIPPTAGILKRWGTIKKSKAINV
LRGFRKEIGRMLNILNRRRRRVSTVQQLTKRFSLGMLQGRGPLKLFMALVAFLRFLTIP
P1_1sfk
MALVAFLRFLTIPPTAGILKRWGTIKKSKAINVLRGFRKEIGRMLNILNRRRRRVSTVQQ LTKRFSLGMLQGRGPLKLFMALVAFLRFLTIPPTAGILKRWGTIKKSKAINVLRGFRKEI
P1_12562
RFSLPLKLFMALVAFLRFLTIPPTAGILKRWGTIKKSKAINVLRGFRKEIGRM LNILNRRRRRVSTVQQLTKRFSLGMLQGRGPLKLFMALVAFLRFLTIPPTAGILKRWGTI
So here the pattern is P1. I want to split the above file into 3 different files, with file names like 1r6r, 1sfk, 12562.
Thanks
Your input format is not clear. Is it FASTA?
With awk and sed:
Input:
command:
output:
Note: all amino acids are on a single line after the identifier (every 2nd line, directly following its identifier).
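For anyone who prefers Python, here is a minimal sketch of the same kind of split. It is not the awk/sed command from this answer, just one possible approach under some assumptions: every record starts with a header line beginning with `P1_`, the part after `P1_` becomes the file name, and the header line itself is not written to the output. The function names `split_records` and `write_records` are made up for this illustration.

```python
from pathlib import Path


def split_records(lines, prefix="P1_"):
    """Group sequence lines under the most recent header starting with `prefix`.

    Returns a dict mapping the header text after the prefix (e.g. "1r6r")
    to the list of sequence lines that follow it.
    """
    records = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(prefix):
            # New record: the name is whatever follows the prefix.
            current = line[len(prefix):]
            records.setdefault(current, [])
        elif current is not None:
            # Sequence line belonging to the current record.
            records[current].append(line)
    return records


def write_records(records, out_dir="."):
    """Write one file per record, named after the record (assumption: no extension)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, seq_lines in records.items():
        (out / name).write_text("\n".join(seq_lines) + "\n")
```

To use it on a file (the name `input.txt` is hypothetical), something like `write_records(split_records(open("input.txt")))` should produce the files `1r6r`, `1sfk` and `12562` in the current directory. Unlike the note above, this sketch does not require each sequence to be on a single line; wrapped sequence lines are kept as they are.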
Probably a duplicate of "How To Split One Big Sequence File Into Multiple Files With Less Than 1000 Sequences In A Single File"; "How To Split A Multiple Fasta"; ...