23 months ago
Jimpix
Hi!
I have a folder with multiple FASTA files; each file has a few sequences like this:
>KLTH0E08624g KLTH0E08624g
MAREITDIKEFLELARRADVKTATVKINKKLNKSGKAFRQTKFKVRGSRYLYTLIVNDAG
I need to write a bash script that parses those files and gives each sequence a new header made of the first 4 letters of the old one:
>KLTH
MAREITDIKEFLELARRADVKTATVKINKKLNKSGKAFRQTKFKVRGSRYLYTLIVNDAG
and save these files in another folder. I am new to bash and cannot handle it by myself. For now I have:
for f in $(ls path_to_folder/GL3*.fasta)
do
# here bash command to correct that headers and save in:
"/corrected/$f"
done
Kindly help
This is probably the most asked question on the forum - have you looked at other threads for ideas?
I do not know how to extract exactly the first 4 letters. I have checked other posts, but it is not clear to me.
Here's a hint (take the first 5 characters if you want to keep the > header marker too):
"${string:0:5}"
https://stackoverflow.com/questions/8928224/trying-to-retrieve-first-5-characters-from-string-in-bash-error
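Putting the hint together with the loop from the question, one possible sketch looks like the following. It assumes bash (the substring expansion is not plain sh), that header lines start with >, and that the output folder may need creating. The function name shorten_headers, the variables in_dir and out_dir, and the demo file GL3_demo.fasta are made up for illustration; only the GL3*.fasta pattern and the sample record come from the question.

```shell
#!/usr/bin/env bash
# Sketch: shorten every FASTA header to ">" plus its first 4 letters.
# shorten_headers, in_dir and out_dir are hypothetical names, not from the thread.

shorten_headers() {
    local in_dir=$1 out_dir=$2 f line
    mkdir -p "$out_dir"
    for f in "$in_dir"/GL3*.fasta; do
        [ -e "$f" ] || continue      # the glob matched nothing
        # "|| [ -n "$line" ]" keeps a last line that lacks a trailing newline
        while IFS= read -r line || [ -n "$line" ]; do
            case $line in
                '>'*) printf '%s\n' "${line:0:5}" ;;   # ">" + first 4 letters
                *)    printf '%s\n' "$line" ;;         # sequence line, unchanged
            esac
        done < "$f" > "$out_dir/$(basename "$f")"
    done
}

# Demo on a throwaway directory, using the record from the question
demo_dir=$(mktemp -d)
printf '%s\n' '>KLTH0E08624g KLTH0E08624g' \
    'MAREITDIKEFLELARRADVKTATVKINKKLNKSGKAFRQTKFKVRGSRYLYTLIVNDAG' \
    > "$demo_dir/GL3_demo.fasta"
shorten_headers "$demo_dir" "$demo_dir/corrected"
cat "$demo_dir/corrected/GL3_demo.fasta"
```

Globbing directly instead of looping over $(ls ...) avoids trouble with spaces in file names, and basename keeps the input path out of the output file name. Note that truncating headers to 4 letters can leave several sequences in one file with identical headers, which some downstream tools may not accept.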