I'm working on several .fasta files whose names have the form chr_start_end.fasta. I want to iterate through them and work out the size of each region from the coordinates in its file name. Then I want to use that size as an input to another command for each fasta file, all within the same for loop.
To get the size from the coordinates in the file name I use, as an example:
echo "chr10_126777139_126791124.fasta" | awk -F'[_.]' '{print $3-$2}'
which yields 126791124 - 126777139 = 13985
Then I want to give the 13985 as an input to a genome assembly tool called canu, like this example:
canu -assemble -p asstest -d . -genomeSize=13985 -nanopore-raw chr10_126777139_126791124.fasta
I've tried this so far, but I can't get it to work properly.
for f in *.fasta; do Gen_size=$(echo "$f" | awk -F'[_.]' '{print $3-$2}') canu -assemble -p asstest -d . genomeSize=$Gen_size -nanopore-raw $f; done
I want to do this for several .fasta files at once. Do any of you have suggestions on how to pass the output of one command as input to the next within the same for loop? Thanks!
It looks fine. What is the error you get?
Also, the line starting with "canu" should be on a new line.
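You need a semicolon (or a newline) after your echo/awk command and before you call canu. Without it, the shell reads Gen_size=$(...) as a temporary environment assignment for the canu command on the same line, and $Gen_size in the canu arguments is expanded before that assignment takes effect, so genomeSize= ends up empty.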
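Here is a minimal sketch of the loop with the assignment and the canu call on separate lines. The helper name genome_size is my own, and the leading echo is only there to make it a dry run that prints each command instead of launching canu; drop the echo once the printed commands look right. The canu arguments are copied from the loop in the question.

```shell
# Print end - start for a file name of the form chr_start_end.fasta.
# (genome_size is a hypothetical helper name, not part of canu.)
genome_size() {
    echo "$1" | awk -F'[_.]' '{print $3-$2}'
}

for f in *.fasta; do
    [ -e "$f" ] || continue          # skip the literal pattern if no file matches
    Gen_size=$(genome_size "$f")     # the newline ends the assignment
    # Dry run: remove "echo" to actually run canu.
    echo canu -assemble -p asstest -d . genomeSize="$Gen_size" -nanopore-raw "$f"
done
```

Note that every iteration uses the same -p asstest -d . values, so you may want a per-file prefix and directory (for example "${f%.fasta}") to keep the runs apart. That part is my suggestion and was not in the original question.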
Thank you, nicely noticed.