Hey everyone,
When I want to count base pairs (A, C, G, T) in many FASTA files, I usually use:
for F in *.fna; do N=$(basename "$F" .fna)_count_bps.txt; grep -v '^>' "$F" | wc | awk '{print $3-$1}' > "$N"; done
However, if my FASTA files contain characters other than A, C, G, and T, these are included in the total count. Is there a way to modify my code so that I only get the total count of A, C, G, and T in each FASTA file?
Thanks!
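
One possible approach, as a minimal sketch rather than a definitive answer: delete every character that is not A, C, G, or T with tr before counting. The helper name count_acgt is hypothetical. The sketch assumes header lines start with ">" and that lowercase (soft-masked) bases should be counted too; drop "acgt" from the tr set if they should not.

```shell
# Hypothetical helper: print the number of A/C/G/T characters in one FASTA file.
# Assumption: lowercase bases count as well; remove "acgt" below to exclude them.
count_acgt() {
    # Drop header lines, delete everything except A/C/G/T (including newlines),
    # count the remaining bytes, and strip any padding that wc adds.
    grep -v '^>' "$1" | tr -cd 'ACGTacgt' | wc -c | awk '{print $1}'
}

# Write one count file per FASTA file, using the same naming as the original loop.
for F in *.fna; do
    [ -e "$F" ] || continue   # skip the literal pattern when no .fna files exist
    count_acgt "$F" > "$(basename "$F" .fna)_count_bps.txt"
done
```

For a file whose sequence lines are ACGTNN and acgt-, this prints 8, since N and - are dropped before counting.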