Today one assignment in a course I'm taking was to find all (CCG)*4 repeats in the q arm of human chromosome 11. Since I had a little too much time on my hands after doing it with EMBOSS's fuzznuc, I wanted to try a bash-only version and came up with:
cat 11q.fa | sed '1d' | tr -d '\n' | tr -d '\r' | egrep -io '(CCG){4}' | wc -l
fuzznuc found 18 matches, the bash version only 11. Neither takes reverse-complement matches into account by default.
I suppose fuzznuc is correct, but can anyone spot an error in the bash version?
(edit: it's called fuzznuc, not fuzzynuc)
Give us the data, the parameters you passed to fuzznuc, and its output; then maybe, if somebody has way too much time, we'll figure something out ;)
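One thing that might be worth checking in the meantime (this is a guess, not something confirmed with the actual data): grep -o only reports non-overlapping matches, so a run of five CCG units counts once, whereas a tool that reports every start position would count it twice. The sketch below shows the difference on a made-up toy sequence; `count_overlapping` is a hypothetical helper name, and the pattern is spelled out literally because not every awk supports `{4}` intervals.

```shell
# Toy sequence (made up for illustration, not taken from 11q.fa):
# five consecutive CCG units flanked by other bases.
seq='AACCGCCGCCGCCGCCGTT'

# grep -o consumes each match, so overlapping hits are not reported.
nonoverlap=$(printf '%s\n' "$seq" | grep -Eio '(CCG){4}' | wc -l | tr -d ' ')

# Hypothetical helper: count every start position of a literal pattern,
# case-insensitively, by restarting the search one base after each hit.
count_overlapping() {
  awk -v pat="$1" '
    {
      s = toupper($0)
      n = 0
      while (match(s, pat)) {
        n++
        s = substr(s, RSTART + 1)   # step one base past the match start
      }
      total += n
    }
    END { print total + 0 }'
}

overlap=$(printf '%s\n' "$seq" | count_overlapping 'CCGCCGCCGCCG')
echo "non-overlapping: $nonoverlap, overlapping: $overlap"
```

To try it on the real file, something like `sed '1d' 11q.fa | tr -d '\r\n' | count_overlapping 'CCGCCGCCGCCG'` should work, assuming your awk can handle a single very long line. If that number comes out as 18, overlapping matches would explain the gap.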