Dear all, I would like some help, if possible.
How do I change the taxonomy strings using cut (shell), awk and sed?
My current taxonomy has no abbreviation code at the beginning of each level. I would like to put the rank code (d: domain, p: phylum, etc.) in front of each taxonomy level.
For example
I have: Eukaryota;__Opisthokonta;__Metazoa;__Arthropoda;__Hexapoda;__Insecta
I need it to look like this: d__Eukaryota;__Opisthokonta;p__Arthropoda;__Hexapoda;c__Insecta
I think I could connect awk and sed with a pipe, but I do not know how to pass the value of an awk variable to sed.
The Greengenes database already includes such abbreviations, but I am using the SILVA database, which does not contain them. This makes the plots of my results difficult to read. Any suggestions or help?
Thanks!
You'll end up needing to create lookup tables of domains, phyla, etc. for this to work, because nothing in the lineage string itself says which rank a name belongs to. That makes things complicated enough that you're better off using Python or Perl, since awk and sed start getting pretty unwieldy.
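
To make the lookup-table idea concrete, here is a minimal Python sketch. The table RANK_OF and the function add_rank_prefixes are hypothetical names made up for illustration, and the table only holds the three names from the example in the question; a real table would have to be built from a source that actually lists the rank of each taxon. Names that are not in the table keep their bare "__" prefix.

```python
# Hypothetical lookup table: taxon name -> one-letter rank code.
# Only the names from the example in the question are filled in;
# a real table would need an entry for every ranked taxon.
RANK_OF = {
    "Eukaryota": "d",   # domain
    "Arthropoda": "p",  # phylum
    "Insecta": "c",     # class
}


def add_rank_prefixes(lineage, rank_of=RANK_OF):
    """Prefix each level of a ';'-separated lineage with its rank code."""
    labelled = []
    for level in lineage.strip().split(";"):
        # Drop any existing leading underscores to get the bare name.
        name = level.strip().lstrip("_")
        if not name:
            # Keep empty levels (e.g. from a trailing ';') unchanged.
            labelled.append(level)
            continue
        # Unknown names get an empty code, i.e. just "__Name".
        code = rank_of.get(name, "")
        labelled.append(f"{code}__{name}")
    return ";".join(labelled)


if __name__ == "__main__":
    example = "Eukaryota;__Opisthokonta;__Metazoa;__Arthropoda;__Hexapoda;__Insecta"
    print(add_rank_prefixes(example))
    # prints: d__Eukaryota;__Opisthokonta;__Metazoa;p__Arthropoda;__Hexapoda;c__Insecta
```

Note that this keeps every level, including ones with no known rank such as Metazoa in the example; dropping levels would be a separate step. To process a whole file, you could call add_rank_prefixes on the taxonomy column of each line.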