7.6 years ago
fibar
Is there a tool that converts an abundance matrix back into something like the original FASTA file, preserving the same information? The file looks like this:
sequence sample1 sample2 sample3 ...
actgg... 43 89 23 ...
actga... 03 53 19 ...
I also have identifiers for each sequence. The output would look like:
>sample1_readIDx
actgg...
>sample1_readIDx
actgg...
...
>sample1_readIDy
actga...
The first sequence should appear 43 times with a sample1 header, 89 times with a sample2 header, and so on.
Thanks, Pierre. It ran. However, it didn't print the headers as I described them in my post; I only see an underscore followed by a number. Were you thinking of an additional step afterwards?
Yes, because I did not understand the nature of this header. Feel free to modify this simple awk script.
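For the header format described in the question (sample name, underscore, read identifier), here is a minimal Python sketch of the expansion. It is not the awk script mentioned above. It assumes a whitespace-separated matrix whose first line names the samples, and a hypothetical `ids` dictionary mapping each sequence to its identifier; the function name and the demo data are made up for illustration.

```python
import io

def expand_abundance(table, ids):
    """Yield FASTA lines: each sequence is repeated once per count, per sample.

    table: iterable of whitespace-separated lines; the first line is the
           header 'sequence sample1 sample2 ...'.
    ids:   hypothetical dict mapping sequence -> read identifier.
    """
    rows = iter(table)
    samples = next(rows).split()[1:]          # drop the 'sequence' column name
    for line in rows:
        fields = line.split()
        if not fields:                        # skip blank lines
            continue
        seq, counts = fields[0], fields[1:]
        for sample, count in zip(samples, counts):
            for _ in range(int(count)):       # int("03") parses as 3
                yield f">{sample}_{ids[seq]}"
                yield seq

# Made-up demo data, not the real matrix from the question.
demo = io.StringIO("sequence sample1 sample2\nactgg 2 1\nactga 03 0\n")
fasta = list(expand_abundance(demo, {"actgg": "readIDx", "actga": "readIDy"}))
print("\n".join(fasta))
```

Records come out row by row (all samples for the first sequence, then the next sequence). Repeated records share the same header, as in the example output in the question; if a downstream tool needs unique headers, a running counter could be appended to each one.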