Hi all,
I have OTU list with 73 samples (columns) and now I want to extract top 10 for each column including the respective annotation (genus level) provided at the end. This data shows a few only but actually i have hundreds of rows. The data looks like this (subset). Can you please help how to do this?
ID-a ID-b ID-c ID-d ID-e ID-f ID-g Genus
3709 5664 54 11 5 2 0 Methylocystis
1518 399 3 0 0 0 0 Kineococcus
15215 4016 46 3 2 0 0 Curvibacter
5 0 1 1 1 1 0 Sulfuritalea
13098 4699 47 5 3 13 10 Bradyrhizobium
5565 6451 2101 2774 1513 203 120 Dyella
The output would read like, separate arrangements of OTU annotation for each column (highest on top). And annotation may come next to each column. This can be done manually by sort function in excel but I would appreciate to know a quick smart way of doing this. Thanks
While it may indeed be possible to do this automagically in excel it will require VB scripting or something similar. You will likely get pushback on this for trying to use excel to do bioinformatics. It is not a great idea to do so.
You may want to export the data from excel into a delimited (comma or tab) format. At that point you can use unix
sort
(by column) to extract the info you need withcut
orawk
.