3.4 years ago
arshad1292
Hello, I have generated a KEGG pathway abundance table (.tsv) that contains about 450 pathways and their abundance data. However, I need to pick the 50 most abundant pathways to make a heatmap. Here is what my pathway abundance table looks like:
pathway_name sample1 sample2 sample3 sample4 sample5 sample6
A 250 1058 491 52 691 519
B 15 542 947 1165 847 1407
and so on....
Please send me a shell script, Python or R code that can help me select the 50 most abundant pathways.
Many thanks in advance!
Without knowing anything about your system, or what you want to achieve, the quickest way would be to calculate the average abundance in the samples for each pathway (e.g. rowMeans()), rank the resulting values and select the top 50.
Ok thanks that makes sense.
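For anyone finding this later, the approach suggested in the answer (mean abundance per pathway, rank, keep the top 50) could be sketched in Python roughly as follows. This is only a minimal sketch using the standard library. It assumes the table is tab-separated, has one header line, and has the pathway name in the first column with only numeric sample columns after it. The function name top_pathways and the file name in the comment are made up for illustration.

```python
import csv
import io
from statistics import mean


def top_pathways(tsv_file, n=50):
    """Return (header, rows) holding the n rows with the highest mean abundance.

    Assumes a tab-separated table: first column is the pathway name,
    every remaining column is a numeric sample abundance.
    """
    reader = csv.reader(tsv_file, delimiter="\t")
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    # Rank pathways by their mean abundance across all samples, highest first.
    rows.sort(key=lambda row: mean(float(v) for v in row[1:]), reverse=True)
    return header, rows[:n]


# Small demo with the two example rows from the question.
demo = io.StringIO(
    "pathway_name\tsample1\tsample2\tsample3\tsample4\tsample5\tsample6\n"
    "A\t250\t1058\t491\t52\t691\t519\n"
    "B\t15\t542\t947\t1165\t847\t1407\n"
)
header, top = top_pathways(demo, n=1)
print(header[0], [row[0] for row in top])

# For a real file (hypothetical name), something like:
# with open("pathway_abundance.tsv", newline="") as fh:
#     header, top = top_pathways(fh, n=50)
```

Ranking by the mean is just one choice. If the samples differ a lot in sequencing depth, it may be worth normalising the abundances first, or ranking by variance instead if the heatmap is meant to show differences between samples.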