How to pick the top 50 most abundant pathways?


3.4 years ago

arshad1292 ▴ 110

Hello, I have generated a KEGG pathway abundance table (.tsv) that contains about 450 pathways and their abundance data. However, I need to pick the top 50 most abundant pathways to make a heatmap. Here is what my pathway abundance table looks like:

pathway_name sample1 sample2 sample3 sample4 sample5 sample6
A            250     1058    491     52      691     519
B            15      542     947     1165    847     1407

and so on....

Please send me a shell script, Python or R code that can help me select the top 50 most abundant pathways.

Many thanks in advance!

pathways abundant • 904 views



Without knowing anything about your system, or what you want to achieve, the quickest way would be to calculate each pathway's mean abundance across the samples (e.g. with rowMeans() in R), rank the pathways by that value, and select the top 50.
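
The ranking described above can be sketched in Python using only the standard library. This is a minimal sketch, not a definitive solution: it assumes a tab-separated table whose first column is the pathway name and whose remaining columns are numeric sample abundances, as in the example in the question. The function name `top_pathways` and the file name in the closing comment are hypothetical.

```python
import csv
import io
from statistics import mean


def top_pathways(handle, n=50, delimiter="\t"):
    """Return (header, rows), keeping the n rows with the highest mean abundance."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    # Rank by the mean of all sample columns (everything after the name column).
    rows.sort(key=lambda row: mean(float(v) for v in row[1:]), reverse=True)
    return header, rows[:n]


# Small demo using the two rows shown in the question.
demo = (
    "pathway_name\tsample1\tsample2\tsample3\tsample4\tsample5\tsample6\n"
    "A\t250\t1058\t491\t52\t691\t519\n"
    "B\t15\t542\t947\t1165\t847\t1407\n"
)
header, top = top_pathways(io.StringIO(demo), n=1)
print(top[0][0])  # prints B (mean 820.5, versus about 510.2 for A)

# For a real table (hypothetical file name):
# with open("pathway_abundance.tsv", newline="") as fh:
#     header, top = top_pathways(fh, n=50)
```

Ranking by the mean is only one possible choice. If the samples differ a lot in sequencing depth, it may make more sense to normalise the columns first or to rank by another statistic such as the median or the variance.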