Averaging multiple columns
2
1
Entering edit mode
5.4 years ago
vinayjrao ▴ 250

Hi,

I have a file with five columns, from which I want the mean of the last four columns where the first rows of the first column are the same. For example, my file looks like this -

A 1 1 1 1
A 1 3 1 1
A 1 1 2 3
B 5 7 2 4
C 2 1 5 1
C 2 2 3 6

The desired output is -

A 1 1.7 1.3 1.7
B 5 7 2 4
C 2 1.5 4 3.5

Any help would be appreciated.

Thanks.

R shell • 1.2k views
ADD COMMENT
2
Entering edit mode
5.4 years ago
import pandas as pd
df = pd.read_csv("tmp.txt", sep="\t", header=None)
df
    0   1   2   3   4
0   A   1   1   1   1
1   A   1   3   1   1
2   A   1   1   2   3
3   B   5   7   2   4
4   C   2   1   5   1
5   C   2   2   3   6

df.groupby(0).mean()

1   2   3   4
0               
A   1.0 1.666667    1.333333    1.666667
B   5.0 7.000000    2.000000    4.000000
C   2.0 1.500000    4.000000    3.500000

df.groupby(0).mean().to_csv("tmp_mean.txt", sep="\t", header = None)
ADD COMMENT
0
Entering edit mode

Thank you for your help. It works perfectly.

ADD REPLY
1
Entering edit mode
5.4 years ago
zx8754 12k

Using R:

# example data
df1 <- read.table(text = "
A 1 1 1 1
A 1 3 1 1
A 1 1 2 3
B 5 7 2 4
C 2 1 5 1
C 2 2 3 6
", header = FALSE)

aggregate(df1[ -1 ], df1[ 1 ], FUN = mean)
#    V1 V2       V3       V4       V5
# 1  A  1 1.666667 1.333333 1.666667
# 2  B  5 7.000000 2.000000 4.000000
# 3  C  2 1.500000 4.000000 3.500000
ADD COMMENT
0
Entering edit mode

Thank you for your reply. I am still working on it.

Will update once done.

ADD REPLY

Login before adding your answer.

Traffic: 2644 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6