Comparative boxplots with R
10.1 years ago
l0ka ▴ 10
sample    NAF           TAF
00450     0.5211098     0.5310629
00450     0.5193542     0.5286942
00450     0.5262824     0.5199457
00450     0.5230269     0.5252758
00450     0.5169092     0.5223112
00160     0.5221299     0.5324644
00160     0.5319794     0.5531024
00160     0.5233437     0.5358770
00160     0.5242215     0.5224607
00160     0.5152723     0.5229491
00810     0.5127049     0.5222062
00810     0.5263669     0.5320754
00810     0.5157763     0.5267149
00810     0.5433680     0.5671679
00810     0.5242678     0.5248383

Hi, I have a big data frame like the one above. I need to draw boxplots comparing NAF with TAF for each sample. There are around 100 different samples, so I probably need to split the data. How should I do this?

With this code:

boxplot(NAF ~ sample, TAF ~ sample, data=data, las=2, varwidth=T)

It plots only NAF, not TAF...

R • 6.0k views

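I think I did this once with lattice graphics. I can't remember the full code, but a basic version that might help looks like this:

# melt the data into long format (columns: sample, variable, value)
library(reshape2)
data.long = melt(data, id.vars="sample")

# one panel per sample, with NAF and TAF side by side in each panel
library(lattice)
bwplot(value ~ variable | factor(sample), data=data.long)

Then just play around with the parameters of bwplot.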

10.1 years ago
Martombo ★ 3.1k

My suggestion would be to use the packages reshape2 and ggplot2. With those you can easily reshape the data and plot it in a nicer way than with the R defaults.

You just need the following commands:

library(ggplot2)
library(reshape2)

reshaped_data=melt(data,id="sample")

p=ggplot(reshaped_data,aes(x=as.factor(sample),y=value,fill=variable))

p+geom_boxplot()
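
For anyone who wants to try this without the original file, here is a minimal self-contained sketch of the same melt + ggplot approach. It uses six rows copied from the question; the names `df`, `long` and `p` are only illustrative, and reading `sample` as character is my own assumption, made so that IDs such as 00450 keep their leading zeros.

```r
library(ggplot2)
library(reshape2)

# Six rows copied from the question; colClasses keeps "00450" from becoming 450
df <- read.table(text = "
sample NAF TAF
00450 0.5211098 0.5310629
00450 0.5193542 0.5286942
00450 0.5262824 0.5199457
00160 0.5221299 0.5324644
00160 0.5319794 0.5531024
00160 0.5233437 0.5358770
", header = TRUE, colClasses = c("character", "numeric", "numeric"))

# Long format: one row per (sample, variable, value)
long <- melt(df, id.vars = "sample")

# One NAF box and one TAF box side by side for each sample;
# rotated axis labels are an optional extra for when there are many samples
p <- ggplot(long, aes(x = sample, y = value, fill = variable)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
print(p)
```

With around 100 samples the x axis will still be crowded, so splitting the plot into several pages may be worth considering.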
10.1 years ago

There are many ways to do this.

For example, with ggplot2:

library(ggplot2)
d = read.table('mydataset.csv', sep='\t', header=TRUE)
ggplot(d) +
    geom_boxplot(aes(x='NAF', y=NAF)) +
    geom_boxplot(aes(x='TAF', y=TAF)) +
    facet_wrap(~sample, ncol=2) +
    theme_bw() + scale_x_discrete('x axis label') + scale_y_continuous('value')

An alternative is to reshape your dataset in a 'long' format:

library(reshape2)
d.long = melt(d, id.vars='sample')
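# note: plot the reshaped d.long, not d (d has no 'variable' or 'value' columns)
ggplot(d.long, aes(x=variable, y=value)) + geom_boxplot() + facet_wrap(~sample, ncol=2)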
# you can even use lattice
library(lattice)
bwplot(value~variable|sample, d.long, layout=c(2,2))

You can use the ncol and nrow parameters in facet_wrap (or with the layout argument if you use lattice) to adjust the number of panels displayed in each row/column. If you have a lot of samples, the best thing to do is to add another column classifying the samples into smaller groups, and then plot one page per group.
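
The "one page per group" idea could look roughly like the sketch below. This is only an illustration under a few assumptions: it uses nine rows copied from the question, builds the long table with base R instead of melt, and the names `per_page`, `page` and `out_file` are hypothetical choices of mine.

```r
library(lattice)

# Nine rows copied from the question; 'sample' is read as text to keep leading zeros
d <- read.table(text = "
sample NAF TAF
00450 0.5211098 0.5310629
00450 0.5193542 0.5286942
00450 0.5262824 0.5199457
00160 0.5221299 0.5324644
00160 0.5319794 0.5531024
00160 0.5233437 0.5358770
00810 0.5127049 0.5222062
00810 0.5263669 0.5320754
00810 0.5157763 0.5267149
", header = TRUE, colClasses = c("character", "numeric", "numeric"))

# Long format built with base R: one row per measurement
d.long <- data.frame(sample   = rep(d$sample, times = 2),
                     variable = factor(rep(c("NAF", "TAF"), each = nrow(d))),
                     value    = c(d$NAF, d$TAF))

# Extra column assigning each sample to a page (per_page is a hypothetical setting)
per_page <- 2
ids <- unique(d.long$sample)
d.long$page <- ceiling(match(d.long$sample, ids) / per_page)

# Write one PDF page per group of samples
out_file <- file.path(tempdir(), "boxplots_by_page.pdf")
pdf(out_file)
for (pg in sort(unique(d.long$page))) {
  page_data <- d.long[d.long$page == pg, ]
  print(bwplot(value ~ variable | factor(sample), data = page_data,
               layout = c(per_page, 1)))
}
dev.off()
```

For 100 samples, something like `per_page <- 12` with `layout = c(4, 3)` would give nine pages; the exact numbers are a matter of taste.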


heh I was 2 mins late...


you posted while I was writing :),

My comment is useless now :)


Sorry, next time you will be faster than me ;-)


Thanks, it works very well!

10.1 years ago
Siva ★ 1.9k

You can try BoxPlotR which generates side-by-side box plots. This tool can be run online or locally.

10.1 years ago
EagleEye 7.6k

I hope this will work:

input <- read.table('mydata.txt', sep='\t', header=TRUE)
# stack the NAF (column 2) and TAF (column 3) values into one long data frame
mydata_frame <- data.frame(values = c(input[,2], input[,3]),
                           vars = rep(c("NAF","TAF"), each = nrow(input)))
# one box for all NAF values and one for all TAF values (samples are pooled)
boxplot(values ~ vars, data = mydata_frame)
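
If the boxes should also be split by sample, as the question asks, the same base R idea can be extended with a two-factor formula. The sketch below is only one possible way to do it; it uses nine rows copied from the question, and the name `long` and the colours are my own choices.

```r
# Nine rows copied from the question; 'sample' is read as text to keep leading zeros
input <- read.table(text = "
sample NAF TAF
00450 0.5211098 0.5310629
00450 0.5193542 0.5286942
00450 0.5262824 0.5199457
00160 0.5221299 0.5324644
00160 0.5319794 0.5531024
00160 0.5233437 0.5358770
00810 0.5127049 0.5222062
00810 0.5263669 0.5320754
00810 0.5157763 0.5267149
", header = TRUE, colClasses = c("character", "numeric", "numeric"))

# Stack NAF and TAF, keeping track of which sample each value came from
long <- data.frame(sample = factor(rep(input$sample, times = 2)),
                   vars   = factor(rep(c("NAF", "TAF"), each = nrow(input))),
                   values = c(input$NAF, input$TAF))

# 'vars' varies fastest, so each sample gets a NAF box followed by a TAF box
bp <- boxplot(values ~ vars + sample, data = long, las = 2,
              col = c("grey80", "white"), xlab = "", ylab = "value")
legend("topleft", legend = levels(long$vars), fill = c("grey80", "white"))
```

This keeps everything in one plot, so for around 100 samples it would still need to be split into chunks.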