Hello,
I have data on males and females across time for various phenotypes. I began by binning my data in 20-year increments.
Data$cuts <- cut(Data$year, breaks = c(seq(min(Data$year), max(Data$year), 20), max(Data$year)), labels = FALSE)
This now produces a cut or bin with a value from 1-8 for every individual in my dataset.
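A side note on the binning call, as a minimal sketch with made-up years: by default cut() builds intervals that are open on the left, so the individual(s) with the earliest year fall outside the first bin and get NA unless include.lowest = TRUE is passed. The year values below are hypothetical and chosen only so that the breaks give 8 bins. Separately, if max(Data$year) happens to land exactly on the 20-year grid, the breaks vector contains a duplicate and cut() stops with an error.

```r
# Hypothetical years, chosen so the breaks produce exactly 8 bins.
year <- c(1850, 1851, 1990, 2000)
brks <- c(seq(min(year), max(year), 20), max(year))

# Default: intervals are (lower, upper], so the minimum year is not binned.
bins_default <- cut(year, breaks = brks, labels = FALSE)

# include.lowest = TRUE closes the first interval on the left.
bins_closed <- cut(year, breaks = brks, labels = FALSE, include.lowest = TRUE)

print(bins_default)
print(bins_closed)
```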
I am then trying to output the difference in mean between males and females for a trait in each time bin.
for (i in 1:8) {
  difmean <- c()
  Mcuts <- DataM[ which(DataM$cuts=='i'),]
  Fcuts <- DataF[ which(DataF$cuts=='i'),]
  Mmean <- mean(Mcuts$trait, na.rm = TRUE)
  Fmean <- mean(Fcuts$trait, na.rm = TRUE)
  difmean <- c(Mmean-Fmean)
  print (difmean)
}
I get an output of the following:
[1] NaN
[1] NaN
[1] NaN
[1] NaN
[1] NaN
[1] NaN
[1] NaN
[1] NaN
Any help would be greatly appreciated!
Got it: you use 'i' instead of i in DataM$cuts=='i', and cuts is never the string 'i'. Both subsets are therefore empty, and mean() of an empty vector returns NaN.
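For reference, here is a minimal sketch of the loop with the quotes removed. The toy data frame is made up for illustration and only stands in for the real DataM and DataF. The sketch also stores the results in a vector created before the loop, which is an assumption about what is wanted, and the tapply() lines at the end show one possible loop-free way to get the same numbers.

```r
# Toy data standing in for the poster's data (values are made up).
set.seed(1)
Data <- data.frame(
  sex   = rep(c("M", "F"), each = 40),
  cuts  = rep(1:8, times = 10),
  trait = rnorm(80)
)
DataM <- Data[Data$sex == "M", ]
DataF <- Data[Data$sex == "F", ]

# Create the result vector once, outside the loop, so values are kept.
difmean <- numeric(8)
for (i in 1:8) {
  # Compare against the loop variable i, not the string 'i'.
  Mmean <- mean(DataM$trait[DataM$cuts == i], na.rm = TRUE)
  Fmean <- mean(DataF$trait[DataF$cuts == i], na.rm = TRUE)
  difmean[i] <- Mmean - Fmean
}
print(difmean)

# Loop-free alternative: per-bin means by sex, then subtract the columns.
means <- tapply(Data$trait, list(Data$cuts, Data$sex), mean, na.rm = TRUE)
difmean2 <- means[, "M"] - means[, "F"]
```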
Thank you!! It is working now, much appreciated.
Is there a way to assess significance of a linear model with binned data? I pasted some code below that generates the regression line, but I don't get p-values from the summary. Maybe I need to bootstrap and just look at confidence intervals?
I think you should start a new thread for that question.
Do Data, DataM, and DataF have the same number of rows? Is trait a column in DataM and DataF?
DataM and DataF have different numbers of rows, but the same columns. $trait is a column in both datasets. DataM and DataF were generated like so:
Side note: Why use which() when just specifying DataM<-Data[Data$sex=="M",] would work just fine?
You're right, it was just how I left it during processing.
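One case where the two forms differ, sketched with a made-up data frame: if the column being compared contains NA, plain logical indexing returns an extra all-NA row for each missing value, while which() drops those positions. Whether this matters here depends on whether Data$sex has missing values, which the thread does not say.

```r
# Toy data with one missing sex value (made up for illustration).
Data <- data.frame(sex = c("M", "F", NA, "M"), trait = c(1.2, 0.8, 1.0, 1.5))

# Logical indexing: the NA comparison produces a row filled with NA.
by_logical <- Data[Data$sex == "M", ]

# which() returns only the positions that are TRUE, so the NA is dropped.
by_which <- Data[which(Data$sex == "M"), ]

print(nrow(by_logical))
print(nrow(by_which))
```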