Question: Optimize the parameter span for loess (or spar for smooth.spline)

10.5 years ago

wenbo • 0

Hi,

Does anyone know how to optimize the smoothing parameter for loess or smooth.spline?

In loess and smooth.spline, the parameter span (spar for smooth.spline) strongly affects the fit. I have tried the following method to optimize spar for smooth.spline:

tuneSpline = function(x,y,span.vals=seq(0.1,1,by=0.05),fold=10){
    mae <- numeric(length(span.vals))
    fun.fit <- function(x,y,span) {smooth.spline(x = x,y = y,spar = span)}
    fun.predict <- function(fit,x0) {predict(fit,x0)$y}

    ii = 0
    for(span in span.vals){
        ii <- ii+1
        y.cv <- crossval(x,y,fun.fit,fun.predict,span=span,ngroup = fold)$cv.fit
        fltr <- !is.na(y.cv)
        save(fltr,y.cv,y,file="tmp.rda")
        mae[ii] <- mean(abs(y[fltr]-y.cv[fltr]))
    }
    span <- span.vals[which.min(mae)]
    return(span)
}
require(graphics)
attach(cars)
tuneSpline(speed,dist,fold = length(dist))
## return 0.1

But the spar value selected by this method is always 0.1 (the minimum value in span.vals).

This is weird, and I think the result may not be right.

Can anyone help me with this issue?

Best regards!

XianWu

loess R smooth.spline • 7.2k views

ADD COMMENT • link updated 3.3 years ago by Ram 45k • written 10.5 years ago by wenbo • 0

Your code contains a couple of typos, namely a missing argument 'y' on the third line and a missing opening bracket on the line with the 'fltr' variable assignment. After fixing these, I ran the code and got 0.75 as the optimal value.

PS: I don't see how this question relates to bioinformatics; it's more suitable for StackOverflow.

ADD REPLY • link updated 3.3 years ago by Ram 45k • written 10.5 years ago by lomereiter ▴ 500

Pay attention to what you're doing. Why would you want to 'tune' the span of loess?

loess is used to draw a curve that doesn't pass exactly through all the points. If you measure accuracy as how far the points are from the curve, it follows that the optimal span approaches zero, at which point you've connected all the dots and made a zigzag. People use loess because they want a smooth curve that may miss points with assumed error, so the optimal value is a subjective matter of simplicity, or perhaps of the second derivative (curve sharpness).
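
A minimal sketch of this point, assuming the built-in cars data and using smooth.spline rather than loess (the names spar.vals, rss and best.in.sample are illustrative only). When the error is measured on the same points that were used for fitting, the residual sum of squares can only grow as the smoother gets stiffer, so the smallest value in the grid is selected:

```r
# In-sample error as a function of spar: there is no held-out data here,
# so the wiggliest fit (smallest spar) has the smallest residuals.
spar.vals <- seq(0.1, 1, by = 0.05)
rss <- sapply(spar.vals, function(s) {
    fit <- smooth.spline(cars$speed, cars$dist, spar = s)
    sum((cars$dist - predict(fit, cars$speed)$y)^2)
})
best.in.sample <- spar.vals[which.min(rss)]
print(best.in.sample)
```

This is the behaviour to expect from any tuning loop that accidentally scores the fit on its own training points rather than on held-out ones.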

ADD REPLY • link updated 3.3 years ago by Ram 45k • written 10.5 years ago by karl.stamm 4.1k

It's a valid method of non-parametric estimation because of the use of cross-validation. See e.g. http://www.uab.ro/auajournal/upload/49_539_Nicoleta_Breaz-2.pdf for details.
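
As a rough illustration of why cross-validation changes the picture, here is a base-R leave-one-out sketch that does not depend on any crossval implementation. The helper names loo.mae and tune.spar are hypothetical, and the built-in cars data is assumed. Each point is predicted from a fit that never saw it, so an overly wiggly curve is penalized rather than rewarded:

```r
# Leave-one-out cross-validation for smooth.spline in base R.
# loo.mae() and tune.spar() are hypothetical helpers, not library functions.
loo.mae <- function(x, y, spar) {
    errs <- vapply(seq_along(x), function(i) {
        fit <- smooth.spline(x[-i], y[-i], spar = spar)  # fit without point i
        abs(y[i] - predict(fit, x[i])$y)                 # error on the held-out point
    }, numeric(1))
    mean(errs)
}

tune.spar <- function(x, y, spar.vals = seq(0.1, 1, by = 0.05)) {
    mae <- vapply(spar.vals, function(s) loo.mae(x, y, s), numeric(1))
    spar.vals[which.min(mae)]  # grid value with the lowest held-out MAE
}

best.cv <- tune.spar(cars$speed, cars$dist)
print(best.cv)
```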

ADD REPLY • link updated 3.3 years ago by Ram 45k • written 10.5 years ago by lomereiter ▴ 500

If you run smooth.spline on the same data without passing a spar argument, is the fitted smoothing parameter that R returns less than 0.1?

Where did you get crossval from? The parameters for crossval::crossval are (predfun, X, Y, K, B, verbose, ...), so your argument order doesn't match that.

Please ignore this comment

ADD REPLY • link updated 3.3 years ago by Ram 45k • written 10.5 years ago by russhh 5.8k

Ram · Answer 1 · 2015-01-27

10.5 years ago

russhh 5.8k

With a working implementation, your approach works fine.

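# Pick the spar value with the lowest cross-validated mean absolute error (MAE).
tuneSpline <- function(x, y, span.vals = seq(0.1, 1, by = 0.05), fold = 10) {
    require(bootstrap)  # provides crossval()
    # Fit and predict wrappers in the form that bootstrap::crossval expects
    fun.fit <- function(x, y, span) {smooth.spline(x = x, y = y, spar = span)}
    fun.predict <- function(fit, x0) {predict(fit, x0)$y}

    # One cross-validated MAE per candidate spar value
    mae <- sapply(span.vals, function(span) {
        y.cv <- bootstrap::crossval(
            x, y, fun.fit, fun.predict, span = span, ngroup = fold
        )$cv.fit
        fltr <- which(!is.na(y.cv))  # drop any missing predictions
        mean(abs(y[fltr] - y.cv[fltr]))
    })
    span.vals[which.min(mae)]
}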

attach(cars)
tuneSpline(speed,dist,fold = length(dist))

# 0.75

ADD COMMENT • link updated 3.3 years ago by Ram 45k • written 10.5 years ago by russhh 5.8k

Do you think this method is good?

ADD REPLY • link updated 3.3 years ago by Ram 45k • written 10.5 years ago by wenbo • 0

When I ran smooth.spline(speed,dist)$spar, I got a spar value of 0.7801305, which is very close to the value from tuneSpline. It looks like smooth.spline can do the optimization itself and select the best spar value for the fit.
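
A short sketch of that built-in selection, assuming the cars data (the names fit.gcv and fit.loo are illustrative). When neither spar nor lambda is supplied, smooth.spline chooses the smoothing parameter itself, by generalized cross-validation by default or by leave-one-out cross-validation when cv = TRUE:

```r
# Default: the smoothing parameter is chosen by generalized cross-validation (GCV).
fit.gcv <- smooth.spline(cars$speed, cars$dist)
print(fit.gcv$spar)     # selected spar
print(fit.gcv$lambda)   # corresponding penalty weight
print(fit.gcv$cv.crit)  # value of the criterion that was minimized

# cv = TRUE switches to ordinary leave-one-out cross-validation.
# cars has repeated speed values, so R may warn that CV with
# non-unique x values is doubtful; the warning is silenced here.
fit.loo <- suppressWarnings(smooth.spline(cars$speed, cars$dist, cv = TRUE))
print(fit.loo$spar)
```

The two criteria need not agree exactly with each other or with a grid search, since the grid only contains multiples of 0.05 and tuneSpline scores absolute rather than squared error.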

ADD REPLY • link updated 3.3 years ago by Ram 45k • written 10.5 years ago by wenbo • 0