|
|
 | | From: | Guybrush Treepwood | | Subject: | locally weighted regression | | Date: | Fri, 07 Jan 2005 06:21:35 GMT |
|
|
 | I'm using Vizier to draw plots of one-dimensional data, one input and one output. I made plots with different learners: NN, 3NN, global linear regression, and quadratic regression through 3NN. The quadratic fits the data best. Does this say anything about the relationship in the data? Does it mean the points are on a quadratic curve?
[ comp.ai is moderated. To submit, just post and be patient, or if ] [ that fails mail your article to , and ] [ ask your news administrator to fix the problems with your system. ]
|
|
 | | From: | modpra | | Subject: | Re: locally weighted regression | | Date: | Mon, 17 Jan 2005 01:56:40 GMT |
|
|
 | (Please note I am not familiar with Vizier.) From what I think you are saying, it seems like the following: x = input, y = output, so y = f(x), and since your function happens to be quadratic, the curve is also quadratic. Although, I would take the data and run it through MATLAB or some other math program and curve-fit the data; hell, even try Excel if you want. The reason I am saying this is that I am a little suspicious about the curve. Here is the question I am thinking about: if the curve is so simple (quadratic), why does it need a NN to generate the plot? Another thing you should ask is how a Bayesian fit works. Cheers
|
|
 | | From: | Ted Dunning | | Subject: | Re: locally weighted regression | | Date: | Mon, 17 Jan 2005 01:57:03 GMT |
|
|
 | Actually, without more information, it is impossible to say what your results mean.
You need to give just a bit more information such as how many data points you have and whether you can obtain more data to test any models that you create.
Here are a few scenarios:
a) you have tens of thousands of data points or more and can get more any time you like. This occurs often in signal processing applications. In such a situation, it seems likely that your quadratic fit results really do mean something. To test this without mathematics, look at the residuals on the training data and then look at the residuals on data that you didn't use in the regression or in the selection of regression models. If the average magnitude of the residuals is about the same in both cases (or better yet, the distribution is similar), then you probably have something.
b) you have hundreds of data points and getting more is difficult or impossible. Here things become murkier. You should institute a strict discipline of using only a portion of your data for trying different regressions and reserve two other portions, one to test a number of regressions and one for evaluating whichever model seems best. See below for references to mathematical techniques that can help you in cases where you can't hold data back.
c) you have a dozen to a few dozen data points. This situation is REALLY difficult to deal with. You probably can't judge between all of the models that you are describing, and unless you luck into a model form (usually by deep knowledge of your system) that really works incredibly well, you are in a really difficult spot statistically speaking. You can falsify some regressions with this much data, but it is very difficult to derive models of any complexity that will work for unseen data.
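The residual check from scenario (a) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not anything Vizier does: the data are synthetic (a quadratic plus Gaussian noise, invented for the example), and `fit_poly`, `predict` and `mean_abs_residual` are hypothetical helper names.

```python
import random

def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.
    Returns coefficients c with y ~ c[0] + c[1]*x + c[2]*x**2 + ..."""
    n = degree + 1
    # Normal equations (A^T A) c = A^T y for the Vandermonde design matrix A.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(ata[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (aty[i] - rest) / ata[i][i]
    return coeffs

def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def mean_abs_residual(coeffs, xs, ys):
    return sum(abs(y - predict(coeffs, x)) for x, y in zip(xs, ys)) / len(xs)

# Synthetic data (an assumption for this sketch): quadratic plus noise.
rng = random.Random(0)
xs = [rng.uniform(-3.0, 3.0) for _ in range(2000)]
ys = [1.0 + 2.0 * x - 0.5 * x * x + rng.gauss(0.0, 0.3) for x in xs]

# Fit on the first half only; the second half is never seen by the fit.
train_x, train_y = xs[:1000], ys[:1000]
test_x, test_y = xs[1000:], ys[1000:]
coeffs = fit_poly(train_x, train_y, 2)

train_res = mean_abs_residual(coeffs, train_x, train_y)
test_res = mean_abs_residual(coeffs, test_x, test_y)
print("mean |residual| on training data:", train_res)
print("mean |residual| on held-out data:", test_res)
```

If the two numbers are close, the fit is probably capturing something real. A held-out residual much larger than the training residual would suggest the model is fitting noise.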
If you are up for some serious thinking and are willing to basically roll your own regression code, you might take a look at David Mackay's work on the evidence method in regression problems. Using such Bayesian techniques with code written by some random schmoe is pretty difficult, however.
Good luck.
|
|
 | | From: | Guybrush Treepwood | | Subject: | Re: locally weighted regression | | Date: | Mon, 17 Jan 2005 22:11:57 GMT |
|
|
 | Ted Dunning wrote:
> Actually, without more information, it is impossible to say what your
> results mean.
>
> You need to give just a bit more information such as how many data
> points you have and whether you can obtain more data to test any models
> that you create.
>
> Here are a few scenarios:
>
> a) you have tens of thousands of data points or more and can get more
> any time you like. This occurs often in signal processing
> applications. In such a situation, it seems likely that your quadratic
> fit results really does mean something. To test this without
> mathematics, look at the residuals on the training data and then look
> at the residuals on data that you didn't use in the regression or in
> the selection of regression models. If the average magnitude of the
> residuals is about the same in both cases (or better yet, the
> distribution is similar), then you probably have something.
>
> b) you have hundreds of data points and getting more is difficult or
> impossible. Here things become murkier. You should institute a strict
> discipline of using only a portion of your data for trying different
> regressions and reserve two other portions, one to test a number of
> regressions for evaluating whichever model seems best. See below for
> references to mathematical techniques that can help you in cases where
> you can't hold data back.
>
> c) you have a dozen to a few dozen data points. This situation is
> REALLY difficult to deal with. You probably can't judge between all of
> the models that you are describing and unless you luck into a model
> form (usually be deep knowledge of your system) that really works
> incredibly well, you are in a really difficult spot statistically
> speaking. You can falsify some regressions with this much data, but it
> is very difficult to derive models of any complexity that will work for
> unseen data.
>
> If you are up for some serious thinking and are will to basically roll
> your own regression code, you might take a look at David Mackay's work
> on the evidence method in regression problems. Using such Bayesian
> techniques with code written by some random schmoe is pretty difficult,
> however.
>
> Good luck.

The situation is like this. It is a task for school: we get 20 data points and must use different regression-based learners. The question is which one performs best over the input points. From what I see in the different plots, the quadratic regression through 3NN performs best. We are then asked whether this tells us anything about the function with which the data were generated. But I can't find anything about that in our textbook (Machine Learning, Mitchell, T.).
In the following question, we are asked to let Vizier calculate the best learner. The program says global quadratic regression will be the best fit. The question here is how this relates to your intuition in the previous question.
But as said, I can't find anything in the book to answer the first question.
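One way to explore the gap between "fits the 20 input points best" and "is picked as the best learner" is to compare each learner's error on the points it was trained on with its leave-one-out error. The Python sketch below is an illustration under loud assumptions: the 20 points are synthetic (a quadratic plus noise, invented here), "quadratic regression through 3NN" is taken to mean the parabola through the three nearest neighbours, and nothing here is claimed to match how Vizier implements or scores its learners. All function names are hypothetical.

```python
import random

def fit_poly(xs, ys, degree):
    """Global least-squares polynomial fit via the normal equations."""
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    for col in range(n):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    coeffs = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        rest = sum(ata[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (aty[i] - rest) / ata[i][i]
    return coeffs

def nearest(xs, ys, x, k):
    """The k training points closest to x, nearest first."""
    idx = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x))[:k]
    return [xs[i] for i in idx], [ys[i] for i in idx]

def knn_mean(k):
    def learner(xs, ys, x):
        _, ny = nearest(xs, ys, x, k)
        return sum(ny) / k
    return learner

def global_poly(degree):
    def learner(xs, ys, x):
        coeffs = fit_poly(xs, ys, degree)
        return sum(c * x ** i for i, c in enumerate(coeffs))
    return learner

def quadratic_through_3nn(xs, ys, x):
    # Assumed meaning: Lagrange parabola through the three nearest points.
    (x0, x1, x2), (y0, y1, y2) = nearest(xs, ys, x, 3)
    return (y0 * ((x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2)))
            + y1 * ((x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2)))
            + y2 * ((x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))))

def training_error(learner, xs, ys):
    """Mean absolute error at the same points the learner was given."""
    return sum(abs(y - learner(xs, ys, x)) for x, y in zip(xs, ys)) / len(xs)

def leave_one_out_error(learner, xs, ys):
    """Mean absolute error when each point is predicted from the other 19."""
    total = 0.0
    for i in range(len(xs)):
        rest_x, rest_y = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        total += abs(ys[i] - learner(rest_x, rest_y, xs[i]))
    return total / len(xs)

# 20 synthetic points (an assumption for this sketch).
rng = random.Random(1)
xs = [rng.uniform(-3.0, 3.0) for _ in range(20)]
ys = [1.0 + 2.0 * x - 0.5 * x * x + rng.gauss(0.0, 0.3) for x in xs]

learners = {
    "NN": knn_mean(1),
    "3NN": knn_mean(3),
    "global linear": global_poly(1),
    "global quadratic": global_poly(2),
    "quadratic through 3NN": quadratic_through_3nn,
}
train_err = {name: training_error(f, xs, ys) for name, f in learners.items()}
loo_err = {name: leave_one_out_error(f, xs, ys) for name, f in learners.items()}
for name in learners:
    print(f"{name:22s} training={train_err[name]:.3f}  leave-one-out={loo_err[name]:.3f}")
```

NN and the parabola through three neighbours pass through every training point, so their training error is zero no matter what function generated the data. That is why a close fit to the input points does not, on its own, show that the points lie on a quadratic curve; the leave-one-out column is the fairer comparison.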
|
|
|