T.R | Title | User | Personal Name | Date | Lines |
---|
1621.1 | note 1255 | VIZUAL::FINNERTY | The bug stops here | Wed Jun 03 1992 16:25 | 3 |
|
note 1255 has a pointer to obtain a regression algorithm coded in
FORTRAN... the source is a bit intimidating, though. :)
|
1621.2 | Numerical Recipes | FASDER::MTURNER | Mark Turner * DTN 425-3702 * MEL4 | Thu Jun 18 1992 15:00 | 4 |
| The "Numerical Recipes" book also has routines in Chap. 14.
Mark
|
1621.3 | serial correlation | SARAH::FINNERTY | The bug stops here | Fri Jun 19 1992 10:13 | 26 |
|
on the topic of regression...
I have some time series data with a single independent variable.
Unfortunately, both the dependent variable and the independent variable
show significant serial correlation.
When the data is corrected for serial correlation, the fit of the
equation is _much_ worse. What does this suggest?
- That the data set size, N = 23, is too small to conclude
much of anything
- That the serial correlation is _negative_, that is, errors
in time period T are negatively correlated with errors in
time period T+1 (seems improbable)
- ? statistical anomaly ?
- ?
and if apparent goodness of fit is the goal, should I ignore the
fact that the data is serially correlated?
/Jim
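
One way to make "corrected for serial correlation" concrete is sketched below. It assumes a straight-line fit and first-order (lag-1) correlation in the residuals, neither of which is stated in the note, and all function names are illustrative. The Durbin-Watson statistic is near 2 for uncorrelated residuals, below 2 for positive and above 2 for negative serial correlation; the quasi-differencing step is the usual Cochrane-Orcutt style transformation.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def residuals(x, y, a, b):
    """Errors e(t) = y(t) - (a + b*x(t)) of a fitted line."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def durbin_watson(e):
    """Durbin-Watson statistic of a residual series (ranges from 0 to 4)."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return num / sum(v * v for v in e)


def lag1_rho(e):
    """Estimate rho in e(t) = rho*e(t-1) + u(t) by least squares through the origin."""
    num = sum(e[t] * e[t - 1] for t in range(1, len(e)))
    den = sum(e[t - 1] ** 2 for t in range(1, len(e)))
    return num / den


def quasi_difference(v, rho):
    """Transform v(t) into v(t) - rho*v(t-1); the first observation is dropped."""
    return [v[t] - rho * v[t - 1] for t in range(1, len(v))]
```

Refitting with ols on quasi_difference(x, rho) and quasi_difference(y, rho) gives the corrected fit; its slope still estimates b, while its intercept estimates a*(1 - rho) rather than a.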
|
1621.4 | focus on the serial correlation | MOCA::BELDIN_R | All's well that ends | Fri Jun 19 1992 13:36 | 15 |
|
Assume a model like y(t) = a + b x(t).
Then serial correlation implies that y(t+1) is related to y(t) and
x(t+1) is related to x(t), each with their own linear relationship.
It could be that the best explanation of the data is with a
two variable vector model <x y>, both dependent on t, the single
independent variable.
In other words, maybe the serial correlation is all there is?
/rab
|
1621.5 | Independent variable? | CADSYS::COOPER | Topher Cooper | Fri Jun 19 1992 15:55 | 26 |
| RE: .3 (Jim)
I'm not sure what it means to say that an independent variable shows a
serial correlation. You choose the values of the independent variable.
It sounds like you are really regressing two *dependent* variables
against each other, with hidden independent variable(s) of either time
and/or sequence. In that case I agree with .4 -- the predictive
ability of one of your dependent variables on your other is primarily
(or fully) "explained" by their common dependence on time/sequence.
What you should do about it depends on what you mean by
"apparent goodness of fit is the goal." If the apparent goodness of
fit refers only to the data at hand -- if the regression is meant as a
summary description of the existing data only -- then you can ignore
it. If you want the apparent goodness of fit to apply to data
collected in the future, then whether to ignore the correlation or not
depends on whether that future data can be collected in a way that will
maintain the correlation as is. You may be better off "predicting"
both your variables from the temporal variable.
By the way, you cannot apply standard linear regression to two sets of
measurements, unless the measurement error is vanishingly small on
your "X". Another reason to regress two dependent variables against a
reliable temporal variable.
Topher
|
1621.6 | not forecastable from just 't' | VIZUAL::FINNERTY | The bug stops here | Fri Jun 19 1992 17:37 | 18 |
|
well, it's slightly more complex than suggested, since the independent
variable has a time lag, i.e. the independent variable is measured at
(t-13), whereas the dependent variable is measured at (t); furthermore,
the relationship between t and either of the variables is by no means
linear.
consecutive values of X(t) or Y(t) are not very different from each
other, giving rise to the serial correlation; however, X(t-13) seems to
be closely correlated with Y(t) {R² = .82, not accounting for serial
correlation}.
re: -.2 maybe serial correlation is all there is
...I'm still pondering over this...
/Jim
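
The lagged comparison described above (X(t-13) against Y(t)) can be sketched as follows. This is only an illustration: it assumes x and y are equal-length lists indexed by period, and the function names are made up for the example. Note that a lag of k leaves k fewer usable pairs, which matters with a short series.

```python
def pearson_r(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu = sum(u) / n
    mv = sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5


def lagged_r2(x, y, lag):
    """R^2 between x(t - lag) and y(t), using only the overlapping periods."""
    xs = x[:len(x) - lag]  # x(0) .. x(n-1-lag)
    ys = y[lag:]           # y(lag) .. y(n-1)
    r = pearson_r(xs, ys)
    return r * r
```

Calling lagged_r2(x, y, 13) would reproduce the kind of figure quoted above; like that figure, it makes no allowance for serial correlation.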
|
1621.7 | complications | MOCA::BELDIN_R | All's well that ends | Fri Jun 19 1992 17:55 | 32 |
| Ok, I'll describe a hypothetical situation and you decide if it helps
you.
Each week, I purchase raw materials to the tune of x(t) dollars. My
product has approximately 13 weeks of lead time (no wonder I'm losing
customers :-) ) and the total output, y(t), certainly should be related
to how much I buy. So I hypothesize that y(t+13) = a + b x(t).
Well, I have several alternative models: ("e" represents an error
variable in each case)
1) y(t+13) = a + b f(t) + e
x(t) = c + d f(t) + e eg, time is the controlling factor
2) x(t-13) = a + b y(t) + e because I used MRP to tell me how
much to buy based on planned output
3) x(t) = a + b x(t-1) + e
y(t) = c + d y(t-1) + e where (a,b) is "close" to (c,d)
The statistical analysis for each of these is different. Traditional
linear regression assumes a very much simpler model.
4) y = a + b x + e and the x's have no error.
Any one of these (and some other variations) can be thought of as
linear regression, but the standard linear regression analysis is only
appropriate if the model is very like 4).
Does that help or confuse the issue more?
/rab
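
As an illustration of model 3), the first-order autoregression x(t) = a + b x(t-1) + e can be fitted by regressing each value on its predecessor. This is a minimal sketch, assuming ordinary least squares is an acceptable estimator here; the function names are hypothetical.

```python
def ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def fit_ar1(series):
    """Fit series(t) = a + b*series(t-1) + e and return (a, b)."""
    previous = series[:-1]  # series(t-1)
    current = series[1:]    # series(t)
    return ols(previous, current)
```

Fitting x and y separately with fit_ar1 and comparing the two coefficient pairs is one way to check whether (a,b) is "close" to (c,d).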
|
1621.8 | the problem... | VIZUAL::FINNERTY | The bug stops here | Fri Jun 19 1992 18:59 | 30 |
|
the problem being considered is prediction of the % movement in
a stock market index; the independent variable is a measure of
sentiment in the market: (as Topher suggested in -.3, the goal
is accurate prediction as opposed to accurate fitting of
historical data)
1) y(t+13) = a + b f(t) + e
x(t) = c + d f(t) + e eg, time is the controlling factor
Very improbable that y(t) can be predicted from t alone.
2) x(t-13) = a + b y(t) + e because I used MRP to tell me how
much to buy based on planned output
In this case we're predicting the past :) {we already know
x(t-13), so this doesn't help us much}
3) x(t) = a + b x(t-1) + e
y(t) = c + d y(t-1) + e where (a,b) is "close" to (c,d)
This might be useful for projecting x out to future periods,
and therefore allow y to be predicted a bit farther into the
future... (but I'd be happy to predict y at all, at this point)
4) y = a + b x + e and the x's have no error.
In this case, in fact, there is virtually no error in the
measurement of x, so this model still seems reasonable.
|
1621.9 | DDDDDD? | CADSYS::COOPER | Topher Cooper | Fri Jun 19 1992 21:00 | 17 |
| Is y the stock market index (in which case the serial correlation is
characteristic of the delta index) or the delta (% movement) of the
index (in which case the serial correlation is the delta of the delta
of the index)? In any case it sounds like you should be looking at
the correlation of delta-x to either y or delta-y. Unless you have a
strong reason for believing that the relationship is linear, I would
think seriously about throwing in a x^2 or delta-x^2 term as well.
Better yet get a solid amount of data and eye-ball it to see what comes
out. With only a first order term, a clean U or inverted-U (i.e.,
y increases (decreases) with increasing x for a while then decreases
(increases)) comes out flat and unpredicted. You might also try the
loess procedure (sometimes called non-parametric regression).
Rule of thumb in model fitting -- if you don't have a lot of theory you
need a lot of data or a lot of luck.
Topher
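
The point about a clean U coming out flat can be checked numerically. The sketch below uses made-up data (y = x^2 on x values symmetric about zero) and, as a simplification, regresses on x^2 alone instead of adding it alongside x; the function name is illustrative.

```python
def r_squared(u, v):
    """Squared Pearson correlation: R^2 of a straight-line fit of v on u."""
    n = len(u)
    mu = sum(u) / n
    mv = sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return (suv * suv) / (suu * svv)


x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]         # a clean U shape, invented for illustration
x_squared = [xi ** 2 for xi in x]

first_order_fit = r_squared(x, y)           # the first-order term sees nothing
second_order_fit = r_squared(x_squared, y)  # the squared term captures the U
```

With real data the x values are rarely symmetric about the turning point, so the first-order fit is usually weak instead of exactly zero.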
|
1621.10 | assessment of likelihood of success | SGOUTL::BELDIN_R | All's well that ends | Mon Jun 22 1992 09:45 | 27 |
| re .8
Topher has just reminded me of one of the standard techniques
for getting started. Make tables of successive differences. If you
can find relations among any pair of difference sets, you've got a
start on the kind of model.
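
A table of successive differences can be built as in the sketch below; the function name and the sample series are only for illustration.

```python
def successive_differences(series, max_order):
    """Return [series, 1st differences, 2nd differences, ...] up to max_order."""
    table = [list(series)]
    for _ in range(max_order):
        prev = table[-1]
        # Each new row is one element shorter than the row it is built from.
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table


# A quadratic series has constant second differences.
table = successive_differences([1, 4, 9, 16, 25], 2)
```

Building one table per variable and comparing rows across the two tables is one way to look for relations among pairs of difference sets.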
On the other hand, just from the cynic's point of view, consider this.
If it were possible to predict the DJ or any of its components, I would
expect that somebody would be doing it already for a (big) profit.
There have always been many professional and amateur speculators
interested in that topic and willing to part with enough money to make
solid scientific development of any good idea economically feasible.
So, I believe your prospects for (economic) success are slight. On the
other hand, you are bound to (re-)learn some interesting facts.
From a scientific point of view, the stock market summarizes millions
of transactions every day. The number of transactions makes it very
difficult to believe that detailed movements can be predicted.
Certainly there are many small effects due to a general trend which can
be predicted, but day to day changes are like the wind direction and
velocity in a storm.
fwiw,
/rab
|
1621.11 | | AUSSIE::GARSON | | Tue Jun 23 1992 00:14 | 13 |
| re .10
> If it were possible to predict the DJ or any of its components, I would
> expect that somebody would be doing it already for a (big) profit.
Bear in mind also that your playing in the market affects that market -
and the greedier you get the more the effect.
Make sure that no one else has the benefit of your new found predictive
techniques.
Make sure that your techniques are not *too* good else you might find
yourself on the wrong end of an insider trading charge. (-:
|
1621.12 | Does the problem have a solution? | UNTADH::TOWERS | | Tue Jun 23 1992 05:00 | 17 |
| Didn't the (now deceased) economist, Hayek, the father of monetarism,
have something to say about this? Something which people in general
and economists in particular have been wilfully ignoring for about 40
years?
What he said was that any successful model must incorporate at least
as much richness and complexity as that which it is trying to model.
Since the flow of money is one aspect of human behaviour, a successful
economic model must have the same level of complexity as the human mind.
Hayek's conclusion was that economics as an exact science was a logical
impossibility for humans. All that is possible is a rough approximation.
Certainly, it seems unlikely that a linear, von Neumann (i.e. current
computing) model would be sufficient to generate predictions that would
yield significant profits on the stock markets.
Brian
|
1621.13 | | VMSDEV::HALLYB | Fish have no concept of fire. | Tue Jun 23 1992 09:52 | 10 |
| > Certainly, it seems unlikely that a linear, von Neumann (i.e. current
> computing) model would be sufficient to generate predictions that would
> yield significant profits on the stock markets.
Some floor traders such as Barry Haigh have made money year after year
doing the same thing over and over for their own account. They have a
model for how the market works on a very short-term basis and they
profit from it.
John
|
1621.14 | building a theory | VIZUAL::FINNERTY | The bug stops here | Tue Jun 23 1992 12:27 | 23 |
|
re: .9
"Y" is in fact delta-y in percent.
re: if it was profitable, people would already be doing it.
in fact, people _are_ doing this every day, whether or not it is
more profitable than guessing.
re: curve-fit model vs theoretically derived model
I've often heard this criticism, and I must admit it does confuse me
a little. Putting together a model takes time and effort... surely
you wouldn't want to measure any random thing such as astrological
conditions or the time of cherry trees blooming in Washington. So
you construct a theory, gather some data, learn what the past has
to tell you by doing some modelling and curve fitting, and then go
back and reconsider your theory.
/Jim
|
1621.15 | how the stock market might also be modeled ?! | STAR::ABBASI | i^(-i) = SQRT(exp(PI)) | Tue Jun 23 1992 13:06 | 2 |
| Maybe one can model the stock market as a closed-loop feedback control system
with disturbances thrown in, and noise modeled as stochastic processes.
|