
Conference rusure::math

Title:	Mathematics at DEC

Moderator:	RUSURE::EDP

Created:	Mon Feb 03 1986
Last Modified:	Fri Jun 06 1997
Last Successful Update:	Fri Jun 06 1997
Number of topics:	2083
Total number of notes:	14613

1621.0. "Multiple Linear Regression" by VIZUAL::FINNERTY (The bug stops here) Wed Jun 03 1992 13:34

    
    Does anyone have a good implementation of a multiple-linear regression
    algorithm?  (question also posted in the ALGORITHMS conference).
    
       /Jim

T.R	Title	User	Personal Name	Date	Lines
1621.1	note 1255	VIZUAL::FINNERTY	The bug stops here	`Wed Jun 03 1992 16:25`	3
	note 1255 has a pointer to obtain a regression algorithm coded in FORTRAN... the source is a bit intimidating, though. :)
1621.2	Numerical Recipes	FASDER::MTURNER	Mark Turner * DTN 425-3702 * MEL4	`Thu Jun 18 1992 15:00`	4
	The "Numerical Recipes" book also has routines in Chap. 14. Mark
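A minimal sketch of multiple linear regression by ordinary least squares, for readers who want something shorter than the FORTRAN source or the Numerical Recipes routines mentioned above. It is not taken from either source. It assumes NumPy is available; the function name fit_linear_regression and the sample data are made up for illustration.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Ordinary least squares fit of y = b0 + b1*x1 + ... + bk*xk.

    X: (n, k) array of predictors; y: (n,) array of responses.
    Returns (coefficients with the intercept first, R^2).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    A = np.column_stack([np.ones(len(y)), X])
    # lstsq solves the least-squares problem without forming the normal equations.
    coef, _, _, _ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot

# Made-up data generated exactly from y = 1 + 2*x1 - 3*x2.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [2, 1], [1, 2]], dtype=float)
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
coef, r2 = fit_linear_regression(X, y)
print(coef, r2)
```

With noise-free data the fit should recover the generating coefficients; on real data the same call returns the least-squares estimates and the usual R^2.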
1621.3	serial correlation	SARAH::FINNERTY	The bug stops here	`Fri Jun 19 1992 10:13`	26
	On the topic of regression... I have some time series data with a single independent variable. Unfortunately, both the dependent variable and the independent variable show significant serial correlation. When the data is corrected for serial correlation, the fit of the equation is _much_ worse. What does this suggest?
	- That the data set size, N = 23, is too small to conclude much of anything
	- That the serial correlation is _negative_, that is, errors in time period T are negatively correlated with errors in time period T+1 (seems improbable)
	- ? statistical anomaly ?
	- ?
	And if apparent goodness of fit is the goal, should I ignore the fact that the data is serially correlated?
	/Jim
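The note does not say how the data was "corrected for serial correlation". As one possible reading, the sketch below computes the Durbin-Watson statistic on the residuals and then applies a single Cochrane-Orcutt pass. Both choices are assumptions, as are the function names and the made-up 23-point series; NumPy is assumed.

```python
import numpy as np

def ols(x, y):
    """Fit y = a + b*x by least squares; return (a, b, residuals)."""
    A = np.column_stack([np.ones(len(x)), x])
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(a), float(b), y - (a + b * x)

def durbin_watson(resid):
    """About 2: little lag-1 correlation; toward 0: positive; toward 4: negative."""
    return float(np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2))

def cochrane_orcutt_step(x, y):
    """One pass: estimate rho from the OLS residuals, refit on quasi-differences."""
    _, _, resid = ols(x, y)
    rho = float(resid[1:] @ resid[:-1] / (resid @ resid))
    x_star = x[1:] - rho * x[:-1]
    y_star = y[1:] - rho * y[:-1]
    a_star, b, resid_star = ols(x_star, y_star)
    # The transformed intercept is a*(1 - rho); undo that (breaks down if rho == 1).
    return rho, a_star / (1.0 - rho), b, resid_star

# Made-up data: a linear relation plus a slowly varying "error" term.
t = np.arange(23, dtype=float)
x = t
y = 5.0 + 0.3 * x + np.sin(t / 4.0)

_, _, resid = ols(x, y)
dw_raw = durbin_watson(resid)
rho, a, b, resid_star = cochrane_orcutt_step(x, y)
dw_corrected = durbin_watson(resid_star)
print(dw_raw, rho, dw_corrected)
```

A Durbin-Watson value well below 2 on the raw fit is the usual sign of positive serial correlation in the errors. Note that the corrected fit uses one fewer observation, which matters when N is only 23.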
1621.4	focus on the serial correlation	MOCA::BELDIN_R	All's well that ends	`Fri Jun 19 1992 13:36`	15
	Assume a model like y(t) = a + b x(t). Then serial correlation implies that y(t+1) is related to y(t) and x(t+1) is related to x(t), each with their own linear relationship. It could be that the best explanation of the data is with a two variable vector model <x y>, both dependent on t, the single independent variable. In other words, maybe the serial correlation is all there is? /rab
1621.5	Independent variable?	CADSYS::COOPER	Topher Cooper	`Fri Jun 19 1992 15:55`	26
	RE: .3 (Jim)
	I'm not sure what it means to say that an independent variable shows a serial correlation. You choose the values of the independent variable. It sounds like you are really regressing two dependent variables against each other, with hidden independent variable(s) of either time and/or sequence. In that case I agree with .4 -- the predictive ability of one of your dependent variables on your other is primarily (or fully) "explained" by their common dependence on time/sequence.
	What you should do about it depends on what you mean by "apparent goodness of fit is the goal." If the apparent goodness of fit refers only to the data at hand -- if the regression is meant as a summary description of the existing data only -- then you can ignore it. If you want the apparent goodness of fit to apply to data collected in the future, then whether to ignore the correlation or not depends on whether that future data can be collected in a way that will maintain the correlation as is. You may be better off "predicting" both your variables from the temporal variable.
	By the way, you cannot apply standard linear regression to two sets of measurements, unless the measurement error is vanishingly small on your "X". Another reason to regress two dependent variables against a reliable temporal variable.
	Topher
1621.6	not forecastable from just 't'	VIZUAL::FINNERTY	The bug stops here	`Fri Jun 19 1992 17:37`	18
	Well, it's slightly more complex than suggested, since the independent variable has a time lag, i.e. the independent variable is measured at (t-13), whereas the dependent variable is measured at (t); furthermore, the relationship between t and either of the variables is by no means linear.
	Consecutive values of X(t) or Y(t) are not very different from each other, giving rise to the serial correlation; however, X(t-13) seems to be closely correlated with Y(t) {R² = .82, not accounting for serial correlation}.
	re: -.2 maybe serial correlation is all there is
	...I'm still pondering over this...
	/Jim
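The lag alignment described here, pairing X(t-13) with Y(t), can be sketched as follows. The helper name lagged_r_squared and the series are made up, and NumPy is assumed. For a one-variable regression with an intercept, R² equals the squared correlation coefficient, which is what the sketch relies on.

```python
import numpy as np

def lagged_r_squared(x, y, lag):
    """R^2 of the simple regression y(t) = a + b*x(t - lag)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) != len(y) or not 0 < lag < len(x):
        raise ValueError("series must have equal length and 0 < lag < length")
    x_lagged = x[:-lag]   # x(t - lag) for t = lag .. n-1
    y_now = y[lag:]       # y(t) for the same values of t
    r = np.corrcoef(x_lagged, y_now)[0, 1]
    return float(r ** 2)

# Made-up series in which y(t) depends exactly on x(t - 13).
n, lag = 36, 13
x = np.sin(np.arange(n) / 3.0)
y = np.zeros(n)
y[lag:] = 2.0 + 3.0 * x[:-lag]
r2 = lagged_r_squared(x, y, lag)
print(r2)
```

Aligning at a lag of 13 discards 13 observations, so the number of usable pairs is the series length minus 13.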
1621.7	complications	MOCA::BELDIN_R	All's well that ends	`Fri Jun 19 1992 17:55`	32
	Ok, I'll describe a hypothetical situation and you decide if it helps you.
	Each week, I purchase raw materials to the tune of x(t) dollars. My product has approximately 13 weeks of lead time (no wonder I'm losing customers :-) ) and the total output, y(t), certainly should be related to how much I buy. So I hypothesize that y(t+13) = a + b x(t).
	Well, I have several alternative models ("e" represents an error variable in each case):
	1)  y(t+13) = a + b f(t) + e
	    x(t)    = c + d f(t) + e
	    e.g., time is the controlling factor
	2)  x(t-13) = a + b y(t) + e
	    because I used MRP to tell me how much to buy based on planned output
	3)  x(t) = a + b x(t-1) + e
	    y(t) = c + d y(t-1) + e
	    where (a,b) is "close" to (c,d)
	The statistical analysis for each of these is different. Traditional linear regression assumes a very much simpler model:
	4)  y = a + b x + e
	    and the x's have no error.
	Any one of these (and some other variations) can be thought of as linear regression, but the standard linear regression analysis is only appropriate if the model is very like 4).
	Does that help or confuse the issue more?
	/rab
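As an illustration of model 3), here is a minimal sketch that fits s(t) = a + b s(t-1) + e by regressing each value on its predecessor. The function name fit_ar1 and the noise-free series are made up, and NumPy is assumed. As the note says, the usual regression error analysis does not carry over to this model, so only the point estimates are shown.

```python
import numpy as np

def fit_ar1(series):
    """Least-squares fit of s(t) = a + b*s(t-1) + e; returns (a, b)."""
    s = np.asarray(series, dtype=float)
    previous, current = s[:-1], s[1:]
    A = np.column_stack([np.ones(len(previous)), previous])
    (a, b), *_ = np.linalg.lstsq(A, current, rcond=None)
    return float(a), float(b)

# Made-up noise-free series generated by s(t) = 1 + 0.5*s(t-1), s(0) = 10.
s = [10.0]
for _ in range(11):
    s.append(1.0 + 0.5 * s[-1])

a, b = fit_ar1(s)
print(a, b)
```

Fitting x and y separately this way and comparing (a,b) with (c,d) is one way to check whether the two series share the same lag-1 structure.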
1621.8	the problem...	VIZUAL::FINNERTY	The bug stops here	`Fri Jun 19 1992 18:59`	30
	The problem being considered is prediction of the % movement in a stock market index; the independent variable is a measure of sentiment in the market (as Topher suggested in -.3, the goal is accurate prediction as opposed to accurate fitting of historical data):
	1)  y(t+13) = a + b f(t) + e
	    x(t)    = c + d f(t) + e
	    e.g., time is the controlling factor
	Very improbable that y(t) can be predicted from t alone.
	2)  x(t-13) = a + b y(t) + e
	    because I used MRP to tell me how much to buy based on planned output
	In this case we're predicting the past :) {we already know x(t-13), so this doesn't help us much}
	3)  x(t) = a + b x(t-1) + e
	    y(t) = c + d y(t-1) + e
	    where (a,b) is "close" to (c,d)
	This might be useful for projecting x out to future periods, and therefore allow y to be predicted a bit farther into the future... (but I'd be happy to predict y at all, at this point)
	4)  y = a + b x + e
	    and the x's have no error.
	In this case, in fact, there is virtually no error in the measurement of x, so this model still seems reasonable.
1621.9	DDDDDD?	CADSYS::COOPER	Topher Cooper	`Fri Jun 19 1992 21:00`	17
	Is y the stock market index (in which case the serial correlation is characteristic of the delta index) or the delta (% movement) of the index (in which case the serial correlation is the delta of the delta of the index)? In any case it sounds like you should be looking at the correlation of delta-x to either y or delta-y.
	Unless you have a strong reason for believing that the relationship is linear, I would think seriously about throwing in an x^2 or delta-x^2 term as well. Better yet, get a solid amount of data and eyeball it to see what comes out. With only a first-order term, a clean U or inverted-U (i.e., y increases (decreases) with increasing x for a while, then decreases (increases)) comes out flat and unpredicted. You might also try the loess procedure (sometimes called non-parametric regression).
	Rule of thumb in model fitting -- if you don't have a lot of theory you need a lot of data or a lot of luck.
	Topher
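The point about a clean U coming out flat can be checked with a small sketch. It fits made-up data lying exactly on y = x^2, first with only a first-order term and then with an added x^2 term. The helper name r_squared is an assumption and NumPy is assumed.

```python
import numpy as np

def r_squared(design, y):
    """R^2 of the least-squares fit of y on the columns of the design matrix."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())

# Made-up data: a clean U, symmetric about x = 0.
x = np.linspace(-3.0, 3.0, 25)
y = x ** 2
ones = np.ones_like(x)

r2_linear = r_squared(np.column_stack([ones, x]), y)
r2_quadratic = r_squared(np.column_stack([ones, x, x ** 2]), y)
print(r2_linear, r2_quadratic)
```

Because x is symmetric about zero, x and x^2 are uncorrelated, so the first-order fit explains essentially none of the variation while the fit with the squared term explains all of it.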
1621.10	assessment of likelihood of success	SGOUTL::BELDIN_R	All's well that ends	`Mon Jun 22 1992 09:45`	27
	re .8
	Topher has just triggered in my head one of the standard techniques for getting started. Make tables of successive differences. If you can find relations among any pair of difference sets, you've got a start on the kind of model.
	On the other hand, just from the cynic's point of view, consider this. If it were possible to predict the DJ or any of its components, I would expect that somebody would be doing it already for a (big) profit. There have always been many professional and amateur speculators interested in that topic and willing to part with enough money to make solid scientific development of any good idea economically feasible. So, I believe your prospects for (economic) success are slight. On the other hand, you are bound to (re-)learn some interesting facts.
	From a scientific point of view, the stock market summarizes millions of transactions every day. The number of transactions makes it very difficult to believe that detailed movements can be predicted. Certainly there are many small effects due to a general trend which can be predicted, but day to day changes are like the wind direction and velocity in a storm.
	fwiw,
	/rab
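A table of successive differences can be built by differencing repeatedly. The sketch below is one way to do it; the function name difference_table and the sample series are made up, and NumPy is assumed.

```python
import numpy as np

def difference_table(series, max_order=3):
    """Return [series, first differences, second differences, ...]."""
    rows = [np.asarray(series, dtype=float)]
    for _ in range(max_order):
        if len(rows[-1]) < 2:
            break  # nothing left to difference
        rows.append(np.diff(rows[-1]))
    return rows

# Made-up series of perfect squares: its second differences are constant.
table = difference_table([0, 1, 4, 9, 16, 25], max_order=3)
for order, row in enumerate(table):
    print(order, row)
```

A row that is roughly constant suggests a polynomial trend of that order in the original series; the same tables for x and y can then be compared pairwise for relations.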
1621.11		AUSSIE::GARSON		`Tue Jun 23 1992 00:14`	13
	re .10
	> If it were possible to predict the DJ or any of its components, I would
	> expect that somebody would be doing it already for a (big) profit.
	Bear in mind also that your playing in the market affects that market - and the greedier you get the more the effect.
	Make sure that no one else has the benefit of your newfound predictive techniques.
	Make sure that your techniques are not too good, else you might find yourself on the wrong end of an insider trading charge. (-:
1621.12	Does the problem have a solution?	UNTADH::TOWERS		`Tue Jun 23 1992 05:00`	17
	Didn't the (now deceased) economist, Hayek, the father of monetarism, have something to say about this? Something which people in general and economists in particular have been wilfully ignoring for about 40 years?
	What he said was that any successful model must incorporate at least as much richness and complexity as that which it is trying to model. Since the flow of money is one aspect of human behaviour, a successful economic model must have the same level of complexity as the human mind. Hayek's conclusion was that economics as an exact science was a logical impossibility for humans. All that is possible is a rough approximation.
	Certainly, it seems unlikely that a linear, von Neumann (i.e. current computing) model would be sufficient to generate predictions that would yield significant profits on the stock markets.
	Brian
1621.13		VMSDEV::HALLYB	Fish have no concept of fire.	`Tue Jun 23 1992 09:52`	10
	> Certainly, it seems unlikely that a linear, von Neumann (i.e. current
	> computing) model would be sufficient to generate predictions that would
	> yield significant profits on the stock markets.
	Some floor traders such as Barry Haigh have made money year after year doing the same thing over and over for their own account. They have a model for how the market works on a very short-term basis and they profit from it.
	John
1621.14	building a theory	VIZUAL::FINNERTY	The bug stops here	`Tue Jun 23 1992 12:27`	23
	re: .9
	"Y" is in fact delta-y in percent.
	re: if it was profitable, people would already be doing it.
	In fact, people _are_ doing this every day, whether or not it is more profitable than guessing.
	re: curve-fit model vs theoretically derived model
	I've often heard this criticism, and I must admit it does confuse me a little. Putting together a model takes time and effort... surely you wouldn't want to measure any random thing such as astrological conditions or the time of cherry trees blooming in Washington. So you construct a theory, gather some data, learn what the past has to tell you by doing some modelling and curve fitting, and then go back and reconsider your theory.
	/Jim
1621.15	how the stock market might also be modeled ?!	STAR::ABBASI	i^(-i) = SQRT(exp(PI))	`Tue Jun 23 1992 13:06`	2
	Maybe one can model the stock market as a closed-loop feedback control system with disturbances thrown in, and noise modeled as stochastic processes.