For example, suppose I have electric power data measured at temperatures of 0, 10, 20 and 30 °C, and I predict tomorrow's electric power on the assumption that the temperature will be 27 °C.
If there are no problems with overfitting, extrapolation, or the prediction interval, I have checked everything that I can learn from the data.
But in my experience, even if the actual temperature is exactly 27 °C, the prediction may still miss.
In many cases the cause is "time-dependent extrapolation."
In the case above, time-dependent extrapolation means the following: something other than temperature affects the electric power. That something was constant while I built the prediction model, but it has since changed.
The change in "something" is the extrapolation. But it is not extrapolation in the usual sense, because I do not know what the "something" is.
I do not know the "something," but I do know the time. The sample data cover a range in the past, and the prediction lies outside that range.
I call this "time-dependent extrapolation" on this page.
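The temperature example can be sketched numerically. This is only an illustration under assumptions of my own: the linear relationship, its coefficients, and the size of the hidden shift are all hypothetical and are not taken from real data.

```python
import numpy as np

# Hypothetical relationship: power depends on temperature and on a
# hidden factor that the model never sees.
def true_power(temp, hidden):
    return 100.0 + 2.0 * temp + hidden

# Sample data from the past: the hidden factor was constant (0 here).
temps = np.array([0.0, 10.0, 20.0, 30.0])
power_past = true_power(temps, hidden=0.0)

# Fit a straight line: power = a * temp + b.
a, b = np.polyfit(temps, power_past, deg=1)

# 27 C is inside the sampled temperature range, so this is
# interpolation with respect to temperature.
predicted = a * 27.0 + b

# Tomorrow the temperature really is 27 C, but the hidden factor
# has changed (assumed shift of 15).
actual = true_power(27.0, hidden=15.0)
error = actual - predicted

print(f"predicted={predicted:.1f}, actual={actual:.1f}, error={error:.1f}")
```

The model fits the past data perfectly and the temperature is inside the sampled range, yet the prediction misses by the full size of the hidden shift.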
Time-dependent extrapolation is an everyday problem, because the situation in which the sample data cover the past and the prediction lies outside that range is not a special case of prediction. It is the normal case.
Although it is an everyday problem, it is not discussed every day, because the phenomenon cannot be studied from the sample data alone, and because analyzing the problem is difficult.
In some cases, after a prediction has failed, I find a sign of the coming change in the sample data.
In the field of quality control, it is an everyday problem that quality changes because something unknown has changed.
When the problem happens, I first analyze its timing. Then I analyze the phenomena that occurred around that time. Fieldwork and interviews are useful. The important point is that the timing is the key.
If I find such phenomena, I then consider how they relate to the problem.
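One simple way to estimate the timing from data is to look for the point where the mean of the prediction residuals shifts. The sketch below is a minimal least-squares split, not the only possible method; the function name `estimate_change_index` and the residual series are hypothetical.

```python
import numpy as np

def estimate_change_index(values):
    """Return the index k that best splits `values` into two segments
    with different means (smallest total within-segment squared error)."""
    values = np.asarray(values, dtype=float)
    best_k, best_cost = None, np.inf
    for k in range(1, len(values)):
        left, right = values[:k], values[k:]
        cost = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

# Hypothetical daily residuals (actual minus predicted): small
# deterministic fluctuation, then a shift of +5 from day 40 onward.
days = np.arange(60)
residuals = 0.5 * np.sin(days)
residuals[40:] += 5.0

change_day = estimate_change_index(residuals)
print("estimated change at day", change_day)
```

The estimated day only narrows down when to look. What changed on that day still has to be found through fieldwork and interviews.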
In many cases, the longer the interval between the time of the sample data and the time of the prediction, the larger the prediction error.
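This tendency can be illustrated with a hidden factor that drifts slowly over time. The drift rate, the temperature cycle, and the coefficients below are assumptions made only for the illustration.

```python
import numpy as np

DRIFT_PER_DAY = 0.1  # assumed slow drift of the hidden factor

def true_power(temp, day):
    # Hypothetical relationship: temperature effect plus a time drift.
    return 100.0 + 2.0 * temp + DRIFT_PER_DAY * day

# 100 days of sample data; temperature cycles between 0 and 30 C.
days = np.arange(100)
temps = 15.0 + 15.0 * np.sin(2.0 * np.pi * days / 20.0)
power = true_power(temps, days)

# The model uses temperature only; time is not an input.
a, b = np.polyfit(temps, power, deg=1)

def abs_error(gap_days, temp=27.0):
    """Absolute prediction error `gap_days` after the last sample."""
    day = days[-1] + gap_days
    return abs(true_power(temp, day) - (a * temp + b))

errors = [abs_error(gap) for gap in (1, 30, 90)]
print([round(e, 2) for e in errors])
```

With this kind of drift, the error at the same temperature grows as the prediction moves further from the sampled period.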
Updating the model continually with new data is a method often used for prediction in Control or SPC. One easy way is to set a rule such as "use the most recent three months of data for the model."
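The "recent three months" rule can be sketched as follows. The helper `fit_recent`, the 90-day window, and the simulated data with a step change at day 270 are all hypothetical choices for illustration.

```python
import numpy as np

def fit_recent(days, temps, power, window_days, today):
    """Fit power = a * temp + b using only the last `window_days` days."""
    recent = days > today - window_days
    a, b = np.polyfit(temps[recent], power[recent], deg=1)
    return a, b

# Hypothetical year of daily data: the hidden factor steps up at day 270.
days = np.arange(360)
temps = 15.0 + 15.0 * np.sin(2.0 * np.pi * days / 30.0)
hidden = np.where(days < 270, 0.0, 20.0)
power = 100.0 + 2.0 * temps + hidden

today = days[-1]
a_all, b_all = np.polyfit(temps, power, deg=1)            # all history
a_rec, b_rec = fit_recent(days, temps, power, 90, today)  # recent 90 days

# Tomorrow: 27 C, hidden factor still at its new level.
actual = 100.0 + 2.0 * 27.0 + 20.0
error_all = abs(actual - (a_all * 27.0 + b_all))
error_rec = abs(actual - (a_rec * 27.0 + b_rec))
print(f"all data: {error_all:.1f}, recent window: {error_rec:.1f}")
```

The window length is a trade-off: a shorter window follows changes faster but leaves fewer samples for fitting, so a rule like three months is a practical compromise rather than an optimum.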
If the model does not depend on time, we have found a law of nature. Checking reproducibility is important for finding such a law.
If we understand the cause of the failure, we can apply data analysis methods to it.
The aim of such analysis is to find a temporary truth, not a universal truth. It is a kind of Data Mining. It is used for short-term strategy and as a hint for cause-and-effect analysis.
And even if we cannot build a model for long-term data, if we can build one for short-term data, we can look for the reason why the long-term model cannot be built.