## Wolfer sunspot numbers, 1770 to 1869

The best way to learn how to forecast is to actually do it. Each student in the course will be required to complete a forecasting project that will count for 30% of the total marks in the course. The purpose of the project is to give the student “hands-on” experience with the methodological tools that a professional forecaster uses to prepare actual forecasts.

Pure time series methods (Box-Jenkins, exponential smoothing and time series decomposition models) rely on the past history of the forecast variable itself to predict its future values. Other procedures (transfer function, vector autoregression and classical regression models) use two or more variables to formulate efficient forecasts. For the purposes of the project, univariate forecasting methods will be sufficient.

The project should consist of a detailed typed description, not exceeding 15 pages, of the student’s experience with the application of alternative univariate forecasting methods. It must contain an introductory statement justifying why the variable chosen is interesting from a forecasting perspective. It should highlight (a) the time series properties of the forecast variable, supported by suitable plots and graphs; (b) the identification, estimation and diagnostic procedures used to select the best model; (c) the interpretation of the forecast results and their simple graphical presentation for the benefit of potential (non-specialist) users of the forecast; (d) a comparative evaluation of the forecast accuracy of the alternative models that have been considered; and (e) a concluding section that discusses the limitations of the results and describes any insights the student has gained about the underlying data generating mechanism for the forecast variable.

GUIDELINES FOR TIME SERIES MODELLING

A time series analyst does not begin with an a priori model which she/he believes best describes the data at hand. Instead she/he studies data patterns to determine the best model. The difficulty with this approach is that, in general, several different models may provide a useful description of the data. The time series analyst must apply scientific search procedures in order to select the model that best describes the historical pattern of the time series variable under study. The fitted model is then extrapolated in order to generate forecasts of the actual future values of the variable. The accuracy of alternative forecasting methods (models) is tested by confronting each model’s predictions with actual future values of the variable. This is achieved by splitting the sample into two parts: (a) an estimation part and (b) a test part. If these ‘out-of-sample’ forecasts are poor, a new model is identified, estimated and checked for adequacy. The following general steps should be followed in the selection of a forecasting model.
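The estimation/test split described above can be sketched in a few lines. This is a minimal illustration, not a prescribed procedure: the helper name `split_series`, the hold-out length of 10 and the placeholder values are all assumptions made for the example.

```python
def split_series(series, n_test):
    """Split a series into an estimation part and a test part.

    Time order is preserved: the last n_test observations are held out.
    """
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]


# Placeholder values standing in for 100 annual observations (NOT real data).
y = [float(t) for t in range(100)]

# Hold out the last 10 observations for out-of-sample evaluation (an assumption).
estimation, test = split_series(y, n_test=10)
print(len(estimation), len(test))
```

Models are fitted on `estimation` only; `test` is touched once, when the competing forecasts are compared.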

STEP 1: Decide what variable to forecast. It is best to forecast a variable in which one has some real interest.

STEP 2: Select the sample period carefully. The sample should be large, consisting of at least 50 (monthly, quarterly) observations. The identification of the true pattern in the data requires observations over many historical periods. A small sample may lead to the identification of the wrong pattern and to inaccurate forecasts.

STEP 3: Plot the series against time. This is valuable for more than one reason.
A time plot can reveal if there are abnormalities in the series in the form of sudden “structural breaks” or outliers. If outliers represent keyboard (data-entry) errors, they should be corrected. If they are the result of unusual events (wars, earthquakes), they should either be removed or replaced with averages of neighbouring values. Structural shifts in the data can be problematic. If their causes and exact timing can be identified a priori, dummy variables may be used to capture their effects in the modelling process.
A time plot can reveal if the series contains a strong seasonal pattern. If no distinct seasonal movements are discernible, no seasonal adjustment may be necessary. If a repeated seasonal pattern is evident from the plot, the series is said to be “seasonal” and should be deseasonalized (i.e., the seasonal pattern removed) before applying formal model fitting techniques to it.
A plot can indicate if there is a trend in the series. If an upward or downward trend exists, the series is said to be mean non-stationary. The trend must be removed before applying model identification tools to the series. Simple differencing once, in the case of most economic and business time series, removes the (stochastic) trend and makes the series mean-stationary.
A plot may show whether the variance of the series changes over time. If the amplitude of the fluctuations of the series increases or decreases over time, the series is said to be variance non-stationary. Apply either a log or a log-difference transformation to make the series variance-stationary.
Finally, it should be noted that stationarity of the series also requires that its autoregressive structure be constant over time. A visual examination of the plot of the autocorrelation function (ACF) of the series can quickly reveal if the series is stationary or not. The ACF of a stationary series dies out to zero within a few time periods.
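The differencing and ACF checks described in Step 3 can be sketched as follows. This is an illustrative sketch using only the Python standard library; the function names and the synthetic series (a linear trend plus an 11-period cycle) are assumptions for the example, not the actual data.

```python
import math


def difference(series, lag=1):
    """Return the lag-differenced series y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sample_acf(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


# Synthetic placeholder series: linear trend plus a cycle (NOT real data).
y = [0.5 * t + 10 * math.sin(2 * math.pi * t / 11) for t in range(100)]

acf_level = sample_acf(y, 20)             # decays slowly because of the trend
acf_diff = sample_acf(difference(y), 20)  # trend removed by first differencing

# Rough +/- 2/sqrt(n) band often used to judge whether r_k differs from zero.
band = 2 / math.sqrt(len(y))
print(round(acf_level[1], 3), round(acf_diff[1], 3), round(band, 3))
```

A slowly decaying `acf_level` is the usual visual signal of mean non-stationarity; after differencing, the remaining autocorrelation reflects the cycle rather than the trend.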

STEP 4: Aim for simple models first and complex ones later, if necessary. The tool-kit for univariate forecasting methods includes both relatively simple and sophisticated models. As a general rule, start with simple models and gradually work your way up toward more complex models. In this way an orderly model search can take place, checking that models are of minimum complexity and ensuring that variables are introduced only if they contribute (significantly) to the predictive performance of the model. A well-known slogan among time series specialists is “Be parsimonious”.

STEP 5: Having removed seasonality and made the series stationary, proceed iteratively as follows:

Model Identification (specification)
Model estimation (fitting)
Diagnostic checking (criticism)
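For the identification stage listed above, the partial autocorrelation function can be computed from the autocorrelations with the Durbin-Levinson recursion. The sketch below is one possible implementation (the function name is an assumption); it is checked against the theoretical ACF of an AR(1) process with coefficient 0.6, for which the partial autocorrelations should cut off after lag 1.

```python
def pacf_durbin_levinson(acf, max_lag):
    """Partial autocorrelations phi_kk for k = 1..max_lag.

    `acf` must hold autocorrelations r_0, ..., r_max_lag with r_0 = 1.
    """
    pacf = []
    phi_prev = []  # coefficients phi_{k-1,1}, ..., phi_{k-1,k-1}
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = acf[1]
            phi = [phi_kk]
        else:
            num = acf[k] - sum(phi_prev[j] * acf[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * acf[j + 1] for j in range(k - 1))
            phi_kk = num / den
            # Update the lower-order coefficients, then append phi_kk.
            phi = [
                phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
            ] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf


# Theoretical ACF of an AR(1) process with coefficient 0.6: r_k = 0.6 ** k.
ar1_acf = [0.6 ** k for k in range(6)]
ar1_pacf = pacf_durbin_levinson(ar1_acf, 5)
print([round(v, 6) for v in ar1_pacf])
```

In practice the same function would be applied to sample autocorrelations: a PACF that cuts off after lag p points toward an AR(p) model, while an ACF that cuts off after lag q points toward an MA(q) model.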

Identification involves the use of statistical tools such as range-mean plots, the autocorrelation function and the partial autocorrelation function, which help to arrive at initial guesses about the appropriate number of autoregressive and moving average terms to include in the model. Estimation (fitting) involves using efficient (maximum likelihood) procedures for estimating the model parameters, their standard errors and correlations, and the residual variance and covariance. Diagnostic checking involves tests that look for model inadequacies. The most important of such tests include:

The Durbin-Watson (D-W) test and the Q-test. These should be applied to ensure that the residuals from the estimated model are randomly distributed, with no systematic pattern left in them.
Over- and under-fitting tests may be performed to study model adequacy.
An analysis of the time plot of residuals should be conducted to further confirm randomness and stationarity.
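The residual checks listed above can be sketched with two small functions. The text does not say which form of the Q-test is intended; the sketch below assumes the Ljung-Box form, which is one common choice. The function names and the simulated residuals are assumptions made for the example.

```python
import random


def durbin_watson(resid):
    """Durbin-Watson statistic; values near 2 suggest no lag-1 autocorrelation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den


def ljung_box_q(resid, max_lag):
    """Ljung-Box Q statistic based on the first max_lag residual autocorrelations."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [e - mean for e in resid]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Simulated white-noise "residuals" used as a placeholder (NOT model output).
random.seed(0)
resid = [random.gauss(0.0, 1.0) for _ in range(90)]

dw = durbin_watson(resid)
q = ljung_box_q(resid, max_lag=10)
print(round(dw, 3), round(q, 3))
```

Q is compared with a chi-square critical value whose degrees of freedom equal the number of lags minus the number of estimated ARMA parameters; a Q above the critical value indicates that the residuals still contain systematic pattern.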

STEP 6: Forecast formulation and evaluation. The model(s) estimated in Step 5 are extrapolated in order to derive forecasts of the future values of the time series.
Forecast errors from each model should be calculated in order to compare the models’ forecast accuracy. The standard deviations of the forecast errors and the associated 95% confidence intervals at the one-, two-, three- and four-step-ahead forecast horizons should be calculated.
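As an illustration of the calculations requested in Step 6, the sketch below produces point forecasts, forecast-error standard deviations and approximate 95% intervals for an AR(1) model, together with two common accuracy measures. The AR(1) form, the parameter values (`phi=0.8`, `sigma=1.0`, `mean=0.0`) and all function names are assumptions for the example; the intervals assume normal errors and ignore parameter estimation uncertainty.

```python
import math


def ar1_forecast(last_value, mean, phi, sigma, horizon, z=1.96):
    """Forecasts for y[t] - mean = phi * (y[t-1] - mean) + e[t].

    Returns tuples (h, point, std_error, lower, upper) for h = 1..horizon.
    The h-step error variance is sigma^2 * sum(phi^(2j), j = 0..h-1).
    """
    results = []
    variance = 0.0
    for h in range(1, horizon + 1):
        point = mean + phi ** h * (last_value - mean)
        variance += sigma ** 2 * phi ** (2 * (h - 1))
        se = math.sqrt(variance)
        results.append((h, point, se, point - z * se, point + z * se))
    return results


def rmse(actual, forecast):
    """Root mean squared forecast error."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(actual, forecast):
    """Mean absolute forecast error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical parameter values chosen only to illustrate the arithmetic.
forecasts = ar1_forecast(last_value=10.0, mean=0.0, phi=0.8, sigma=1.0, horizon=4)
for h, point, se, lower, upper in forecasts:
    print(h, round(point, 3), round(se, 3), round(lower, 3), round(upper, 3))
```

The standard error grows with the horizon, so the intervals widen as the forecast moves further from the last observation. Computing `rmse` and `mae` on the test part for each candidate model gives a like-for-like accuracy comparison.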

STEP 7: Use appropriate graphs and plots to present the forecast results in simple terms so that the potential user of the forecasts may easily understand the results. Decomposition of both data and forecasts into trend, seasonal and irregular components, along with their time plots, is recommended.