Research Article

Monthly Rainfall Forecast over the Sahel Region Using SARIMA Models

Bello E1, Aganbi B1, Gbode E2 and Aremu O2*

1 Nigerian Meteorological Agency, NiMet, Nigeria
2 Federal University of Technology, Akure (FUTA), Nigeria

Received date: April 26, 2018; Accepted date: June 05, 2018; Published date: June 18, 2018

*Corresponding author: Aremu Oluwasegun, Federal University of Technology, Akure (FUTA), Nigeria, E-mail: segunaremu007@hotmail.com, s.aremu@nimet.gov.ng
Abstract

Monthly rainfall data spanning thirty-seven years (1981-2017) for some stations in the Sahel region of Nigeria (covering Katsina, Zamfara, Maiduguri, Sokoto and Yobe) were obtained from the archive of the Nigerian Meteorological Agency, NiMet. The data were then subdivided into segments (1981-2014, 1981-2015 and 1981-2016) in order to train SARIMA models for the stations in the region. The models were subsequently validated and used to forecast monthly rainfall for these stations. Student's t-test was then employed to ascertain differences between the forecast and actual rainfall data for 2015, 2016 and 2017. The results of the t-test, carried out at the 1% and 5% levels of significance, indicated that there was no significant difference between the forecast and actual rainfall. Accordingly, it is recommended that the developed SARIMA models, which differ from one station to another, be adopted and used to generate forecasts of monthly rainfall for 2018. Although the t-test indicated no significant difference between the model forecasts and the corresponding actual data, the models appeared to perform better over Gusau (2016 and 2017), Katsina (2016 and 2017), Potiskum (2016) and Sokoto.

Keywords: SARIMA model; NiMet; Climate data; ACF; PACF; t-test; p-value; Null hypothesis

Introduction

Nigeria’s climate has witnessed significant variability, leading to extreme events such as the 2012 flood that disrupted socio-economic activities (Afiesimama et al., 2013). Conversely, drought has also occurred, particularly over the Sahel region, where rainfall deficit is a major source of concern and where the bulk of the agricultural production that drives socio-economic activities is centred. Trends have shown that climate variability (Amekudzi et al., 2015) can result in late onset and early cessation of the rainy season in this region. However, rigorous analysis has shown that natural climate variability alone cannot explain the long-term trend of changing extremes in temperature and precipitation (Meehl et al., 2007; Gutowski et al., 2008; Stott et al., 2010; Christidis et al., 2011).

While many societies are taking measures to cope with historical weather extremes, new and more extreme events have the potential to overwhelm the structures and programmes put in place to mitigate their impacts (Solomon et al., 2008). The potential for these events to cause irrevocable damage to infrastructure, in addition to precipitating natural resource conflicts, makes it necessary to develop tools to quantify and analyse their impact, especially tools or mechanisms that can account for the trend in climate and its variability. This would in turn enable effective adaptation strategies. Effective mitigation depends on the ability to act proactively, which is only possible through evaluation and prediction (Adefolalu, 2010).

The primary aims of seasonal rainfall prediction are to:
• Forewarn of imminent and recurring extreme climate events that could lead to disasters.
• Protect lives and livelihoods in disaster-prone areas, especially where the disasters are caused by episodic climate events.
• Develop and establish indices and indicators of extreme events.

A sizable body of work has been dedicated to the science of rainfall prediction. It ranges from the thermodynamic models of Omotosho (1999), the artificial neural networks of Cristian et al. (2014) and the ARIMA models of Etuk et al. (2013), through the work of Edwin and Martins (2014), who examined the stochastic characteristics of monthly rainfall in Ilorin, to the SARIMA(0,0,0)(1,1,1)12 models of Akpakta et al. (2015) for monthly rainfall over Umuahia.

However, comparatively little research has been dedicated to forecasting monthly rainfall, and this creates a gap, especially in the face of increasing extreme weather events, particularly over the Sahel. It is therefore imperative to explore effective approaches to forecasting monthly rainfall. SARIMA models are veritable tools in this regard, capable of capturing or modelling trends and changes in climate that could be driving increases in extreme weather or rainfall events.

ARIMA/SARIMA Models

ARIMA is an acronym for autoregressive integrated moving average. The “AR” part of the model is the autoregressive component, the “I” the integrated component and the “MA” the moving average component.

These models are used to fit time series data for the purpose of understanding the data and predicting future values in the time series (people.duke.edu). A major requirement in applying the model to time series is to ensure that the time series is stationary.

The AR part of ARIMA represents the term in the model obtained from regressing the variable of interest on itself; the MA part implies a linear combination of the lagged error terms, while the I part indicates the ordinary differencing required to make the time series stationary, although this may have to be coupled with other forms of transformation.

Stationarity of a time series implies that its statistical properties, such as the mean, the variance and the autocorrelation structure, do not change significantly with time. A non-stationary time series must be transformed to render it stationary before an ARIMA model is fitted to it. If the Dickey-Fuller test suggests that a time series is non-stationary, the series can be transformed by differencing and/or taking the logarithm of the data (Wikipedia on Dickey-Fuller test, 2018).

The models are mostly expressed in the form ARIMA(p,d,q), where p, d and q are non-negative integers giving the autoregressive order, the order of differencing and the moving average order respectively. For seasonal data the corresponding seasonal models are written ARIMA(p,d,q)(P,D,Q)m, where m is the number of periods in each season (m = 12 for monthly data with an annual cycle) and, like p, d and q, the seasonal orders P, D and Q are non-negative integers.

Study Area

The Sahel Savannah covers the extreme northern part of Nigeria, with the Sudan savannah bordering it to the south. It occupies about 18 130 km2 of the extreme northeast corner of Nigeria. The region is a semi-arid area covering Maiduguri, Yobe, parts of Kebbi, Zamfara and Sokoto states. It is home to a large population of animals and is bedevilled by extreme poverty, climate change, armed conflict and insecurity. It is characterized by a short rainy season that lasts about three to four months (June-September), peaks in August and delivers between 380 and 930 mm of rainfall per annum. Inter- and intra-seasonal rainfall variability is high across the region. It is hot, sunny, dry and windy all year round, with temperatures ranging between 36 and 42 °C in the hot season. The main characteristics of the area are its desert nature and short grasses. The region is a major player in agricultural production and is also the worst hit by weather fluctuations. The zone is characterized by plants such as Cenchrus biflorus and Acacia raddiana; the shrubs predominantly scattered across it are African myrrh (Commiphora africana) and Leptadenia spartum. Irrigation farming is common owing to the short duration of rainfall, and the area is dominated by herdsmen. Common crops include maize, sorghum, cowpea and rice, amongst others.

It is important to state that the region is largely agrarian and is the centre of the bulk of rain-fed agricultural activity in Nigeria. Ironically, the area has suffered from the vagaries of extreme weather events that have had untold impact on the agricultural sector, mainly because of the short growing season, intense heat, overgrazing, conflict and population explosion, which together have resulted in desertification. Hence forecasts of rainfall, particularly on the monthly scale, are of invaluable significance.

Data and Methodology

Data

Monthly rainfall data for all the stations in the Sahel region, spanning at least thirty years (1981-2017), were obtained from the archive of the Nigerian Meteorological Agency, NiMet. The data were then segmented as follows: 1981-2014 for training the model and forecasting 2015 rainfall, which was then validated against the actual rainfall data of 2015; 1981-2015 for training the model and forecasting 2016 rainfall, validated against the actual data of 2016; 1981-2016 for training the model and forecasting 2017 rainfall, validated against the actual data of 2017; and finally 1981-2017 for training the model and forecasting the monthly rainfall for 2018. This was done for all the stations in the Sahel.
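The segmentation described above amounts to an expanding training window with a one-year forecast target. A minimal sketch in Python follows; the function name `expanding_window_splits` is hypothetical and is used here only for illustration, not taken from the study.

```python
def expanding_window_splits(first_year, last_train_years):
    """Return (training_years, target_year) pairs for an expanding window.

    Each training set starts at `first_year`; the target is the year
    immediately after the last training year.
    """
    splits = []
    for last in last_train_years:
        train = list(range(first_year, last + 1))
        splits.append((train, last + 1))
    return splits


# Training periods as described in the text: 1981-2014, ..., 1981-2017
splits = expanding_window_splits(1981, [2014, 2015, 2016, 2017])

for train, target in splits:
    months = len(train) * 12  # monthly data, twelve values per year
    print(f"train {train[0]}-{train[-1]} ({months} months) -> forecast {target}")
```

The first three splits have observed data available for validation (2015, 2016 and 2017), while the last one produces the 2018 forecast.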

Methodology

As already stated, the model used in this research is a stochastic model called ARIMA, an acronym for autoregressive integrated moving average. Although a detailed review of the ARIMA literature is not given in this study, the brief definitions of its essential components outlined here suffice for the purpose of this research. Robert Nau (people.duke.edu, 2018), Wikipedia (on the rules for identifying the orders of AR and MA terms of ARIMA models, 2018), Cross Validated (on SARIMA models, 2018), Dickey-Fuller Test (2018) and Statistica (on identifying patterns in time series data, 2018), amongst others, provide detailed insight into SARIMA models.

General Form/Equation of ARIMA

The SARIMA model is generally expressed as SARIMA(p,d,q)(P,D,Q)m. This notation corresponds to the general form of the multiplicative ARIMA model given by
$$\Phi(B^m)\,\phi(B)\,\nabla_m^D\,\nabla^d\,Y_t = \Theta(B^m)\,\theta(B)\,\varepsilon_t,$$

With $\varepsilon_t$ as the white noise process and $B$ as the backshift operator, the terms of this function/model can be further expressed as follows:

$$\nabla_m Y_t = Y_t - Y_{t-m}, \qquad \nabla Y_t = Y_t - Y_{t-1} \tag{1}$$

$$\Phi(B^m) = 1 - \Phi_1 B^m - \dots - \Phi_P B^{Pm} \tag{2}$$

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \dots - \phi_p B^p \tag{3}$$

$$\Theta(B^m) = 1 + \Theta_1 B^m + \dots + \Theta_Q B^{Qm} \tag{4}$$

$$\theta(B) = 1 + \theta_1 B + \dots + \theta_q B^q \tag{5}$$
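The two differencing operators in equation (1) are simple to state in code. The sketch below is illustrative only: the helper `difference` and the toy series (a linear trend plus a period-4 seasonal pattern) are assumptions made for the example, not data from the study.

```python
def difference(series, lag=1):
    """Return the lag-`lag` difference y[t] - y[t - lag].

    lag=1 gives the ordinary difference; lag=m gives the seasonal
    difference with period m. The result is `lag` values shorter.
    """
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical toy series: linear trend plus a repeating period-4 pattern
m = 4
seasonal_pattern = [0.0, 5.0, 10.0, 5.0]
y = [0.5 * t + seasonal_pattern[t % m] for t in range(16)]

# Seasonal differencing removes the repeating pattern and leaves the
# per-season increase of the trend (0.5 * 4 = 2.0 at every step).
seasonally_differenced = difference(y, lag=m)

# A further ordinary difference removes the remaining constant.
fully_differenced = difference(seasonally_differenced, lag=1)

print(seasonally_differenced)
print(fully_differenced)
```

For monthly rainfall the seasonal lag would be m = 12, so one seasonal difference compares each month with the same month of the previous year.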

Autocorrelation and Partial Autocorrelation

Most time series patterns can be described in terms of two basic classes of components: trend and seasonality. These two classes of components may coexist in real-life data. The former represents a general systematic linear or (most often) nonlinear component that changes over time and does not repeat within the time range captured by the data set. The latter repeats itself at systematic intervals over time, indicating seasonal dependencies that can be measured by the autocorrelation function. The autocorrelation function (ACF) is the correlation between a time series and a lagged version of itself, displayed as the serial correlation coefficients of the series at various lags, while the partial autocorrelation function (PACF) is the amount of correlation between the time series and a lagged version of itself that is not explained by the lower-order lags. For instance, if a time series variable Y is regressed against x1, x2 and x3, the partial correlation between Y and x3 is the amount of correlation between Y and x3 that is not explained by their common correlations with x1 and x2.
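To make the two definitions concrete, the following sketch computes the sample ACF directly and obtains the PACF through the Durbin-Levinson recursion. This is one standard way of computing them and is not necessarily the routine used in the study; the 12-value annual cycle is a hypothetical rainfall-like pattern chosen for illustration.

```python
def acf(series, max_lag):
    """Sample autocorrelations r[0..max_lag], using the 1/n (biased) estimator."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Partial autocorrelations via the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    result = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_prev = [
            phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)
        ] + [phi_kk]
        result.append(phi_kk)
    return result


# Hypothetical annual cycle (mm per month), repeated for ten years
cycle = [0, 0, 0, 5, 30, 70, 150, 220, 100, 20, 0, 0]
y = [float(v) for v in cycle * 10]

r = acf(y, 12)
p = pacf(y, 12)
print("ACF at lags 6 and 12:", round(r[6], 3), round(r[12], 3))
print("PACF at lags 1 and 2:", round(p[1], 3), round(p[2], 3))
```

A strongly seasonal series such as this one shows a large positive autocorrelation at the seasonal lag (12) and a negative one half a cycle away (lag 6), which is the kind of pattern that motivates seasonal differencing.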

Identifying the Orders of ARIMA Models from ACF and PACF

The approach employed to identify suitable terms for the ARIMA model in this research is outlined briefly in this section. Detailed rules for identifying the terms of SARIMA models can be found in (people.duke.edu).

There is a systematic method of identifying the orders of ARIMA models. By looking at the ACF and PACF, one can tentatively identify the number of AR and/or MA terms that are needed. For instance, if a time series has positive autocorrelations out to a high number of lags (say 12 or more), then the series requires a higher order of differencing, here seasonal differencing, to de-trend it. By inspection, one can then determine the number of AR terms needed to explain the autocorrelation in the time series. Generally, the orders of the AR terms are derived from the PACF plot, while the orders of the MA terms are obtained from the ACF plot.

If the PACF displays a sharp “cut-off” while the ACF “decays” more slowly, the stationarized time series displays an “AR” signature. On the other hand, if the ACF displays a sharp cut-off while the PACF decays more slowly, the time series displays an “MA” signature.

In the former case, consider adding AR terms to the model. If the cut-off in the PACF plot occurs at lag p, this indicates that p AR terms should be added. AR terms are commonly associated with time series that are slightly under-differenced; indeed, an AR(1) term with a coefficient close to one acts much like a first-order difference. On the other hand, if the ACF cut-off occurs at lag q, then consider adding q MA terms to the model. MA terms are commonly associated with time series that are slightly over-differenced.

The seasonal part of an ARIMA model has the same structure as the non-seasonal part. In identifying a seasonal model, the first step is to determine whether or not seasonal differencing is required, in addition to or instead of non-seasonal differencing. As a rule, no more than two orders of differencing in total (seasonal and non-seasonal combined) should be used.

If the time series has a strong and consistent seasonal pattern (this can be detected in the time series plot or in the ACF and PACF), it is reasonable to apply one order of seasonal differencing.

The signature SAR/SMA behaviour is similar to that of pure AR/MA processes, except that the pattern occurs across multiples of the seasonal lag m in the PACF and ACF. A pure SAR(1) process has spikes in the ACF at lags m, 2m, 3m and so on, while the PACF cuts off after lag m. Conversely, a pure SMA(1) process has spikes in the PACF at lags m, 2m, 3m and so on, while the ACF cuts off after lag m.

These are some of the rules considered while designing the arima model for our time series data.

The non-stationarity observed in most of the time series made it necessary to transform them using the natural logarithm (“ln”). This was done only after adding one to the data, in order to avoid the errors that would result from taking the natural logarithm of zero, since our data set contains a lot of zeroes.
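The ln(1 + x) transform and its inverse can be sketched with Python's standard library as below. The helper names are hypothetical, and the clipping of back-transformed values at zero is an added assumption (rainfall cannot be negative) rather than a step stated in the text. The sample values are the 2017 observed monthly rainfall over Potiskum from Table 2b.

```python
import math


def to_log_scale(rain):
    """Apply ln(1 + x), which is defined for dry months (x = 0)."""
    return [math.log1p(v) for v in rain]


def from_log_scale(values):
    """Invert ln(1 + x); clip at zero since rainfall cannot be negative."""
    return [max(0.0, math.expm1(v)) for v in values]


# Observed 2017 monthly rainfall over Potiskum (Table 2b), Jan-Dec
potiskum_2017 = [0.0, 0.0, 0.0, 0.0, 57.8, 99.7, 129.3, 161.7, 48.2, 0.0, 0.0, 0.0]

logged = to_log_scale(potiskum_2017)
restored = from_log_scale(logged)

print([round(v, 3) for v in logged])
print([round(v, 1) for v in restored])
```

Forecasts produced on the log scale need the same inverse step before they are compared with observed rainfall.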

The seasonal patterns tentatively observed in the time series and in the ACF and PACF plots called for one order of seasonal differencing. This helped to de-trend the data, and the ACF and PACF plots of the differenced series revealed the orders of the AR, SAR, MA and SMA terms required for a parsimonious model. The approach was effective in identifying the seasonal order in the time series as well as in determining a tentative model.

The residuals from the tentative model informed its further adjustment until the parsimonious model was determined. The residual diagnostics used here are the ACF, p-value and Q-Q plots of the residuals from the tentative model; depending on these plots, the tentative model is either adjusted or accepted as the suitable model for the data.

The null hypothesis considered for the p-values of the residuals is that “the autocorrelations of the residuals at the various lags are zero (not significant)”; in other words, no autocorrelation remains in the residuals from the model. A high p-value would therefore indicate that the ARIMA model has extracted the autocorrelations in the data and minimized the error between the observed and simulated data, leaving residuals with insignificant autocorrelations.

The p-value, in simple terms, is the probability, given that the null hypothesis above is true, of obtaining residual autocorrelations equal to or more extreme than those observed. Accordingly, a high p-value implies that the null hypothesis cannot be rejected.
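The text does not name the test behind these p-values. A common choice for this null hypothesis is the Ljung-Box portmanteau test, and the sketch below assumes it. The statistic Q = n(n+2) Σ r_k²/(n−k) is compared with a chi-square critical value; the constant 21.026 is the upper 5% point for 12 degrees of freedom from standard tables, and both test series are synthetic.

```python
import random


def acf(series, max_lag):
    """Sample autocorrelations r[0..max_lag], using the 1/n estimator."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic over lags 1..max_lag (assumed test, see lead-in)."""
    n = len(residuals)
    rho = acf(residuals, max_lag)
    return n * (n + 2) * sum(rho[k] ** 2 / (n - k) for k in range(1, max_lag + 1))


# Upper 5% point of chi-square with 12 degrees of freedom (standard tables).
# For residuals of a fitted model the degrees of freedom are usually reduced
# by the number of estimated ARMA coefficients; that adjustment is omitted here.
CHI2_95_DF12 = 21.026

# Synthetic "residuals" that still contain a seasonal cycle
cycle = [0, 0, 0, 5, 30, 70, 150, 220, 100, 20, 0, 0]
seasonal = [float(v) for v in cycle * 10]
q_seasonal = ljung_box_q(seasonal, 12)

# Synthetic white noise of the same length
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(120)]
q_noise = ljung_box_q(noise, 12)

print("seasonal series: Q =", round(q_seasonal, 1), "reject H0:", q_seasonal > CHI2_95_DF12)
print("white noise:     Q =", round(q_noise, 1), "reject H0:", q_noise > CHI2_95_DF12)
```

A Q value above the critical value corresponds to a low p-value, meaning autocorrelation remains in the residuals and the tentative model should be adjusted.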

The associated quantile-quantile (Q-Q) plots indicate the degree to which the residuals are normally distributed: the more closely the points align to a straight line, the closer the residuals are to a normal distribution.

Therefore, for an ARIMA model to be accepted as parsimonious, the ensuing residuals should be pattern-less (random), with insignificant autocorrelations at the various lags and an approximately normal distribution; these properties are checked with the autocorrelation functions and the Q-Q plots respectively.
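The straight-line reading of a Q-Q plot can be expressed numerically. The sketch below pairs the sorted standardized residuals with standard normal quantiles and measures how linear the relationship is with a Pearson correlation. The plotting positions (i + 0.5)/n, the helper names and both samples are illustrative assumptions, not taken from the study.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Pair standard normal quantiles with the sorted standardized residuals."""
    n = len(residuals)
    mu, sigma = mean(residuals), stdev(residuals)
    ordered = sorted((r - mu) / sigma for r in residuals)
    theoretical = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def straightness(points):
    """Pearson correlation of the Q-Q points; close to 1 means near a line."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Synthetic sample lying exactly on normal quantiles (scaled by 2)
normal_like = [NormalDist(0.0, 2.0).inv_cdf((i + 0.5) / 40) for i in range(40)]
# Synthetic right-skewed sample obtained by squaring the values above
skewed = [v * v for v in normal_like]

r_normal = straightness(qq_points(normal_like))
r_skewed = straightness(qq_points(skewed))
print("normal-like:", round(r_normal, 4), " skewed:", round(r_skewed, 4))
```

A value close to 1 corresponds to Q-Q points lying near a straight line, while departures from normality such as skewness pull the value down.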

Thus, in the course of developing suitable ARIMA models for our data, the residuals of each fitted model were subjected to these statistical tests, whose results indicated that the residuals are pattern-less white noise with insignificant autocorrelations at the various lags.

As stated earlier, the research included forecasting monthly rainfall and validating the model. This was done by using the monthly rainfall data from 1981-2014 as training data to forecast monthly rainfall for 2015, then validating against the actual data of 2015; using the data from 1981-2015 to forecast the rainfall in 2016, then validating against the actual data of 2016; and using the data from 1981-2016 to forecast rainfall for 2017, then validating against the actual data of 2017. Thereafter the forecast rainfall was compared with the observed rainfall using the t-test.

Determining Significant Difference Between Simulated and the Observed Data

Student’s t-test (Wikipedia, 2018) is generally used to test the difference between the population means of two distributions. It is often used where the populations are believed to be near-normally distributed with nearly equal standard deviations, and where the samples drawn for the test are small, i.e. n < 30. For instance, given two samples with sizes $n_1$ and $n_2$, means $\bar{x}_1$ and $\bar{x}_2$, and standard deviations $s_1$ and $s_2$ respectively, the common (pooled) standard deviation is

$$S_p = \sqrt{\frac{(n_1-1)\,s_1^2 + (n_2-1)\,s_2^2}{n_1+n_2-2}}$$

The standard errors of the two sample means are then

$$S_{\bar{x}_1} = \frac{S_p}{\sqrt{n_1}}, \qquad S_{\bar{x}_2} = \frac{S_p}{\sqrt{n_2}},$$

and so the standard error of the difference between the means is

$$S_{(\bar{x}_1-\bar{x}_2)} = \sqrt{S_{\bar{x}_1}^2 + S_{\bar{x}_2}^2}$$

Thus the t-score is given by

$$t\text{-score} = \frac{\bar{x}_1 - \bar{x}_2}{S_{(\bar{x}_1-\bar{x}_2)}}$$

with $n_1+n_2-2$ degrees of freedom.

The calculated t-score is then compared with the critical t-value for n1+n2-2 degrees of freedom, obtained from t-tables at the 5% level of significance.

Thus, with $\bar{x}_1$ as the mean monthly observed rainfall of a given year and $\bar{x}_2$ as the corresponding mean of the monthly forecast rainfall, the test was used to ascertain whether there were significant differences between the actual and forecast rainfall.
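The pooled t-test can be traced step by step in a few lines of Python. The sketch below applies the formulas to the 2017 observed and forecast monthly rainfall over Potiskum from Table 2b. The function name `pooled_t_test` is hypothetical, and the critical value 2.074 (two-sided 5% level, 22 degrees of freedom) is taken from standard t-tables as an illustrative threshold; with twelve monthly values per sample the formula n1+n2-2 gives 22 degrees of freedom.

```python
import math
from statistics import mean, stdev


def pooled_t_test(sample1, sample2):
    """Two-sample t-test with a pooled standard deviation.

    Returns (t_score, standard_error_of_difference, degrees_of_freedom).
    """
    n1, n2 = len(sample1), len(sample2)
    s1, s2 = stdev(sample1), stdev(sample2)
    dof = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / dof)
    se1 = sp / math.sqrt(n1)
    se2 = sp / math.sqrt(n2)
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    t_score = (mean(sample1) - mean(sample2)) / se_diff
    return t_score, se_diff, dof


# Potiskum, 2017 (Table 2b), Jan-Dec
observed = [0.0, 0.0, 0.0, 0.0, 57.8, 99.7, 129.3, 161.7, 48.2, 0.0, 0.0, 0.0]
forecast = [0.0, 0.0, 0.0, 4.0, 27.1, 73.6, 184.2, 251.8, 119.8, 18.6, 0.0, 0.0]

t_score, se_diff, dof = pooled_t_test(observed, forecast)

# Two-sided 5% critical value for 22 degrees of freedom (standard t-tables)
T_CRIT_5PCT_DF22 = 2.074

print(f"t = {t_score:.2f}, standard error = {se_diff:.1f}, dof = {dof}")
print("significant difference:", abs(t_score) > T_CRIT_5PCT_DF22)
```

The sign of the t-score simply reflects whether the observed mean is below (negative) or above (positive) the forecast mean; only its magnitude is compared with the critical value.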

Discussion of Results

The results of the model validation (Figures 1-5 and Tables 1-3), based on the monthly rainfall forecasts for Potiskum, Sokoto, Maiduguri, Katsina and Gusau in 2016 and 2017, are presented in this section.

Generally, the results of the model validation, i.e. the comparison between the forecast and actual rainfall, are reasonable, since in most cases there was no significant difference between them, based on the t-test carried out at the 1 percent level of significance.

Over Potiskum, the forecast (Figure 1a) compared favourably with the observed rainfall in 2016 (see Figures 1-5 and Tables 1-3).

The reliability of the SARIMA model (see Table 1), found to be the most parsimonious on the basis of the AIC computed for the log-transformed data, was assessed at the 99 percent confidence level, otherwise known as the one percent level of significance. At this level of significance, the t-test showed that there was no significant difference (no sig. dif.) between the forecast and actual rainfall in 2016 (Table 3a), as the calculated t-value was less than the critical t-value (2.09; this is the critical value of t at the 1% level of significance and 20 degrees of freedom, used in the t-tests of all the forecasts). The standard error (Table 3a) between the forecast and actual rainfall is relatively low.

The plots of the standardized residuals (Figure 6a) show that most of the autocorrelations in the data used for the 2016 forecast have been removed by the SARIMA model, SARIMA(0,0,7)(1,1,3)12, as buttressed by the corresponding p-values (recall that the null hypothesis is that the autocorrelations of the residuals at the various lags are zero). This reduces the standardized residuals to white noise, as reinforced by the patterns of near randomness in the Q-Q plots (section 3.5).

Figure 1: Time series of monthly rainfall forecast (red line) and actual monthly rainfall (blue line) in (a) 2016, (b) 2017 over Potiskum
Figure 2: As in Figure 1 but over Maiduguri
Figure 3: As in Figure 1 but over Sokoto
Figure 4: As in Figure 1 but over Katsina
Figure 5: As in Figure 1 but over Gusau

Table 1: Terms of the ARIMA model (SARIMA(p,d,q)(P,D,Q)12) used in generating the 2016 and 2017 forecasts

| Station | 2016 | 2017 |
|---|---|---|
| Potiskum | SARIMA(0,0,5)(1,1,3)12 | SARIMA(0,0,5)(1,1,3)12 |
| Maiduguri | SARIMA(0,0,6)(3,1,1)12 | SARIMA(0,0,1)(3,1,1)12 |
| Sokoto | SARIMA(0,0,1)(3,1,1)12 | SARIMA(0,0,1)(3,1,1)12 |
| Katsina | SARIMA(0,0,7)(3,1,1)12 | SARIMA(0,0,7)(3,1,1)12 |
| Gusau | SARIMA(0,0,7)(3,1,1)12 | SARIMA(0,0,7)(3,1,1)12 |

The forecast (Figure 1b) over Potiskum for 2017 deviates slightly from the observed rainfall. However, the difference between them is not significant (Table 3b).

Similarly, the difference between the forecast and observed rainfall appears relatively larger over Potiskum and Sokoto in 2017 (Table 3b, Figure 1b and Figure 3b). However, the respective indicators (Figure 6b and Figure 8b) show that the differences are not significant at the one percent level of significance.

The forecast over Maiduguri in 2017 matched the actual rainfall relatively well, with no significant difference at the 1% level (Table 3b). The best-fit SARIMA model here, (0,0,1)(3,1,1)12, which produced the forecast, extracted most of the autocorrelations in the data, as confirmed by the associated standardized white-noise residuals (Figure 7b). This is also consistent with the patterns observed in the residual ACF, Q-Q and p-value plots.

Table 2a-b: Forecast and observed monthly rainfall (mm) over some states in the Sahel for 2016 (a) and 2017 (b)

(a) 2016

| Month | Potiskum obs. | Potiskum fcst. | Maiduguri obs. | Maiduguri fcst. | Sokoto obs. | Sokoto fcst. | Katsina obs. | Katsina fcst. | Gusau obs. | Gusau fcst. |
|---|---|---|---|---|---|---|---|---|---|---|
| Jan | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.3 | 0.0 |
| Feb | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Mar | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 13.9 | 0.0 | 17.1 | 0.0 |
| Apr | 1.4 | 1.4 | 11.9 | 1.9 | 1.0 | 4.1 | 20.1 | 6.8 | 64.5 | 6.8 |
| May | 30.8 | 18.0 | 78.4 | 18.7 | 151.8 | 53.3 | 131.2 | 77.9 | 89.3 | 79.6 |
| Jun | 71.8 | 84.2 | 79.1 | 62.0 | 171.2 | 91.5 | 149.7 | 110.5 | 200.7 | 110.5 |
| Jul | 188.6 | 152.9 | 68.3 | 182.6 | 280.1 | 212.1 | 215.3 | 180.1 | 283.1 | 184.6 |
| Aug | 251.4 | 225.8 | 306.8 | 215.4 | 180.7 | 235.7 | 317.2 | 284.3 | 310.6 | 282.0 |
| Sep | 123.4 | 90.7 | 196.5 | 117.9 | 175.1 | 128.5 | 76.3 | 152.8 | 265.0 | 155.7 |
| Oct | 28.9 | 15.7 | 0.0 | 10.1 | 0.0 | 12.3 | 0.0 | 23.8 | 4.6 | 23.3 |
| Nov | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Dec | 696.26 | 0 | 740.975 | 0 | 959.875 | 0 | 0.0 | 0 | 0 | 0 |



 

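(b) 2017

| Month | Potiskum obs. | Potiskum fcst. | Maiduguri obs. | Maiduguri fcst. | Sokoto obs. | Sokoto fcst. | Katsina obs. | Katsina fcst. | Gusau obs. | Gusau fcst. |
|---|---|---|---|---|---|---|---|---|---|---|
| Jan | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Feb | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Mar | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Apr | 0.0 | 4.0 | 0.0 | 1.3 | 0.0 | 4.3 | 0.0 | 1.1 | 53.0 | 5.1 |
| May | 57.8 | 27.1 | 52.4 | 22.4 | 25.4 | 61.5 | 9.1 | 13.1 | 83.2 | 84.9 |
| Jun | 99.7 | 73.6 | 135.9 | 110.5 | 67.6 | 95.4 | 110.5 | 73.1 | 200.7 | 118.7 |
| Jul | 129.3 | 184.2 | 227.8 | 184.6 | 125.5 | 209.8 | 143.2 | 166.3 | 168.5 | 203.5 |
| Aug | 161.7 | 251.8 | 236.9 | 282.0 | 125.4 | 244.2 | 186.1 | 208.0 | 223.3 | 298.6 |
| Sep | 48.2 | 119.8 | 59.2 | 155.7 | 94.0 | 125.0 | 195.3 | 81.4 | 220.3 | 176.3 |
| Oct | 0.0 | 18.6 | 0.0 | 23.3 | 2.5 | 6.6 | 0.0 | 7.1 | 14.0 | 19.8 |
| Nov | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| Dec | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.0 | 0.0 |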



Table 3a-b: Indicators of differences between observed and forecast rainfall for 2016 (a) and 2017 (b)

(a) 2016

| Station | AIC | Calculated t-value | Difference (obs. vs fcst.) | Standard error |
|---|---|---|---|---|
| Potiskum | 0.3123424 | 0.292 | No sig. dif. | 13.2 |
| Maiduguri | 0.4325639 | 0.326 | No sig. dif. | 39.7 |
| Sokoto | 0.542171 | 0.511 | No sig. dif. | 36.1 |
| Katsina | 0.5189678 | 0.192 | No sig. dif. | 26.0 |
| Gusau | 0.5185077 | 0.780 | No sig. dif. | 41.6 |



 

(b) 2017

| Station | AIC | Calculated t-value | Difference (obs. vs fcst.) | Standard error |
|---|---|---|---|---|
| Potiskum | 0.2866299 | -0.54 | No sig. dif. | 30.2 |
| Maiduguri | 0.4475341 | -0.16 | No sig. dif. | 27.7 |
| Sokoto | 0.5383566 | -0.93 | No sig. dif. | 34.8 |
| Katsina | 0.5383566 | 0.27 | No sig. dif. | 27.8 |
| Gusau | 0.2669741 | 0.12 | No sig. dif. | 29.9 |

Figure 6: Standardized residuals from the SARIMA(p,d,q)(P,D,Q) model over Potiskum in (a) 2016 and (b) 2017. The set of diagrams shows the standardized residuals from the various SARIMA models, the Q-Q plots of the residuals, the ACF of the residuals and the p-values at each lag.
Figure 7: As in Figure 6 but over Maiduguri.
Figure 8: As in Figure 6 but over Sokoto.
Figure 9: As in Figure 6 but over Katsina.
Figure 10: As in Figure 6 but over Gusau.

High p-values in the autocorrelation diagnostics usually indicate that most of the autocorrelation in the data used to produce the forecast has been extracted, and the approximately straight-line pattern in the Q-Q plot indicates that the residuals are close to normally distributed. Overall, there was no significant difference at the 1% level between the actual and forecast rainfall.
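The comparison behind the "no significant difference" statement can be sketched as follows. This is a minimal illustration and not the authors' script: it assumes an equal-variance two-sample t-test (the article does not state which variant was used), takes the 2017 Potiskum observed and forecast values from the monthly table above, and uses the standard two-sided critical values for 22 degrees of freedom. The function name `two_sample_t` is hypothetical.

```python
import math
from statistics import mean, variance

# Potiskum, 2017: monthly rainfall (Jan-Dec) copied from the observed/forecast table
observed = [0.0, 0.0, 0.0, 0.0, 57.8, 99.7, 129.3, 161.7, 48.2, 0.0, 0.0, 0.0]
forecast = [0.0, 0.0, 0.0, 4.0, 27.1, 73.6, 184.2, 251.8, 119.8, 18.6, 0.0, 0.0]


def two_sample_t(a, b):
    """Return (t statistic, standard error, degrees of freedom).

    Assumption: equal-variance (pooled) two-sample t-test.
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(a) - mean(b)) / se, se, na + nb - 2


# Two-sided critical values of Student's t for 22 degrees of freedom
T_CRIT_DF22 = {0.05: 2.074, 0.01: 2.819}

t_stat, se, df = two_sample_t(observed, forecast)
for alpha, t_crit in T_CRIT_DF22.items():
    verdict = "significant" if abs(t_stat) > t_crit else "no significant difference"
    print(f"alpha={alpha}: t={t_stat:.3f}, se={se:.1f}, df={df} -> {verdict}")
```

For these inputs the statistic is roughly -0.5 with a standard error of about 30, which is of the same sign and similar size as the Potiskum entry for 2017 in Table 3 and well inside both critical values.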

Similarly, over Katsina in 2016 and 2017, the model ARIMA(0,0,7)(3,1,1)12 produced forecast rainfall (Figure 4b, Table 2a and b) that approximately matched the actual rainfall. The residuals were randomly distributed and resembled white noise (Figure 9a and b), and the high p-values indicate that most of the autocorrelation in the data used to produce the forecast has been extracted. This is also consistent with the near-white-noise residuals seen in the Q-Q plot.
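The p-value panels referred to here are commonly produced with the Ljung-Box test; that is an assumption, since the article does not name the test. The sketch below computes the Ljung-Box statistic for a residual series and converts it to a p-value with a closed-form chi-square tail that holds only for an even number of lags. It applies no degrees-of-freedom correction for the fitted parameters, and the two input series are synthetic stand-ins, not the article's residuals. All function names are hypothetical.

```python
import math
import random


def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t - k] - m) for t in range(k, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} r_k^2 / (n-k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorr(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


def chi2_sf_even(q, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = q / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(120)]                   # white-noise-like residuals
seasonal = [math.sin(2.0 * math.pi * t / 12.0) for t in range(120)]  # unmodelled 12-month cycle

LAGS = 10
q_noise, q_seasonal = ljung_box_q(noise, LAGS), ljung_box_q(seasonal, LAGS)
p_noise, p_seasonal = chi2_sf_even(q_noise, LAGS), chi2_sf_even(q_seasonal, LAGS)
print(f"noise:    Q={q_noise:.2f}, p={p_noise:.3f}")
print(f"seasonal: Q={q_seasonal:.2f}, p={p_seasonal:.3g}")
```

A high p-value means the residual autocorrelations are jointly indistinguishable from zero, which is the pattern described for the fitted models; a series with a leftover seasonal cycle gives a p-value near zero.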

Over Gusau in 2016 and 2017, the forecast (Figure 5a and b) matched the actual rainfall, with no significant difference at the 1% level (Table 3a and b). Most of the autocorrelation was extracted by the best-fit SARIMA model (0,0,7)(3,1,1)12. The results are consistent with the patterns in the ACF, Q-Q and p-value plots (Figure 10a and b), which show that the standardized residuals have been reduced to white noise.
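The notation (0,0,7)(3,1,1)12 can be unpacked as below. This is a sketch under an assumed sign convention, with moving-average terms written with plus signs as in common statistical software. The symbols B (backshift operator), y_t (monthly rainfall), the coefficients and the white-noise term are introduced here for illustration and are not taken from the article.

```latex
% General multiplicative SARIMA(p,d,q)(P,D,Q)_s model, with B y_t = y_{t-1}
\[
\Phi_P(B^{s})\,\phi_p(B)\,(1-B)^{d}\,(1-B^{s})^{D}\, y_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t
\]

% Special case p = 0, d = 0, q = 7, P = 3, D = 1, Q = 1, s = 12
\[
\left(1 - \Phi_1 B^{12} - \Phi_2 B^{24} - \Phi_3 B^{36}\right)
\left(1 - B^{12}\right) y_t
  = \left(1 + \theta_1 B + \dots + \theta_7 B^{7}\right)
    \left(1 + \Theta_1 B^{12}\right)\varepsilon_t
\]
```

Read this way, the model applies one seasonal difference at lag 12, three seasonal autoregressive terms, seven non-seasonal moving-average terms and one seasonal moving-average term, with no non-seasonal differencing.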

Finally, taking into account the standard errors of the predictions (Table 3a-b), we recommend that the SARIMA models outlined in Table 1, which were used to produce the forecast for 2017, be employed to generate monthly rainfall forecasts for 2018 for the aforementioned locations within the Sahel.

References
  • Afiesimama EA, Ukeje JE, Olaniyan EA. Observed changing climate and extreme events in Nigeria: Challenges, opportunities and policy implications. Nigerian Meteorological Society. 2013;136-140.
  • Meehl, G. A., T. F. Stocker, W.D. Collins, P. Friedlingstein, A.T. Gaye, J.M. Gregory, A. Kitoh, R. Knutti, J.M. Murphy, A. Noda, S.C.B. Raper, I.G. Watterson, A.J. Weaver, and Z.-C. Zhao, 2007: Global Climate Projections. In: Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change [Solomon, S., D. Qin, M. Manning, Z. Chen, M. Marquis, K.B. Averyt, M. Tignor and H.L. Miller (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 747–784.
  • Adefolalu DO. Climate change, Impacts and Adaptation: Role of Seasonal Climate Prediction. A special Book on Climate Change. Nigerian Meteorological Society. 2010;53-87.
  • Christidis, N., P. A. Stott, and S. Brown. The role of human activity in the recent warming of extremely warm daytime temperatures. J. Climate, 2011;24:1922–1930. 
  • Amekudzi LK, Preko K, Asare EO, Aryee J, Baidu M, Codjoe SNA. Variabilities in rainfall onset, cessation and length of the rainy season for various agro-ecological zones in Ghana. Climate. 2015;3:416-434.
  • Stott, P. A., Gillett, N. P., Hegerl, G. C., Karoly, D. J., Stone, D. A., Zhang, X. and Zwiers, F. Detection and attribution of climate change: a regional perspective. Wiley Interdisciplinary Reviews: Climate Change. 2010;1:192–211. doi: 10.1002/wcc.34
  • Stott, Peter. Climate change: how to play our hand? There have always been extremes of weather around the world but evidence suggests human influence is changing the odds. The Guardian. August 9, 2010.
  • Gutowski, W.J., G.C. Hegerl, G.J. Holland, T.R. Knutson, L.O. Mearns, R.J. Stouffer, P.J. Webster, M.F. Wehner, F.W. Zwiers, 2008: Causes of Observed Changes in Extremes and Projections of Future Changes in Weather and Climate Extremes in a Changing Climate. Regions of Focus: North America, Hawaii, Caribbean, and U.S. Pacific Islands. T.R. Karl, G.A. Meehl, C.D. Miller, S.J. Hassol, A.M. Waple, and W.L. Murray (eds.). A Report by the U.S. Climate Change Science Program and the Subcommittee on Global Change Research, Washington, DC.
  • Solomon, S., D. Qin, M. Manning, R.B. Alley, T. Berntsen, N.L. Bindoff, Z. Chen, A. Chidthaisong, J.M. Gregory, G.C. Hegerl, M. Heimann, B. Hewitson, B.J. Hoskins, F. Joos, J. Jouzel, V. Kattsov, U. Lohmann, T. Matsuno, M. Molina, N. Nicholls, J. Overpeck, G. Raga, V. Ramaswamy, J. Ren, M. Rusticucci, R. Somerville, T.F. Stocker, P. Whetton, R.A. Wood and D. Wratt, 2007: Technical Summary. In: Climate Change 2007: The Physical Science Basis. Contribution of Working Group I to the Fourth Assessment Report of the Intergovernmental Panel on Climate Change [Solomon, S., D. Qin, M. Manning, Z. Chen, M. Marquis, K.B. Averyt, M. Tignor and H.L. Miller (eds.)]. Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA.
  • Wikipedia: Dickey-Fuller test (last edited on 11 April 2018).