PAR(p) Autoregressive Models

What Is a PAR(p) Model?

A Periodic Autoregressive model of order p (PAR(p)) is a time series model designed for data with strong seasonal patterns. It extends the classical autoregressive (AR) model by allowing every parameter to vary by season — the coefficients that govern January inflows are different from those that govern July inflows.

The “order p” indicates how many past time steps the model looks back. A PAR(3) model for a given month predicts the current inflow using the inflows from the previous three months. The order can differ by season: January might need only one lag while April might need four, reflecting different hydrological dynamics across the year.

The PAR(p) Equation

For hydro plant $h$ at stage $t$ falling in season $m(t)$, the PAR(p) model is:

$a_{h, t} = μ_{m(t)} + \sum_{ℓ = 1}^{p} ψ_{m(t), ℓ} (a_{h, t - ℓ} - μ_{m(t - ℓ)}) + σ_{m(t)} \cdot ε_{t}$

In words: the inflow at stage $t$ equals the seasonal mean, plus a weighted combination of how much recent inflows deviated from their seasonal means, plus a random shock.
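As a minimal sketch of how this equation is evaluated, the snippet below computes one inflow for a single plant. The function name `par_step`, the rule that stage `t` falls in season `t mod n_seasons`, and all numbers are assumptions made for illustration; they are not taken from Cobre.

```python
def par_step(mu, psi, sigma, history, t, eps):
    """Evaluate the PAR(p) equation for one plant at stage t.

    mu[m], sigma[m]: seasonal mean and residual std for season m.
    psi[m]: AR coefficients for season m, lag 1 first (length may vary by season).
    history: dict mapping stage -> inflow; must contain stages t-1 ... t-p.
    eps: standardized innovation (a draw from N(0, 1)).
    """
    n_seasons = len(mu)
    m = t % n_seasons  # assumption: stage t falls in season t mod n_seasons
    value = mu[m]
    for lag, coef in enumerate(psi[m], start=1):
        m_lag = (t - lag) % n_seasons  # season of the lagged stage
        value += coef * (history[t - lag] - mu[m_lag])
    return value + sigma[m] * eps


# Hypothetical two-season system: season 0 uses one lag, season 1 uses two.
mu = [100.0, 40.0]
psi = [[0.5], [0.3, 0.1]]
sigma = [10.0, 5.0]
history = {0: 120.0, 1: 35.0}

a2_no_shock = par_step(mu, psi, sigma, history, t=2, eps=0.0)
a2_shock = par_step(mu, psi, sigma, history, t=2, eps=1.0)
```

With a zero shock, the result is the seasonal mean of 100 shifted by half of the previous stage's deviation (35 against a mean of 40), which gives 97.5. A unit shock adds one residual standard deviation on top.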

Parameters by Season

Each season $m$ has its own set of parameters:

Parameter	Symbol	Role
Seasonal mean	$μ_{m}$	Expected inflow for season $m$
AR coefficients	$ψ_{m, 1}, \dots, ψ_{m, p}$	Weights on past deviations from the mean
Residual std	$σ_{m}$	Scale of the random innovation
Innovation	$ε_{t} \sim N (0, 1)$	Standardized random shock

The seasonal mean $μ_{m}$ and sample standard deviation $s_{m}$ are estimated from historical data. The AR coefficients $ψ_{m, ℓ}$ are fitted using the Yule-Walker equations (see below). The residual standard deviation $σ_{m}$ is derived at runtime from the other parameters (it is not stored independently).

How Lags Become State Variables

In the SDDP framework, decisions at each stage depend on a set of state variables that summarize everything the optimizer needs to know from the past. For the PAR(p) model, the state variables are the lagged inflows:

$\text{State at stage } t: (v_{h, t}, a_{h, t - 1}, a_{h, t - 2}, \dots, a_{h, t - p_{\max}})$

where $v_{h, t}$ is the reservoir volume and $a_{h, t - ℓ}$ are the lagged inflows needed by the autoregressive equation. Each lag adds one state variable per hydro plant to the SDDP subproblem.

This is significant for problem size: a system with 150 hydro plants and a maximum PAR order of 6 adds up to $150 \times 6 = 900$ state variables beyond the reservoir volumes. The LP formulation includes constraints that “shift” lagged inflows forward from one stage to the next, ensuring the autoregressive structure is respected across the Bellman recursion.
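The lag shift can be pictured with a small sketch. In the LP the shift is expressed through constraints; the hypothetical helper `shift_lags` below only shows the mapping those constraints encode, together with the state count from the example above.

```python
def shift_lags(lags, new_inflow):
    """Advance the lag state by one stage.

    lags holds [a_{t-1}, a_{t-2}, ..., a_{t-p}]. The inflow realized at
    stage t becomes lag 1 for stage t+1, and the oldest lag drops out.
    """
    return [new_inflow] + lags[:-1]


n_plants, p_max = 150, 6
extra_state_vars = n_plants * p_max  # lag states on top of reservoir volumes

lags = [80.0, 95.0, 110.0]          # a_{t-1}, a_{t-2}, a_{t-3}
next_lags = shift_lags(lags, 70.0)  # lag state seen by stage t+1
```

Here `next_lags` is `[70.0, 80.0, 95.0]`: every stored inflow moves back one position and the value 110.0 is no longer needed.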

Stored vs. Computed Quantities

Cobre stores the natural outputs of the fitting process:

Stored: seasonal means ( $μ_{m}$ ), seasonal sample standard deviations ( $s_{m}$ ), AR order ( $p_{m}$ ), and AR coefficients in original units ( $ψ_{m, ℓ}$ )
Computed at runtime: the residual standard deviation $σ_{m}$ , derived from the stored quantities to guarantee consistency

This design avoids redundancy — $σ_{m}$ is fully determined by the other parameters and recomputing it is inexpensive.

Yule-Walker Fitting Procedure

When fitting PAR(p) parameters from historical inflow data, the AR coefficients are estimated by solving the Yule-Walker equations — a linear system that relates the autocorrelations of the data to the model coefficients. The procedure has five steps.

Step 1 — Seasonal Statistics

For each season $m$, compute the sample mean and standard deviation from historical observations $\{a_{h, t} : m(t) = m\}$:

$\hat{μ}_{m} = \frac{1}{N_{m}} \sum_{t : m(t) = m} a_{h, t}$

$\hat{s}_{m} = \sqrt{\frac{1}{N_{m} - 1} \sum_{t : m(t) = m} (a_{h, t} - \hat{μ}_{m})^{2}}$

where $N_{m}$ is the number of historical observations for season $m$ .
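A minimal sketch of Step 1 for one plant, assuming the historical series is a flat list in which `series[t]` falls in season `t mod n_seasons`. The function name and the data are illustrative only.

```python
import math


def seasonal_stats(series, n_seasons):
    """Return per-season sample means and standard deviations (N_m - 1 divisor)."""
    means, stds = [], []
    for m in range(n_seasons):
        obs = series[m::n_seasons]  # every observation that falls in season m
        n = len(obs)
        mean = sum(obs) / n
        var = sum((x - mean) ** 2 for x in obs) / (n - 1)
        means.append(mean)
        stds.append(math.sqrt(var))
    return means, stds


# Three hypothetical years of a two-season record.
series = [10.0, 4.0, 14.0, 6.0, 12.0, 8.0]
means, stds = seasonal_stats(series, n_seasons=2)
```

Season 0 collects the observations 10, 14, 12 (mean 12) and season 1 collects 4, 6, 8 (mean 6). Both seasons have a sample standard deviation of 2.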

Step 2 — Seasonal Autocorrelations

Compute the cross-seasonal autocorrelation at lag $ℓ$ for season $m$ . The cross-seasonal structure arises because lag $ℓ$ at season $m$ reaches back to season $m - ℓ$ (cyclically):

$\hat{γ}_{m}(ℓ) = \frac{1}{N_{m} - 1} \sum_{t : m(t) = m} (a_{h, t} - \hat{μ}_{m}) (a_{h, t - ℓ} - \hat{μ}_{m - ℓ})$

$\hat{ρ}_{m}(ℓ) = \frac{\hat{γ}_{m}(ℓ)}{\hat{s}_{m} \cdot \hat{s}_{m - ℓ}}$

Note that $\hat{s}_{m - ℓ}$ is the standard deviation of season $m - ℓ$, not of season $m$. This is the defining feature of a periodic (as opposed to stationary) autoregressive model.
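The cross-seasonal estimate can be sketched as follows, again assuming `series[t]` falls in season `t mod n_seasons`. The function name is hypothetical, and skipping pairs whose lagged stage precedes the record (while keeping the $N_{m} - 1$ divisor) is one possible convention that the text does not specify.

```python
import math


def seasonal_autocorr(series, n_seasons, m, lag):
    """Estimate rho_m(lag): correlation of season m with the season lag steps earlier."""

    def stats(season):
        obs = series[season::n_seasons]
        mean = sum(obs) / len(obs)
        var = sum((x - mean) ** 2 for x in obs) / (len(obs) - 1)
        return mean, math.sqrt(var), len(obs)

    m_lag = (m - lag) % n_seasons  # cyclic season index of the lagged stage
    mu_m, s_m, n_m = stats(m)
    mu_lag, s_lag, _ = stats(m_lag)
    # Pairs (a_t, a_{t-lag}) with t in season m; stages before the record are skipped.
    pairs = [(series[t], series[t - lag])
             for t in range(m, len(series), n_seasons) if t - lag >= 0]
    gamma = sum((a - mu_m) * (b - mu_lag) for a, b in pairs) / (n_m - 1)
    return gamma / (s_m * s_lag)  # note s_lag belongs to season m - lag


series = [10.0, 4.0, 14.0, 6.0, 12.0, 8.0]
rho = seasonal_autocorr(series, n_seasons=2, m=1, lag=1)
```

For this record the lag-1 covariance between season 1 and season 0 is 2 and both standard deviations are 2, so `rho` is 0.5.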

Step 3 — Yule-Walker System

For each season $m$ , the coefficients in standardized form $ψ_{m, 1}^{*}, \dots, ψ_{m, p}^{*}$ satisfy:

$R_{m} ψ_{m}^{*} = r_{m}$

where:

$R_{m} = \begin{bmatrix} 1 & \hat{ρ}_{m - 1}(1) & \cdots & \hat{ρ}_{m - p + 1}(p - 1) \\ \hat{ρ}_{m}(1) & 1 & \cdots & \hat{ρ}_{m - p + 2}(p - 2) \\ \vdots & \vdots & \ddots & \vdots \\ \hat{ρ}_{m}(p - 1) & \hat{ρ}_{m - 1}(p - 2) & \cdots & 1 \end{bmatrix}, \quad r_{m} = \begin{bmatrix} \hat{ρ}_{m}(1) \\ \hat{ρ}_{m}(2) \\ \vdots \\ \hat{ρ}_{m}(p) \end{bmatrix}$

The solution is:

$\hat{ψ}_{m}^{*} = R_{m}^{- 1} r_{m}$

The matrix $R_{m}$ is not a standard Toeplitz matrix (because consecutive rows use different seasons’ correlations), but it has a similar structure. The correlation matrix must be positive definite for the solution to exist; if not, the historical record may be too short for the requested order.
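Once $R_{m}$ and $r_{m}$ have been assembled, the solve itself is a small dense linear system. The sketch below uses plain Gaussian elimination with partial pivoting rather than any particular library routine; the function name, the tolerance, and the order-2 correlations are assumptions chosen for illustration.

```python
def solve_yule_walker(R, r, tol=1e-12):
    """Solve R psi = r by Gaussian elimination with partial pivoting."""
    p = len(r)
    # Augmented matrix [R | r], copied so the inputs are left untouched.
    A = [list(R[i]) + [r[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda i: abs(A[i][col]))
        if abs(A[pivot][col]) < tol:
            raise ValueError("R is singular or nearly so; try a lower order")
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, p):
            factor = A[row][col] / A[col][col]
            for k in range(col, p + 1):
                A[row][k] -= factor * A[col][k]
    # Back substitution on the upper-triangular system.
    psi = [0.0] * p
    for i in reversed(range(p)):
        known = sum(A[i][j] * psi[j] for j in range(i + 1, p))
        psi[i] = (A[i][p] - known) / A[i][i]
    return psi


# Hypothetical order-2 correlations for one season.
R = [[1.0, 0.5], [0.5, 1.0]]
r = [0.6, 0.4]
psi_star = solve_yule_walker(R, r)
```

For these inputs the standardized coefficients are 8/15 and 2/15. A vanishing pivot is reported as an error, which is a rough stand-in for the positive definiteness requirement described above.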

Step 4 — Residual Standard Deviation

After solving the Yule-Walker system, the residual standard deviation for season $m$ is:

$\hat{σ}_{m} = \hat{s}_{m} \sqrt{1 - ψ_{m}^{* ⊤} r_{m}}$

This equals $\hat{s}_{m}$ times the square root of the unexplained variance fraction. If $\hat{σ}_{m}^{2} \leq 0$, the model overfits: it explains all historical variance, leaving no room for the noise term.
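A short sketch of this step, including the overfitting check. The function name and the input values are hypothetical.

```python
import math


def residual_std(s_m, psi_star, r):
    """Residual std for one season from standardized coefficients and correlations."""
    explained = sum(c * rho for c, rho in zip(psi_star, r))  # psi*^T r
    unexplained = 1.0 - explained
    if unexplained <= 0.0:
        # No variance is left for the noise term: the fit is unusable as is.
        raise ValueError("non-positive residual variance; reduce the AR order")
    return s_m * math.sqrt(unexplained)


sigma = residual_std(20.0, [0.5, 0.2], [0.6, 0.4])
```

In this example the explained fraction is 0.5 · 0.6 + 0.2 · 0.4 = 0.38, so the residual variance is 400 · 0.62 = 248 and `sigma` is about 15.75.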

Step 5 — Convert to Original Units

The Yule-Walker solution yields coefficients in standardized form $ψ_{m, ℓ}^{*}$ (dimensionless, relating standardized deviations). The LP requires original-unit coefficients:

$ψ_{m, ℓ} = ψ_{m, ℓ}^{*} \cdot \frac{s _{m}}{s _{m - ℓ}}$

These are computed once at initialization and used directly as LP constraint matrix entries.
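The conversion is a per-lag rescaling by a ratio of seasonal standard deviations. The sketch below assumes a list `s` indexed by season with cyclic wrap-around; the function name and numbers are illustrative.

```python
def to_original_units(psi_star, s, m):
    """Rescale standardized AR coefficients for season m to original units.

    psi_star: standardized coefficients, lag 1 first.
    s: seasonal sample standard deviations, indexed by season.
    """
    n_seasons = len(s)
    return [c * s[m] / s[(m - lag) % n_seasons]  # s_m / s_{m - lag}
            for lag, c in enumerate(psi_star, start=1)]


s = [20.0, 10.0, 5.0]  # hypothetical three-season system
psi = to_original_units([0.5, 0.2], s, m=0)
```

For season 0, lag 1 reaches back to season 2 and lag 2 to season 1, so the coefficients become 0.5 · 20 / 5 = 2.0 and 0.2 · 20 / 10 = 0.4. An original-unit coefficient can exceed 1 in magnitude when the current season is much more variable than the lagged one.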

Key Properties

Periodicity: All parameters vary by season, matching the strong seasonality of hydrological data.
Parsimony: The model order $p$ is selected per season using AIC, BIC, or coefficient significance tests, avoiding unnecessary lags.
Stationarity: Fitted models are validated to ensure the AR process does not diverge — the characteristic polynomial roots must lie outside the unit circle.
Positive residual variance: After fitting, $σ_{m}^{2} > 0$ must hold for all seasons. A zero or negative residual variance indicates overfitting.
