# cobre-sddp

**Status: experimental**
cobre-sddp implements the Stochastic Dual Dynamic Programming (SDDP) algorithm
(Pereira & Pinto, 1991) for long-term hydrothermal dispatch and energy planning.
It is the first algorithm vertical in the Cobre ecosystem: a training loop that
iteratively improves a piecewise-linear approximation of the value function for
multi-stage stochastic linear programs.
For the mathematical foundations — including the Benders decomposition, cut coefficient derivation, and risk measure theory — see the methodology reference.
This crate depends on `cobre-core` for system data types, `cobre-stochastic` for
inflow scenario generation, `cobre-solver` for LP subproblem solving, and
`cobre-comm` for distributed communication.
## Iteration lifecycle
Each training iteration follows a fixed eight-step sequence. The ordering reflects the correction introduced in the lower bound plan fix (F-019): the lower bound is evaluated after the backward pass and cut synchronization, not during forward synchronization.
1. **Forward pass**: Each rank simulates `config.forward_passes` scenarios
   through all stages, solving the LP at each (scenario, stage) pair with the
   current FCF approximation.
2. **Forward sync**: allreduce (sum + broadcast) aggregates local UB
   statistics into a global mean, standard deviation, and 95% CI half-width.
3. **State exchange**: allgatherv gathers all ranks' trial point state
   vectors so every rank can solve the backward pass at *all* trial points.
4. **Backward pass**: Sweeps stages T-2 down to 0, solving the successor LP
   under every opening from the fixed tree, extracting LP duals to form
   Benders cut coefficients, and inserting one cut per trial point per stage
   into the Future Cost Function (FCF).
5. **Cut sync**: allgatherv shares each rank's newly generated cuts so that
   all ranks maintain an identical FCF at the end of each iteration.
   - **5a. Cut selection (optional)**: When a `CutSelectionStrategy` is
     configured, inactive cuts are pruned from the pool at multiples of
     `check_frequency`.
   - **5b. LB evaluation**: Rank 0 solves the stage-0 LP for every opening in
     the tree and aggregates the objectives via the stage-0 risk measure. The
     scalar lower bound is broadcast to all ranks.
6. **Convergence check**: The `ConvergenceMonitor` updates bound statistics
   and evaluates the configured stopping rules to determine whether to stop.
7. **Checkpoint (deferred)**: Periodic FCF checkpointing is planned for
   Phase 7. The MVP does not write intermediate checkpoints.
8. **Event emission**: `TrainingEvent` values are sent to the optional event
   channel for real-time monitoring by the CLI or TUI layer.
The convergence gap is computed as:

```text
gap = (UB - LB) / max(1.0, |UB|)
```

The `max(1.0, |UB|)` guard prevents division by zero when the upper bound is
near zero.
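The guard can be sketched as a small free function (an illustrative helper, not the crate's internal code):

```rust
// Relative convergence gap with the max(1.0, |UB|) guard described above.
fn convergence_gap(ub: f64, lb: f64) -> f64 {
    (ub - lb) / f64::max(1.0, ub.abs())
}

fn main() {
    // UB far from zero: an ordinary relative gap.
    assert!((convergence_gap(110.0, 100.0) - 10.0 / 110.0).abs() < 1e-12);
    // UB near zero: the guard keeps the denominator at 1.0,
    // so the gap degrades gracefully to an absolute difference.
    assert!((convergence_gap(0.5, 0.2) - 0.3).abs() < 1e-12);
}
```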
## Module overview
| Module | Responsibility |
|---|---|
| `training` | `train`: the top-level loop orchestrator; wires all steps together |
| `forward` | `run_forward_pass`, `sync_forward`: step 1 and step 2 |
| `state_exchange` | `ExchangeBuffers`: step 3 allgatherv of trial point state vectors |
| `backward` | `run_backward_pass`: step 4 Benders cut generation across all trial points |
| `cut_sync` | `CutSyncBuffers`: step 5 allgatherv of new cut wire records |
| `cut_selection` | `CutSelectionStrategy`, `CutMetadata`, `DeactivationSet`: step 5a pool pruning |
| `lower_bound` | `evaluate_lower_bound`: step 5b risk-adjusted LB computation |
| `convergence` | `ConvergenceMonitor`: step 6 bound tracking and stopping rule evaluation |
| `cut` | `CutPool`, `FutureCostFunction`, `CutWireHeader`: cut data structures and wire format |
| `config` | `TrainingConfig`: algorithm parameters |
| `stopping_rule` | `StoppingRule`, `StoppingRuleSet`, `MonitorState`: termination criteria |
| `risk_measure` | `RiskMeasure`, `BackwardOutcome`: risk-neutral and CVaR aggregation |
| `horizon_mode` | `HorizonMode`: finite vs. cyclic stage traversal (only `Finite` in MVP) |
| `indexer` | `StageIndexer`: LP column and row offset arithmetic for stage subproblems |
| `lp_builder` | `PatchBuffer`, `ar_dynamics_row_offset`: row-bound patch arrays for LP solves |
| `trajectory` | `TrajectoryRecord`: forward pass LP solution record (primal, dual, state, cost) |
| `error` | `SddpError`: unified error type aggregating solver, comm, stochastic, and I/O errors |
## Configuration

### TrainingConfig

`TrainingConfig` controls the training loop parameters. All fields are public
and must be set explicitly — there is no `Default` implementation, preventing
silent misconfigurations.
| Field | Type | Description |
|---|---|---|
| `forward_passes` | `u32` | Scenarios per rank per iteration (must be >= 1) |
| `max_iterations` | `u64` | Safety bound on total iterations; also sizes the cut pool |
| `checkpoint_interval` | `Option<u64>` | Write checkpoint every N iterations; `None` = disabled |
| `warm_start_cuts` | `u32` | Pre-loaded cuts from a policy file |
| `event_sender` | `Option<Sender<TrainingEvent>>` | Channel for real-time monitoring events; `None` = silent |
```rust
use cobre_sddp::TrainingConfig;

let config = TrainingConfig {
    forward_passes: 10,
    max_iterations: 500,
    checkpoint_interval: Some(50),
    warm_start_cuts: 0,
    event_sender: None,
};
```
### StoppingRuleSet

The stopping rule set composes one or more termination criteria. Every set
must include an `IterationLimit` rule as a safety bound against infinite loops.
| Rule variant | Trigger condition |
|---|---|
| `IterationLimit` | `iteration >= limit` |
| `TimeLimit` | `wall_time_seconds >= seconds` |
| `BoundStalling` | Relative LB improvement over a sliding window falls below tolerance |
| `SimulationBased` | Periodic Monte Carlo simulation costs stabilize |
| `GracefulShutdown` | External SIGTERM / SIGINT received (always evaluated first) |
The `mode` field controls how multiple rules combine:

- `StoppingMode::Any` (OR): stop when any rule triggers.
- `StoppingMode::All` (AND): stop when all rules trigger simultaneously.
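The combination logic amounts to an any/all fold over the per-rule trigger results. A minimal sketch, with `StoppingMode` and the boolean trigger slice standing in for the crate's richer types:

```rust
// Simplified stand-in for the crate's StoppingMode; the real rules carry
// state, but the Any/All combination reduces to a fold over booleans.
#[derive(Clone, Copy)]
enum StoppingMode {
    Any,
    All,
}

fn should_stop(mode: StoppingMode, triggered: &[bool]) -> bool {
    match mode {
        StoppingMode::Any => triggered.iter().any(|&t| t),
        StoppingMode::All => triggered.iter().all(|&t| t),
    }
}

fn main() {
    // Only the iteration limit triggered: Any stops, All keeps training.
    let triggered = [true, false, false];
    assert!(should_stop(StoppingMode::Any, &triggered));
    assert!(!should_stop(StoppingMode::All, &triggered));
}
```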
```rust
use cobre_sddp::stopping_rule::{StoppingMode, StoppingRule, StoppingRuleSet};

let stopping_rules = StoppingRuleSet {
    rules: vec![
        StoppingRule::IterationLimit { limit: 500 },
        StoppingRule::BoundStalling {
            tolerance: 0.001,
            iterations: 20,
        },
        StoppingRule::GracefulShutdown,
    ],
    mode: StoppingMode::Any,
};
```
### RiskMeasure

`RiskMeasure` controls how per-opening backward pass outcomes are aggregated
into Benders cuts and how the lower bound is computed.
| Variant | Description |
|---|---|
| `Expectation` | Risk-neutral expected value. Weights equal opening probabilities. |
| `CVaR` | Convex combination `(1 - λ)·E[Z] + λ·CVaR_α[Z]`. `alpha` ∈ (0, 1], `lambda` ∈ [0, 1]. |
`alpha = 1` with `CVaR` is equivalent to `Expectation`. `lambda = 0` with
`CVaR` is also equivalent to `Expectation`. One `RiskMeasure` value is
assigned per stage from the `stages.json` configuration field `risk_measure`.
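For a discrete distribution of per-opening costs, the convex combination can be worked out directly. The sketch below is illustrative only (it is not the crate's implementation; function names are hypothetical), treating `CVaR_α` as the probability-weighted average of the worst `alpha`-probability tail, where larger cost is worse:

```rust
fn expectation(costs: &[f64], probs: &[f64]) -> f64 {
    costs.iter().zip(probs).map(|(z, p)| z * p).sum()
}

/// Average of the worst (highest-cost) alpha-probability tail,
/// splitting the boundary atom fractionally if needed.
fn cvar(costs: &[f64], probs: &[f64], alpha: f64) -> f64 {
    let mut order: Vec<usize> = (0..costs.len()).collect();
    order.sort_by(|&a, &b| costs[b].partial_cmp(&costs[a]).unwrap());
    let (mut mass, mut acc) = (0.0, 0.0);
    for &i in &order {
        let take = probs[i].min(alpha - mass); // fractional last atom
        acc += costs[i] * take;
        mass += take;
        if mass >= alpha {
            break;
        }
    }
    acc / alpha
}

/// (1 - lambda) * E[Z] + lambda * CVaR_alpha[Z], as in the table above.
fn risk_adjusted(costs: &[f64], probs: &[f64], alpha: f64, lambda: f64) -> f64 {
    (1.0 - lambda) * expectation(costs, probs) + lambda * cvar(costs, probs, alpha)
}

fn main() {
    let costs = [10.0, 20.0, 30.0, 40.0];
    let probs = [0.25; 4];
    // alpha = 1 recovers the expectation, as noted above.
    assert!((cvar(&costs, &probs, 1.0) - 25.0).abs() < 1e-12);
    // The worst 25% tail here is the single outcome 40.
    assert!((cvar(&costs, &probs, 0.25) - 40.0).abs() < 1e-12);
    // lambda = 0 also recovers the expectation.
    assert!((risk_adjusted(&costs, &probs, 0.25, 0.0) - 25.0).abs() < 1e-12);
}
```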
### CutSelectionStrategy
Cut selection is optional. When configured, it periodically prunes the cut pool to control memory growth during long training runs.
| Variant | Deactivation condition |
|---|---|
| `Level1` | `active_count <= threshold` (never active; least aggressive) |
| `Lml1` | `iteration - last_active_iter > memory_window` (outside time window) |
| `Dominated` | Dominated at all visited forward pass states (stub in MVP) |
All variants respect a `check_frequency` parameter: selection only runs at
iterations that are multiples of `check_frequency` and never at iteration 0.
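The gating rule is a one-liner; `selection_due` below is an illustrative helper, not the crate's API:

```rust
// Selection runs only at non-zero multiples of check_frequency.
fn selection_due(iteration: u64, check_frequency: u64) -> bool {
    iteration != 0 && iteration % check_frequency == 0
}

fn main() {
    assert!(!selection_due(0, 25));  // never at iteration 0
    assert!(selection_due(50, 25));  // multiple of check_frequency
    assert!(!selection_due(60, 25)); // otherwise skipped
}
```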
## Key data structures

### FutureCostFunction
The Future Cost Function (FCF) holds one `CutPool` per stage. Each `CutPool`
is a pre-allocated flat array of cut slots. Cuts are inserted deterministically
by (iteration, forward_pass_index) to guarantee bit-for-bit identical FCF
state across all MPI ranks.

The FCF is built once before training begins. Total slot capacity is
`warm_start_cuts + max_iterations * forward_passes * num_ranks` per stage.
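The capacity formula above guarantees that a deterministic slot can be reserved for every cut in advance. The sketch below shows one such layout (iteration-major, then rank, then forward pass); the exact ordering the crate uses is an assumption here, but any fixed ordering yields the same bit-for-bit determinism:

```rust
// Per-stage slot capacity, matching the formula in the text.
fn pool_capacity(warm_start_cuts: usize, max_iterations: usize,
                 forward_passes: usize, num_ranks: usize) -> usize {
    warm_start_cuts + max_iterations * forward_passes * num_ranks
}

// Illustrative (assumed) deterministic slot layout: warm-start cuts first,
// then one slot per (iteration, rank, forward pass) triple.
fn slot_index(warm_start_cuts: usize, forward_passes: usize, num_ranks: usize,
              iteration: usize, rank: usize, pass: usize) -> usize {
    warm_start_cuts + (iteration * num_ranks + rank) * forward_passes + pass
}

fn main() {
    // 500 iterations x 10 passes x 4 ranks, no warm start.
    let cap = pool_capacity(0, 500, 10, 4);
    assert_eq!(cap, 20_000);
    // The last (iteration, rank, pass) triple maps to the last slot.
    assert_eq!(slot_index(0, 10, 4, 499, 3, 9), cap - 1);
}
```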
### PatchBuffer

A `PatchBuffer` holds the three parallel arrays consumed by the LP solver's
`set_row_bounds` call. It is sized for `N * (2 + L)` patches, where N is the
number of hydro plants and L is the maximum PAR order:
- Category 1 `[0, N)` — storage-fixing: equality constraint at incoming storage.
- Category 2 `[N, N*(1+L))` — lag-fixing: equality constraint at AR lagged inflows.
- Category 3 `[N*(1+L), N*(2+L))` — noise-fixing: equality constraint at scenario noise.
The backward pass uses only categories 1 and 2 (`fill_state_patches`).
The forward pass uses all three (`fill_forward_patches`).
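The offset arithmetic behind the three categories can be sketched directly from the ranges above (`category_ranges` is an illustrative helper, not the crate's API):

```rust
// Half-open [start, end) row-patch ranges for the three categories,
// given n hydro plants and maximum PAR order l.
fn category_ranges(n: usize, l: usize) -> [(usize, usize); 3] {
    [
        (0, n),                     // category 1: storage-fixing
        (n, n * (1 + l)),           // category 2: lag-fixing
        (n * (1 + l), n * (2 + l)), // category 3: noise-fixing
    ]
}

fn main() {
    // Example: 50 hydro plants, PAR order up to 6.
    let [c1, c2, c3] = category_ranges(50, 6);
    assert_eq!(c1, (0, 50));
    assert_eq!(c2, (50, 350));
    assert_eq!(c3, (350, 400));
    // Total patch count is N * (2 + L), matching the PatchBuffer size.
    assert_eq!(c3.1, 50 * (2 + 6));
}
```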
### ExchangeBuffers and CutSyncBuffers
Both types pre-allocate all communication buffers once at construction time and reuse them across all stages and iterations. This keeps the per-stage exchange allocation-free on the hot path.
`ExchangeBuffers` handles the state vector allgatherv (step 3):

- Send buffer: `local_count * n_state` floats.
- Receive buffer: `local_count * num_ranks * n_state` floats (rank-major order).

`CutSyncBuffers` handles the cut wire allgatherv (step 5):

- Send buffer: `max_cuts_per_rank * cut_wire_size(n_state)` bytes.
- Receive buffer: `max_cuts_per_rank * num_ranks * cut_wire_size(n_state)` bytes.
## Convergence monitoring

`ConvergenceMonitor` tracks bound statistics and evaluates stopping rules. It
is constructed once before the loop begins and updated at the end of each
iteration via `update(lb, &sync_result)`.
```rust
use cobre_sddp::convergence::ConvergenceMonitor;
use cobre_sddp::forward::SyncResult;
use cobre_sddp::stopping_rule::{StoppingMode, StoppingRule, StoppingRuleSet};

let rule_set = StoppingRuleSet {
    rules: vec![StoppingRule::IterationLimit { limit: 100 }],
    mode: StoppingMode::Any,
};
let mut monitor = ConvergenceMonitor::new(rule_set);

let sync = SyncResult {
    global_ub_mean: 110.0,
    global_ub_std: 5.0,
    ci_95_half_width: 2.0,
    sync_time_ms: 10,
};
let (stop, _results) = monitor.update(100.0, &sync);
assert!(!stop);
assert_eq!(monitor.iteration_count(), 1);

// gap = (110 - 100) / max(1.0, 110.0) = 10/110
assert!((monitor.gap() - 10.0 / 110.0).abs() < 1e-10);
```
Accessor methods on `ConvergenceMonitor`:

| Method | Returns |
|---|---|
| `lower_bound()` | Latest LB value |
| `upper_bound()` | Latest UB mean |
| `upper_bound_std()` | Latest UB standard deviation |
| `ci_95_half_width()` | Latest 95% CI half-width |
| `gap()` | Convergence gap: `(UB - LB) / max(1.0, abs(UB))` |
| `iteration_count()` | Number of completed `update` calls |
| `set_shutdown()` | Signal a graceful shutdown before the next `update` |
## Event system

The training loop emits `TrainingEvent` values (from `cobre-core`) at each
lifecycle step boundary when `config.event_sender` is `Some`. Events carry
structured data for real-time display in the TUI or CLI layers.
Key events emitted during training:
| Event variant | When emitted |
|---|---|
| `ForwardPassComplete` | After step 1 completes for all local scenarios |
| `ForwardSyncComplete` | After step 2 global UB statistics are merged |
| `BackwardPassComplete` | After step 4 cut generation for all trial points |
| `CutSyncComplete` | After step 5 cut allgatherv |
| `CutSelectionComplete` | After step 5a pool pruning (when a strategy is set) |
| `LowerBoundEvaluated` | After the step 5b LB broadcast |
| `IterationSummary` | At the end of each iteration (LB, UB, gap, timing) |
| `TrainingFinished` | When a stopping rule triggers |
## Quick start (pseudocode)

The following shows the shape of a `train` call. All arguments must be built
from the upstream pipeline (`cobre-io` for system data, `cobre-stochastic` for
the opening tree, `cobre-solver` for the LP solver instance).
```rust
use cobre_sddp::{
    FutureCostFunction, HorizonMode, RiskMeasure, StageIndexer,
    TrainingConfig, TrainingResult,
    stopping_rule::{StoppingMode, StoppingRule, StoppingRuleSet},
    train,
};

// Build the FCF for num_stages stages, n_state state dimensions,
// forward_passes scenarios per rank, max_iterations iterations.
let mut fcf = FutureCostFunction::new(num_stages, n_state, forward_passes, max_iterations, 0);

let config = TrainingConfig {
    forward_passes: 10,
    max_iterations: 500,
    checkpoint_interval: None,
    warm_start_cuts: 0,
    event_sender: None,
};

let stopping_rules = StoppingRuleSet {
    rules: vec![
        StoppingRule::IterationLimit { limit: 500 },
        StoppingRule::GracefulShutdown,
    ],
    mode: StoppingMode::Any,
};

let horizon = HorizonMode::Finite { num_stages };

let result: TrainingResult = train(
    &mut solver,     // SolverInterface impl (e.g., HiGHS)
    config,
    &mut fcf,
    &templates,      // one StageTemplate per stage
    &base_rows,      // AR dynamics base row index per stage
    &indexer,        // StageIndexer from StageIndexer::new(n_hydro, max_par_order)
    &initial_state,  // known initial storage volumes
    &opening_tree,   // from cobre_stochastic::build_stochastic_context
    &stochastic,     // StochasticContext
    &horizon,
    &risk_measures,  // one RiskMeasure per stage
    stopping_rules,
    None,            // no cut selection in this example
    None,            // no external shutdown flag
    &comm,           // Communicator (LocalBackend or FerrompiBackend)
)?;

println!(
    "Converged in {} iterations: LB={:.2}, UB={:.2}, gap={:.4}",
    result.iterations, result.final_lb, result.final_ub, result.final_gap
);
```
## Error handling

All fallible operations return `Result<T, SddpError>`. The error type is
`Send + Sync + 'static` and can be propagated across thread boundaries or
wrapped by `anyhow`.
| SddpError variant | Trigger |
|---|---|
| `Infeasible` | LP has no feasible solution (stage, iteration, scenario) |
| `Solver` | LP solve failed for numerical or timeout reasons |
| `Communication` | MPI collective operation failed |
| `Stochastic` | Scenario generation or PAR model validation failed |
| `Io` | Case directory loading or validation failed |
| `Validation` | Algorithm configuration is semantically invalid |
## Performance notes

### Pre-allocation discipline

The training loop makes no heap allocations on the hot path inside the
iteration loop. All workspace buffers are allocated once before the loop:

- `TrajectoryRecord` flat vec: `forward_passes * num_stages` records.
- `PatchBuffer`: `N * (2 + L)` entries.
- `ExchangeBuffers`: `local_count * num_ranks * n_state` floats.
- `CutSyncBuffers`: `max_cuts_per_rank * num_ranks * cut_wire_size(n_state)` bytes.
### Cut wire format

The cut wire format used by `CutSyncBuffers` is a fixed-size record:
24 bytes of header (slot index, iteration, forward pass index, intercept)
followed by `n_state * 8` bytes of coefficients. The record size is
`cut_wire_size(n_state) = 24 + n_state * 8` bytes.
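The size formula is easy to sanity-check. The header field widths shown in the comment below are one plausible packing that reaches exactly 24 bytes; the crate's actual `CutWireHeader` layout may differ:

```rust
// Fixed-size wire record: 24-byte header + one f64 per state dimension.
fn cut_wire_size(n_state: usize) -> usize {
    24 + n_state * 8
}

fn main() {
    assert_eq!(cut_wire_size(0), 24);   // header only
    assert_eq!(cut_wire_size(50), 424); // 24 + 50 * 8
    // One assumed packing of the four header fields into 24 bytes:
    // slot index (u32) + iteration (u32) + forward pass index (u64)
    // + intercept (f64).
    assert_eq!(4 + 4 + 8 + 8, 24);
}
```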
### Communication-free parallelism

Forward pass noise is generated without inter-rank communication. Each rank
independently derives its noise seed from `(base_seed, iteration, scenario, stage_id)`
using SipHash-1-3 (DEC-017 from `cobre-stochastic`). The opening tree is
pre-generated once before training and shared read-only across all iterations.
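The seeding pattern can be sketched with the standard library. Note the caveat: the crate uses SipHash-1-3 per DEC-017, and while `std`'s `DefaultHasher` is currently a SipHash variant, its algorithm is formally unspecified, so this is purely an illustration of communication-free derivation, not the crate's exact hash:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Derive a per-(iteration, scenario, stage) seed from the tuple alone,
// with no inter-rank communication required.
fn derive_seed(base_seed: u64, iteration: u64, scenario: u64, stage_id: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (base_seed, iteration, scenario, stage_id).hash(&mut h);
    h.finish()
}

fn main() {
    // Deterministic: every rank derives the same seed from the same inputs...
    assert_eq!(derive_seed(42, 3, 7, 1), derive_seed(42, 3, 7, 1));
    // ...while distinct (iteration, scenario, stage) tuples get distinct streams.
    assert_ne!(derive_seed(42, 3, 7, 1), derive_seed(42, 3, 7, 2));
}
```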
## Testing

```shell
cargo test -p cobre-sddp --all-features
```

The crate requires no external system libraries beyond what is needed by the
workspace (HiGHS is always available; MPI is optional via the `mpi` feature
of `cobre-comm`).
### Test suite overview

The crate has tests across 15 source modules covering:

- Unit tests for each module's core logic.
- Integration tests using `LocalBackend` (single-rank) for the
  communication-involving modules (`forward`, `backward`, `cut_sync`,
  `state_exchange`, `lower_bound`, `training`).
- Doc-tests for all public types and functions with constructible examples.
## Feature flags

cobre-sddp has no optional feature flags of its own. Feature flag propagation
from `cobre-comm` (the `mpi` feature) controls whether MPI-based distributed
training is available at link time.

```toml
# Cargo.toml
cobre-sddp = { version = "0.0.1" }
```