
Cobre

Open infrastructure for power system computation. Built in Rust.

Cobre is an ecosystem of Rust crates for power system analysis and optimization. The first solver vertical implements Stochastic Dual Dynamic Programming (SDDP) for long-term hydrothermal dispatch – a problem central to energy planning in systems with large hydroelectric capacity.

Design goals

  • Production-grade HPC: hybrid MPI + thread parallelism, designed for cluster execution via mpiexec
  • Reproducible results: deterministic output regardless of rank count, thread count, or input ordering
  • Modular architecture: 11 crates with clean boundaries, each independently testable
  • Open solver stack: HiGHS LP solver, no proprietary dependencies for core functionality

Current status

Phases 1 through 6 are complete, verified by 1490 tests across the workspace. The ecosystem delivers a full SDDP training pipeline:

  • Entity model and topology validation (cobre-core)
  • JSON/Parquet case loading with 5-layer validation (cobre-io)
  • LP solver abstraction with HiGHS backend and warm-start basis management (cobre-solver)
  • Pluggable communication with MPI and local backends (cobre-comm)
  • PAR(p) inflow models with deterministic correlated scenario generation (cobre-stochastic)
  • SDDP training loop with forward/backward passes, Benders cut generation, cut synchronization, convergence monitoring, and composite stopping rules (cobre-sddp)

Implementation continues through the 8-phase build sequence, with Phase 7 (simulation + output) next.

GitHub: github.com/cobre-rs/cobre
API docs (rustdoc): cargo doc --workspace --no-deps --open
Methodology reference: cobre-rs.github.io/cobre-docs
License: Apache-2.0

Installation

This page gets you to a working cobre installation as quickly as possible. For alternative methods — including cargo install, building from source, and the full platform support table — see Installation (User Guide).


Fastest Path: Pre-built Binary

Download and install the pre-built binary for your platform with a single command.

Linux and macOS

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/cobre-rs/cobre/releases/latest/download/cobre-cli-installer.sh | sh

Windows (PowerShell)

powershell -ExecutionPolicy Bypass -c "irm https://github.com/cobre-rs/cobre/releases/latest/download/cobre-cli-installer.ps1 | iex"

The installer places cobre in $CARGO_HOME/bin (typically ~/.cargo/bin). Ensure that directory is in your PATH.


Verify the Installation

cobre version

Expected output:

cobre   v0.1.0
solver: HiGHS
comm:   local
zstd:   enabled
arch:   x86_64-linux
build:  release (lto=thin)

The exact version, arch, and build fields will vary by platform and release.


Next Steps

Quickstart

This page takes you from zero to a completed SDDP study in three commands using the built-in 1dtoy template. The template models a single-bus hydrothermal system with one hydro plant and two thermal units over a 4-stage finite planning horizon — small enough to run in seconds, complete enough to demonstrate every stage of the workflow.

If you have not installed Cobre yet, start with Installation.

Quick Start Demo


Step 1: Scaffold a Case Directory

cobre init --template 1dtoy my_first_study

Cobre writes 10 input files into a new my_first_study/ directory and prints a summary to stderr:

 ━━━━━━━━━━━●
 ━━━━━━━━━━━●⚡  COBRE v0.1.0
 ━━━━━━━━━━━●   Power systems in Rust

Created my_first_study case directory from template '1dtoy':

  ✔ config.json                    Algorithm configuration: training (forward passes, stopping rules) and simulation settings
  ✔ initial_conditions.json        Initial reservoir storage volumes for each hydro plant at the start of the planning horizon
  ✔ penalties.json                 Global penalty costs for constraint violations (deficit, excess, spillage, storage bounds, etc.)
  ✔ stages.json                    Planning horizon definition: policy graph type, discount rate, stage dates, time blocks, and scenario counts
  ✔ system/buses.json              Electrical bus definitions with deficit cost segments
  ✔ system/hydros.json             Hydro plant definitions: reservoir bounds, outflow limits, turbine model, and generation limits
  ✔ system/lines.json              Transmission line definitions (empty in this single-bus example)
  ✔ system/thermals.json           Thermal plant definitions with piecewise cost segments and generation bounds
  ✔ scenarios/inflow_seasonal_stats.parquet  Seasonal PAR(p) statistics for hydro inflow scenario generation (mean, std, lag correlations)
  ✔ scenarios/load_seasonal_stats.parquet    Seasonal PAR(p) statistics for electrical load scenario generation (mean, std, lag correlations)

Next steps:
  -> cobre validate my_first_study
  -> cobre run my_first_study --output my_first_study/results

The directory structure is:

my_first_study/
  config.json
  initial_conditions.json
  penalties.json
  stages.json
  system/
    buses.json
    hydros.json
    lines.json
    thermals.json
  scenarios/
    inflow_seasonal_stats.parquet
    load_seasonal_stats.parquet

Step 2: Validate the Case

cobre validate my_first_study

The validation pipeline checks all five layers — schema, references, physical feasibility, stochastic consistency, and solver feasibility — and prints entity counts on success:

Valid case: 1 buses, 1 hydros, 2 thermals, 0 lines
  buses: 1
  hydros: 1
  thermals: 2
  lines: 0

If any layer fails, Cobre prints each error prefixed with error: and exits with code 1. The 1dtoy template always passes validation.


Step 3: Run the Study

cobre run my_first_study --output my_first_study/results

Cobre runs the SDDP training loop (128 iterations, 1 forward pass each) followed by a simulation pass (100 scenarios). Output is written to my_first_study/results/. The banner, a progress bar, and a post-run summary are printed to stderr:

 ━━━━━━━━━━━●
 ━━━━━━━━━━━●⚡  COBRE v0.1.0
 ━━━━━━━━━━━●   Power systems in Rust

Training complete in 3.2s (128 iterations, converged at iter 94)
  Lower bound:  142.3 $/stage
  Upper bound:  143.1 +/- 1.2 $/stage
  Gap:          0.6%
  Cuts:         94 active / 94 generated
  LP solves:    512

Simulation complete (100 scenarios)
  Completed: 100  Failed: 0

Output written to my_first_study/results/

Exact numerical values (bounds, gap, cut counts, timing) will vary across runs because scenario sampling is stochastic. The gap and iteration count depend on the random seed and the convergence tolerance configured in config.json.

The results directory contains Hive-partitioned Parquet files for costs, hydro dispatch, thermal dispatch, and bus balance, plus a FlatBuffers policy checkpoint:

my_first_study/results/
  policy/
    cuts/
      stage_000.bin  ...  stage_003.bin
    basis/
      stage_000.bin  ...  stage_003.bin
    metadata.json
  simulation/
    costs/
    hydros/
    thermals/
    buses/

What’s Next

You have completed a full SDDP study from case setup to results. The following pages go deeper into how the case is structured and how to interpret the output:

Anatomy of a Case

A Cobre case directory is a self-contained folder of input files. When you run cobre run or cobre validate, the first thing Cobre does is call load_case on that directory. load_case reads every file, runs the five-layer validation pipeline (schema, references, physical feasibility, stochastic consistency, solver feasibility), and produces a fully-validated System object ready for the solver.

This page walks through every file in the 1dtoy example, explaining what each field controls and why it matters. The example lives in examples/1dtoy/ in the repository and is also available via cobre init --template 1dtoy.

For the complete field-by-field schema reference, see Case Format Reference.


Directory Structure

The 1dtoy case contains 10 input files across three directories:

1dtoy/
  config.json
  initial_conditions.json
  penalties.json
  stages.json
  system/
    buses.json
    hydros.json
    lines.json
    thermals.json
  scenarios/
    inflow_seasonal_stats.parquet
    load_seasonal_stats.parquet

The four root-level files configure the solver and define the time horizon. The system/ subdirectory holds the power system entities. The scenarios/ subdirectory holds the stochastic input data that drives scenario generation.


Root-Level Files

config.json

config.json controls all solver parameters: how many training iterations to run, when to stop, whether to follow training with a simulation pass, and more.

{
  "version": "1.0.0",
  "training": {
    "forward_passes": 1,
    "stopping_rules": [
      {
        "type": "iteration_limit",
        "limit": 128
      }
    ]
  },
  "simulation": {
    "enabled": true,
    "num_scenarios": 100
  }
}

version is an informational string; it does not affect behavior.

The training section is mandatory. forward_passes: 1 means each training iteration draws one scenario trajectory. The stopping_rules array must contain at least one iteration_limit rule. Here the solver stops after 128 iterations. For production studies you would typically also add a convergence-based stopping rule such as bound_stalling, but for a small tutorial case an iteration limit is sufficient.

The simulation section is optional and defaults to disabled. Here it is enabled with 100 scenarios. After training completes, Cobre evaluates the trained policy over 100 independently sampled scenarios and writes the results to the output directory.

For the full list of configuration options, see Configuration.


penalties.json

penalties.json defines the global penalty cost defaults. These costs are added to the LP objective whenever a physical constraint is violated in a soft-constraint sense — for example, when demand cannot be fully served (deficit) or when a reservoir bound is violated. Setting these costs high relative to actual generation costs ensures that violations are used as a last resort rather than a cheap dispatch option.

{
  "bus": {
    "deficit_segments": [
      {
        "depth_mw": 500.0,
        "cost": 1000.0
      },
      {
        "depth_mw": null,
        "cost": 5000.0
      }
    ],
    "excess_cost": 100.0
  },
  "line": {
    "exchange_cost": 2.0
  },
  "hydro": {
    "spillage_cost": 0.01,
    "fpha_turbined_cost": 0.05,
    "diversion_cost": 0.1,
    "storage_violation_below_cost": 10000.0,
    "filling_target_violation_cost": 50000.0,
    "turbined_violation_below_cost": 500.0,
    "outflow_violation_below_cost": 500.0,
    "outflow_violation_above_cost": 500.0,
    "generation_violation_below_cost": 1000.0,
    "evaporation_violation_cost": 5000.0,
    "water_withdrawal_violation_cost": 1000.0
  },
  "non_controllable_source": {
    "curtailment_cost": 0.005
  }
}

The bus.deficit_segments array defines a piecewise-linear deficit cost curve. The first segment covers the first 500 MW of unserved energy at 1000 $/MWh. Beyond 500 MW, the cost rises to 5000 $/MWh (the segment with depth_mw: null is always the final unbounded tier). The two-tier structure mimics a typical Value of Lost Load model where the first tranche represents interruptible load and the second represents non-interruptible load. excess_cost penalizes over-injection at 100 $/MWh.
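Evaluating the curve for a given level of unserved energy makes the two-tier structure concrete. The following sketch is illustrative only; the function name and segment representation are not part of Cobre's API:

```python
def deficit_cost(unserved_mw, segments):
    """Evaluate a piecewise-linear deficit cost curve.

    Each segment is (depth_mw, cost_per_mwh); depth_mw=None marks the
    final unbounded tier, mirroring the "depth_mw": null convention.
    Returns the hourly cost for the given unserved power.
    """
    total = 0.0
    remaining = unserved_mw
    for depth, cost in segments:
        # The unbounded final tier absorbs whatever is left
        served = remaining if depth is None else min(remaining, depth)
        total += served * cost
        remaining -= served
        if remaining <= 0:
            break
    return total

# 1dtoy global defaults: first 500 MW at 1000 $/MWh, then 5000 $/MWh
segments = [(500.0, 1000.0), (None, 5000.0)]
```

A 300 MW deficit stays in the cheap tier; an 800 MW deficit pays 500 MW at the first-tier cost plus 300 MW at the second-tier cost.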

Hydro penalty costs cover a range of operational constraint violations. The low spillage_cost (0.01 $/hm³) makes spillage the cheapest way to release water when turbine capacity is exhausted. The high storage_violation_below_cost (10,000 $/hm³) and filling_target_violation_cost (50,000 $/hm³) make reservoir bound violations extremely costly, ensuring the solver strongly avoids them.

Individual entities can override these global defaults in their own JSON files using a penalties block. The reference page documents all override options.


stages.json

stages.json defines the temporal structure of the study: the sequence of planning stages, the load blocks within each stage, the number of scenarios to sample at each stage during training, and the policy graph horizon type.

{
  "policy_graph": {
    "type": "finite_horizon",
    "annual_discount_rate": 0.12
  },
  "stages": [
    {
      "id": 0,
      "start_date": "2024-01-01",
      "end_date": "2024-02-01",
      "blocks": [
        {
          "id": 0,
          "name": "SINGLE",
          "hours": 744
        }
      ],
      "num_scenarios": 10
    },
    {
      "id": 1,
      "start_date": "2024-02-01",
      "end_date": "2024-03-01",
      "blocks": [
        {
          "id": 0,
          "name": "SINGLE",
          "hours": 696
        }
      ],
      "num_scenarios": 10
    },
    {
      "id": 2,
      "start_date": "2024-03-01",
      "end_date": "2024-04-01",
      "blocks": [
        {
          "id": 0,
          "name": "SINGLE",
          "hours": 744
        }
      ],
      "num_scenarios": 10
    },
    {
      "id": 3,
      "start_date": "2024-04-01",
      "end_date": "2024-05-01",
      "blocks": [
        {
          "id": 0,
          "name": "SINGLE",
          "hours": 720
        }
      ],
      "num_scenarios": 10
    }
  ]
}

policy_graph.type: "finite_horizon" means the planning horizon is a linear sequence of stages with no cyclic structure and zero terminal value after the last stage. The annual_discount_rate: 0.12 applies a 12% annual discount to future stage costs.

The stages array defines four monthly stages covering January through April 2024. Each stage has a single load block named SINGLE that spans the entire month. The hours values match the actual number of hours in each calendar month (744 for January, 696 for February in 2024, and so on). These hours are used when converting power (MW) to energy (MWh) in the LP objective.
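Because the hours must match the calendar, it is worth checking them when editing stage dates by hand. A quick plain-Python check (not a Cobre utility):

```python
from datetime import date

def stage_hours(start: date, end: date) -> int:
    """Hours between two stage boundary dates (end exclusive)."""
    return (end - start).days * 24

# Matches the values in stages.json for 2024 (a leap year):
# January has 31 days (744 h), February 2024 has 29 days (696 h).
print(stage_hours(date(2024, 1, 1), date(2024, 2, 1)))
print(stage_hours(date(2024, 2, 1), date(2024, 3, 1)))
```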

num_scenarios: 10 means 10 scenario trajectories are sampled at each stage during training forward passes. A small number like 10 is sufficient for a tutorial; real studies typically use 50 or more.


initial_conditions.json

initial_conditions.json provides the reservoir storage levels at the beginning of the study. Every hydro plant that participates in the study must have an entry here.

{
  "storage": [
    {
      "hydro_id": 0,
      "value_hm3": 83.222
    }
  ],
  "filling_storage": []
}

storage covers operating reservoirs: plants that both generate power and store water between stages. hydro_id: 0 corresponds to UHE1 defined in system/hydros.json. The initial storage is 83.222 hm³, which is about 8.3% of the 1000 hm³ maximum capacity — a low-storage starting condition that forces the solver to balance generation against the risk of running dry.

filling_storage covers filling reservoirs — reservoirs that do not generate power but feed downstream plants. The 1dtoy case has no filling reservoirs, so this array is empty. It must still be present (even if empty) to satisfy the schema.


system/ Files

system/buses.json

Buses are the nodes of the electrical network. Every generator and load is connected to a bus. The bus balance constraint ensures that injections equal withdrawals at every bus in every LP solve.

{
  "buses": [
    {
      "id": 0,
      "name": "SIN",
      "deficit_segments": [
        {
          "depth_mw": null,
          "cost": 1000.0
        }
      ]
    }
  ]
}

The 1dtoy case has a single bus named SIN (Sistema Interligado Nacional, the Brazilian interconnected system). A single-bus model treats the entire system as one copper-plate node: there are no transmission constraints.

The bus-level deficit_segments here overrides the global default from penalties.json with a simpler single-tier structure: unlimited deficit at 1000 $/MWh. When an entity-level override is present, it takes precedence over the global default.


system/lines.json

Transmission lines connect pairs of buses and carry power flows subject to capacity limits. In a single-bus model, no lines are needed.

{
  "lines": []
}

The file must be present even if the lines array is empty. The validator checks for the file and would raise a schema error if it were absent.


system/hydros.json

Hydro plants have a reservoir (water storage), a turbine (converts water flow to electricity), and optional cascade linkage to downstream plants.

{
  "hydros": [
    {
      "id": 0,
      "name": "UHE1",
      "bus_id": 0,
      "downstream_id": null,
      "reservoir": {
        "min_storage_hm3": 0.0,
        "max_storage_hm3": 1000.0
      },
      "outflow": {
        "min_outflow_m3s": 0.0,
        "max_outflow_m3s": 50.0
      },
      "generation": {
        "model": "constant_productivity",
        "productivity_mw_per_m3s": 1.0,
        "min_turbined_m3s": 0.0,
        "max_turbined_m3s": 50.0,
        "min_generation_mw": 0.0,
        "max_generation_mw": 50.0
      }
    }
  ]
}

UHE1 connects to bus 0 (SIN). downstream_id: null means it is a tailwater plant — there is no plant downstream that receives its outflow.

The reservoir block defines storage bounds in hm³ (cubic hectometres). UHE1 can hold between 0 and 1000 hm³. The minimum of 0 means the reservoir can be fully emptied, which is common for run-of-river-adjacent plants.

The outflow block limits total outflow (turbined + spilled) to 50 m³/s maximum. This is a physical constraint representing the river channel capacity below the dam.

The generation block uses "constant_productivity", the simplest turbine model: generation (MW) equals turbined flow (m³/s) times the productivity_mw_per_m3s factor. Here the factor is 1.0, so 1 m³/s of turbined flow yields 1 MW. The turbine can pass between 0 and 50 m³/s, and the resulting generation is bounded between 0 and 50 MW.
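Under this model, generation and block energy follow by simple multiplication. A minimal sketch (names are illustrative, not Cobre's API):

```python
def hydro_generation_mw(turbined_m3s, productivity_mw_per_m3s=1.0):
    """constant_productivity model: MW = turbined flow (m3/s) x factor."""
    return turbined_m3s * productivity_mw_per_m3s

def block_energy_mwh(generation_mw, block_hours):
    """A dispatch level held over a load block converts to energy
    via the block hours from stages.json."""
    return generation_mw * block_hours

# UHE1 at full turbine capacity (50 m3/s, factor 1.0) over January (744 h)
full_output = hydro_generation_mw(50.0, 1.0)
january_energy = block_energy_mwh(full_output, 744)
```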


system/thermals.json

Thermal plants are dispatchable generators with a fixed cost per MWh. The piecewise cost structure allows modeling fuel cost curves by defining multiple capacity segments at increasing costs.

{
  "thermals": [
    {
      "id": 0,
      "name": "UTE1",
      "bus_id": 0,
      "cost_segments": [
        {
          "capacity_mw": 15.0,
          "cost_per_mwh": 5.0
        }
      ],
      "generation": {
        "min_mw": 0.0,
        "max_mw": 15.0
      }
    },
    {
      "id": 1,
      "name": "UTE2",
      "bus_id": 0,
      "cost_segments": [
        {
          "capacity_mw": 15.0,
          "cost_per_mwh": 10.0
        }
      ],
      "generation": {
        "min_mw": 0.0,
        "max_mw": 15.0
      }
    }
  ]
}

Both thermal plants connect to bus 0. UTE1 is the cheaper unit at 5 $/MWh and UTE2 costs 10 $/MWh. Both are limited to 15 MW maximum dispatch. In the LP, Cobre will always prefer UTE1 over UTE2 and prefer both over deficit (1000 $/MWh), creating a natural merit-order dispatch.

Each thermal has a single cost segment covering its entire capacity. For plants with variable heat rates you would add additional segments — for example, { "capacity_mw": 10.0, "cost_per_mwh": 8.0 } followed by { "capacity_mw": 5.0, "cost_per_mwh": 12.0 } to model a plant that becomes progressively more expensive at higher output.
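The merit-order behavior described above can be sketched with a greedy dispatch over the two units. This illustrates the economics only; the real model is an LP, not a greedy loop:

```python
def merit_order_dispatch(load_mw, units):
    """Greedy merit-order dispatch: cheapest units fill first.

    units: list of (name, max_mw, cost_per_mwh) tuples. Load left
    unserved after all units are exhausted becomes deficit, priced
    at the bus deficit cost in the actual LP.
    """
    dispatch, remaining = {}, load_mw
    for name, cap, _cost in sorted(units, key=lambda u: u[2]):
        g = min(remaining, cap)
        dispatch[name] = g
        remaining -= g
    dispatch["deficit"] = remaining
    return dispatch

# The 1dtoy thermals: UTE1 at 5 $/MWh, UTE2 at 10 $/MWh, 15 MW each
units = [("UTE1", 15.0, 5.0), ("UTE2", 15.0, 10.0)]
```

A 20 MW load fills UTE1 completely and takes 5 MW from UTE2; a 40 MW load exhausts both units and leaves 10 MW of deficit.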


scenarios/ Files

The scenarios/ directory holds Parquet files that parameterize the stochastic models used to generate inflow and load scenarios during training and simulation. Unlike the JSON files, these are binary columnar files that cannot be inspected with a text editor.

scenarios/inflow_seasonal_stats.parquet

This file contains the seasonal mean and standard deviation of historical inflows for each (hydro plant, stage) pair, plus the autoregressive order for the PAR(p) model. Cobre uses these statistics to fit a periodic autoregressive model that generates correlated inflow scenarios across stages.

Expected columns:

Column     Type    Description
hydro_id   INT32   Hydro plant identifier (matches id in hydros.json)
stage_id   INT32   Stage identifier (matches id in stages.json)
mean_m3s   DOUBLE  Seasonal mean inflow in m³/s
std_m3s    DOUBLE  Seasonal standard deviation in m³/s (must be >= 0)
ar_order   INT32   Number of AR lags in the PAR(p) model (0 = white noise)

The 1dtoy file has 4 rows, one for each stage, for the single hydro plant UHE1 (hydro_id = 0). When ar_order > 0, Cobre also looks for an inflow_ar_coefficients.parquet file containing the lag coefficients. The 1dtoy case uses ar_order = 0 (white noise), so no coefficients file is needed.
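With ar_order = 0, scenario generation amounts to independent draws from each stage's seasonal distribution. The sketch below illustrates the idea only — Cobre's own generator is deterministic and correlation-aware, and the statistics here are hypothetical:

```python
import random

def sample_white_noise_inflows(stats, num_scenarios, seed=42):
    """Sample ar_order=0 (white noise) inflows per stage.

    stats: list of (stage_id, mean_m3s, std_m3s) tuples. Draws are
    truncated at zero on the assumption that inflows are nonnegative.
    """
    rng = random.Random(seed)
    return {
        stage: [max(0.0, rng.gauss(mean, std)) for _ in range(num_scenarios)]
        for stage, mean, std in stats
    }

# Hypothetical seasonal stats for a single plant across four stages
stats = [(0, 40.0, 8.0), (1, 35.0, 7.0), (2, 30.0, 6.0), (3, 25.0, 5.0)]
scenarios = sample_white_noise_inflows(stats, num_scenarios=10)
```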

To inspect a Parquet file on your machine, use any of:

# Polars
import polars as pl
df = pl.read_parquet("scenarios/inflow_seasonal_stats.parquet")
print(df)

# pandas
import pandas as pd
df = pd.read_parquet("scenarios/inflow_seasonal_stats.parquet")
print(df)

-- DuckDB
SELECT * FROM read_parquet('scenarios/inflow_seasonal_stats.parquet');

scenarios/load_seasonal_stats.parquet

This file contains the seasonal statistics for electrical load at each bus. It drives the stochastic load model that generates demand scenarios during training and simulation.

Expected columns:

Column     Type    Description
bus_id     INT32   Bus identifier (matches id in buses.json)
stage_id   INT32   Stage identifier (matches id in stages.json)
mean_mw    DOUBLE  Seasonal mean load in MW
std_mw     DOUBLE  Seasonal standard deviation in MW (must be >= 0)
ar_order   INT32   Number of AR lags in the PAR(p) model (0 = white noise)

The 1dtoy file has 4 rows, one for each stage, for the single bus SIN (bus_id = 0). The load mean and standard deviation determine how much demand the system must serve in each scenario and how uncertain that demand is.


What’s Next

Now that you understand what each file does, the next page walks you through creating a case from scratch:

Building a System

This page walks you through creating a minimal case directory from scratch, explaining why each file exists and what each field controls. The target is a single-bus hydrothermal system identical to the 1dtoy template: one bus, one hydro plant, two thermal units, and a four-month planning horizon.

If you want to start from a working template instead, use:

cobre init --template 1dtoy my_study

This page is for users who want to understand the structure of every file before touching real data.


Prerequisites

Create an empty directory and enter it:

mkdir my_study
cd my_study
mkdir system

You will need 8 JSON files. By the end of this guide your directory will look like:

my_study/
  config.json
  initial_conditions.json
  penalties.json
  stages.json
  system/
    buses.json
    hydros.json
    lines.json
    thermals.json

The scenarios/ subdirectory is optional for a minimal case. Cobre can generate white-noise inflow and load scenarios using only the stage definitions, without Parquet statistics files.


Step 1: Create config.json

config.json tells Cobre how to run the study. At minimum it needs a training section with a forward_passes count and at least one stopping_rules entry.

Create my_study/config.json:

{
  "training": {
    "forward_passes": 1,
    "stopping_rules": [
      {
        "type": "iteration_limit",
        "limit": 128
      }
    ]
  },
  "simulation": {
    "enabled": true,
    "num_scenarios": 100
  }
}

forward_passes controls how many scenario trajectories are drawn per training iteration. Start with 1 for fast iteration during case development; increase to 50 or more for production runs where you want lower variance per iteration.

stopping_rules must contain at least one iteration_limit entry. The solver will run until one of the configured rules triggers. Here it stops after 128 iterations regardless of convergence. You can add a second rule — for example, { "type": "time_limit", "seconds": 300 } — and the solver will stop when either condition is met.

The simulation block is optional. When enabled: true, Cobre runs a post-training simulation pass using num_scenarios independently sampled scenarios and writes dispatch results to Parquet files.

For the full list of configuration options including warm-start, cut selection, and output controls, see Configuration.


Step 2: Create stages.json

stages.json defines the time horizon. Each stage represents a planning period. The solver builds one LP sub-problem per stage per scenario trajectory.

Create my_study/stages.json:

{
  "policy_graph": {
    "type": "finite_horizon",
    "annual_discount_rate": 0.12
  },
  "stages": [
    {
      "id": 0,
      "start_date": "2024-01-01",
      "end_date": "2024-02-01",
      "blocks": [
        {
          "id": 0,
          "name": "SINGLE",
          "hours": 744
        }
      ],
      "num_scenarios": 10
    },
    {
      "id": 1,
      "start_date": "2024-02-01",
      "end_date": "2024-03-01",
      "blocks": [
        {
          "id": 0,
          "name": "SINGLE",
          "hours": 696
        }
      ],
      "num_scenarios": 10
    },
    {
      "id": 2,
      "start_date": "2024-03-01",
      "end_date": "2024-04-01",
      "blocks": [
        {
          "id": 0,
          "name": "SINGLE",
          "hours": 744
        }
      ],
      "num_scenarios": 10
    },
    {
      "id": 3,
      "start_date": "2024-04-01",
      "end_date": "2024-05-01",
      "blocks": [
        {
          "id": 0,
          "name": "SINGLE",
          "hours": 720
        }
      ],
      "num_scenarios": 10
    }
  ]
}

policy_graph.type: "finite_horizon" is the correct choice for a planning horizon with a definite end date and no cycling. The annual_discount_rate is applied to discount future stage costs back to present value. A rate of 0.12 means costs one year in the future are worth 88% of present costs.

Each stage entry needs an id (0-indexed integer), a start_date and end_date in ISO 8601 format, an array of blocks, and a num_scenarios count.

The blocks array subdivides a stage into load periods. A single block named SINGLE that spans all the hours of the month is the simplest choice. More detailed studies use two or three blocks (peak/off-peak/overnight) to capture intra-stage load variation. The hours value must equal the actual number of hours in the stage: these hours convert MW dispatch levels to MWh costs in the LP objective.

num_scenarios is the number of inflow/load scenario trajectories sampled at each stage during training. More scenarios per iteration produce less-noisy cut estimates at the cost of more LP solves per iteration.
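How the annual discount rate maps to per-stage factors depends on the compounding convention. The sketch below assumes the common compound convention 1/(1+r)^t, which may not match Cobre's exact formula; check the methodology reference before relying on it:

```python
def stage_discount_factor(annual_rate, stage_index, stages_per_year=12):
    """Discount factor applied to stage `stage_index` (0-based).

    Assumes compound discounting: 1 / (1 + r)^(years elapsed).
    This convention is an assumption, not taken from Cobre's docs.
    """
    years = stage_index / stages_per_year
    return 1.0 / (1.0 + annual_rate) ** years

# Monthly stages at a 12% annual rate: stage 0 is undiscounted,
# and a stage one year out is discounted by 1/1.12.
print(stage_discount_factor(0.12, 12))
```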


Step 3: Create penalties.json

Penalty costs define how much the solver pays when it cannot satisfy a constraint without violating a physical bound. High penalties make violations expensive so the solver avoids them; low penalties on minor constraints (like spillage) allow the solver to use flexibility when needed.

Create my_study/penalties.json:

{
  "bus": {
    "deficit_segments": [
      {
        "depth_mw": 500.0,
        "cost": 1000.0
      },
      {
        "depth_mw": null,
        "cost": 5000.0
      }
    ],
    "excess_cost": 100.0
  },
  "line": {
    "exchange_cost": 2.0
  },
  "hydro": {
    "spillage_cost": 0.01,
    "fpha_turbined_cost": 0.05,
    "diversion_cost": 0.1,
    "storage_violation_below_cost": 10000.0,
    "filling_target_violation_cost": 50000.0,
    "turbined_violation_below_cost": 500.0,
    "outflow_violation_below_cost": 500.0,
    "outflow_violation_above_cost": 500.0,
    "generation_violation_below_cost": 1000.0,
    "evaporation_violation_cost": 5000.0,
    "water_withdrawal_violation_cost": 1000.0
  },
  "non_controllable_source": {
    "curtailment_cost": 0.005
  }
}

The bus.deficit_segments array must end with a segment where depth_mw is null. This unbounded final segment ensures the LP always has a feasible solution even when generation capacity is insufficient to cover load. All four top-level sections (bus, line, hydro, non_controllable_source) are required even if your system contains none of that entity type.

Individual penalty values can be overridden per entity by adding a penalties block inside any entity definition in the system/ files. The global values here serve as the default for any entity that does not specify its own.
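The structural rules for deficit_segments — exactly one unbounded tier, placed last — can be checked with a few lines of Python. This is a sketch of the rule as stated above, not Cobre's validator:

```python
def validate_deficit_segments(segments):
    """Return a list of error strings; empty means the curve is valid."""
    errors = []
    if not segments:
        errors.append("deficit_segments must not be empty")
        return errors
    if segments[-1]["depth_mw"] is not None:
        errors.append("final segment must have depth_mw = null")
    for i, seg in enumerate(segments[:-1]):
        if seg["depth_mw"] is None:
            errors.append(f"segment {i}: only the final segment may be unbounded")
        elif seg["depth_mw"] <= 0:
            errors.append(f"segment {i}: depth_mw must be positive")
    return errors

# The two-tier curve from penalties.json passes; a reordered one fails
good = [{"depth_mw": 500.0, "cost": 1000.0}, {"depth_mw": None, "cost": 5000.0}]
bad = [{"depth_mw": None, "cost": 1000.0}, {"depth_mw": 500.0, "cost": 5000.0}]
```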


Step 4: Create system/buses.json

A bus is an electrical node. All generators and loads connect to a bus. Every system needs at least one bus.

Create my_study/system/buses.json:

{
  "buses": [
    {
      "id": 0,
      "name": "SIN",
      "deficit_segments": [
        {
          "depth_mw": null,
          "cost": 1000.0
        }
      ]
    }
  ]
}

id must be a unique non-negative integer. name is a human-readable label used in output files and validation messages. The deficit_segments override here replaces the global deficit curve from penalties.json for this specific bus. A single unbounded segment at 1000 $/MWh is the simplest possible deficit model.

If you omit deficit_segments from a bus, Cobre uses the global default from penalties.json for that bus. Explicit overrides are useful when different buses have different Value of Lost Load characteristics.


Step 5: Create system/lines.json

Transmission lines connect pairs of buses and impose flow limits between them. A single-bus system has no lines.

Create my_study/system/lines.json:

{
  "lines": []
}

The file must exist even with an empty array. The validator checks that the file is present and that its schema is valid. If you later add a second bus, you can add lines here by specifying source_bus_id, target_bus_id, direct_mw, and reverse_mw for each line.


Step 6: Create system/thermals.json

Thermal plants are dispatchable generators. They have a fixed cost per MWh of generation and physical capacity bounds. Add them in increasing cost order as a matter of convention, though the LP will find the optimal merit order regardless.

Create my_study/system/thermals.json:

{
  "thermals": [
    {
      "id": 0,
      "name": "UTE1",
      "bus_id": 0,
      "cost_segments": [
        {
          "capacity_mw": 15.0,
          "cost_per_mwh": 5.0
        }
      ],
      "generation": {
        "min_mw": 0.0,
        "max_mw": 15.0
      }
    },
    {
      "id": 1,
      "name": "UTE2",
      "bus_id": 0,
      "cost_segments": [
        {
          "capacity_mw": 15.0,
          "cost_per_mwh": 10.0
        }
      ],
      "generation": {
        "min_mw": 0.0,
        "max_mw": 15.0
      }
    }
  ]
}

bus_id: 0 connects both plants to the SIN bus. The cost_segments array defines a piecewise-linear cost curve. Each segment has a capacity_mw and a cost_per_mwh. With a single segment, the entire capacity is available at the same cost. The segment capacities should sum to generation.max_mw.

generation.min_mw: 0.0 means the plant can be turned off completely. A non-zero minimum would represent a must-run commitment constraint. max_mw caps the generation level and should equal the sum of all cost_segments capacities.

The bus_id must reference a bus id defined in buses.json. The validator will catch any broken reference and report it as a reference integrity error.
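The convention that segment capacities sum to generation.max_mw is easy to verify programmatically. A sketch — the field names follow thermals.json, but the helper itself is hypothetical:

```python
def check_thermal(thermal):
    """True if the cost segment capacities sum to generation.max_mw."""
    seg_total = sum(s["capacity_mw"] for s in thermal["cost_segments"])
    return abs(seg_total - thermal["generation"]["max_mw"]) < 1e-9

# UTE1 as defined above: one 15 MW segment, max_mw 15 -> consistent
ute1 = {
    "cost_segments": [{"capacity_mw": 15.0, "cost_per_mwh": 5.0}],
    "generation": {"min_mw": 0.0, "max_mw": 15.0},
}
# A mismatched variant: 10 MW of segments against a 15 MW cap
bad = {
    "cost_segments": [{"capacity_mw": 10.0, "cost_per_mwh": 8.0}],
    "generation": {"min_mw": 0.0, "max_mw": 15.0},
}
```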


Step 7: Create system/hydros.json

Hydro plants have three components: a reservoir (state variable between stages), a turbine (converts water flow to electricity), and optional cascade linkage to downstream plants.

Create my_study/system/hydros.json:

{
  "hydros": [
    {
      "id": 0,
      "name": "UHE1",
      "bus_id": 0,
      "downstream_id": null,
      "reservoir": {
        "min_storage_hm3": 0.0,
        "max_storage_hm3": 1000.0
      },
      "outflow": {
        "min_outflow_m3s": 0.0,
        "max_outflow_m3s": 50.0
      },
      "generation": {
        "model": "constant_productivity",
        "productivity_mw_per_m3s": 1.0,
        "min_turbined_m3s": 0.0,
        "max_turbined_m3s": 50.0,
        "min_generation_mw": 0.0,
        "max_generation_mw": 50.0
      }
    }
  ]
}

downstream_id: null marks UHE1 as a tailwater plant. To model a cascade where plant A flows into plant B, you would set downstream_id: <B's id> on plant A. Cobre enforces that the downstream graph is acyclic.
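The acyclicity rule can be checked by following downstream_id links from each plant until reaching a tailwater. A sketch of the idea, not Cobre's validator:

```python
def cascade_is_acyclic(downstream):
    """Check a hydro cascade for cycles.

    downstream: dict mapping hydro id -> downstream_id (None for
    tailwater plants). Revisiting a node along a walk means a cycle.
    """
    for start in downstream:
        seen = set()
        node = start
        while node is not None:
            if node in seen:
                return False
            seen.add(node)
            node = downstream.get(node)
    return True

# A -> B -> tailwater is a valid cascade; A -> B -> A is a cycle
valid_cascade = {0: 1, 1: None}
cyclic_cascade = {0: 1, 1: 0}
```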

The reservoir block uses hm³ (cubic hectometres) as the unit for water volume. min_storage_hm3: 0.0 allows the reservoir to empty completely. If your plant has a dead storage (volume below the turbine intake), set min_storage_hm3 to that value.

The outflow block limits total outflow (turbined flow plus spillage). The upper bound max_outflow_m3s: 50.0 models the river channel capacity. Setting a non-zero min_outflow_m3s would represent a minimum ecological flow requirement.
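Relating the flow bounds (m³/s) to the storage bounds (hm³) requires a stage duration, which comes from stages.json. A small conversion sketch; the 730-hour month used here is an illustrative assumption, not a Cobre default:

```python
def flow_to_volume_hm3(flow_m3s, hours):
    """Convert a constant flow in m3/s sustained for `hours` into a
    volume in hm3 (1 hm3 = 1,000,000 m3)."""
    return flow_m3s * hours * 3600.0 / 1e6

# At the 50 m3/s outflow cap, a hypothetical 730-hour monthly stage
# moves about 131.4 hm3 -- a meaningful fraction of the 1000 hm3
# reservoir defined above.
volume = flow_to_volume_hm3(50.0, 730.0)
print(round(volume, 1))  # 131.4
```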

The generation block uses "constant_productivity" which is the only supported model for the current release. The productivity_mw_per_m3s factor converts turbined flow to generated power. Here 1 m³/s yields 1 MW. Real plants typically have productivity factors between 0.5 and 10 depending on the head height.


Step 8: Create initial_conditions.json

Every hydro plant needs an initial reservoir storage value at the start of the study. This is the state the solver uses for stage 0’s water balance equation.

Create my_study/initial_conditions.json:

{
  "storage": [
    {
      "hydro_id": 0,
      "value_hm3": 83.222
    }
  ],
  "filling_storage": []
}

hydro_id: 0 matches UHE1 defined in system/hydros.json. Every hydro plant in the system must have exactly one entry in either storage or filling_storage — not both, not neither. The validator checks this.

value_hm3: 83.222 sets the initial reservoir at about 8.3% of its 1000 hm³ capacity. Choosing a realistic initial condition matters for short horizons because the first few stages will be heavily influenced by whether the reservoir starts full or nearly empty. For multi-year studies the initial condition has less impact on later stages.

filling_storage is for filling reservoirs — reservoirs that accumulate water but do not generate power. The 1dtoy system has none, so this array is empty. It must be present even when empty.
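The exactly-one-entry rule can be sketched as a standalone check (illustrative Python, not Cobre's validator):

```python
def check_initial_conditions(hydro_ids, initial):
    """Each hydro must appear exactly once across `storage` and
    `filling_storage`, and no entry may reference an unknown hydro."""
    listed = ([e["hydro_id"] for e in initial["storage"]]
              + [e["hydro_id"] for e in initial["filling_storage"]])
    errors = []
    for hid in hydro_ids:
        n = listed.count(hid)
        if n != 1:
            errors.append(f"hydro {hid} has {n} initial entries, expected 1")
    for hid in listed:
        if hid not in hydro_ids:
            errors.append(f"initial condition references unknown hydro {hid}")
    return errors

# The file above covers the single hydro UHE1 exactly once:
initial = {"storage": [{"hydro_id": 0, "value_hm3": 83.222}],
           "filling_storage": []}
assert check_initial_conditions({0}, initial) == []
```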


Step 9: Validate Your Case

With all 8 files in place, validate the case to confirm every layer passes:

cobre validate my_study

On success, Cobre prints the entity counts:

Valid case: 1 buses, 1 hydros, 2 thermals, 0 lines
  buses: 1
  hydros: 1
  thermals: 2
  lines: 0

If any validation layer fails, each error is prefixed with error: and the exit code is 1. Common errors at this stage:

  • reference error: hydro 0 references bus 99 which does not exist — a bus_id in hydros.json does not match any id in buses.json.
  • initial conditions: hydro 0 has no initial storage entry — a hydro plant in hydros.json is missing from initial_conditions.json.
  • penalties.json: non_controllable_source section missing — a required top-level section is absent from penalties.json, even if the system has no NCS plants.

Fix each reported error and re-run cobre validate until the exit code is 0.


What’s Next

Your hand-built case is functionally identical to the 1dtoy template. You can run it directly:

cobre run my_study --output my_study/results

To compare your files against the template at any point:

cobre init --template 1dtoy 1dtoy_reference
diff -r my_study 1dtoy_reference

From here, the natural next step is understanding the results your run produced.

Understanding Results

After cobre run completes, the output directory contains three categories of artifacts: training convergence data, a saved policy checkpoint, and simulation dispatch results. This page explains how to read each category and how to query the results programmatically using cobre report.

If you have not yet run the quickstart, complete Quickstart first — this page references the my_first_study/results/ directory produced by that walkthrough.


The Post-Run Summary

When cobre run finishes, it prints a summary block to stderr. The 1dtoy run from the quickstart produces output similar to:

Training complete in 3.2s (128 iterations, iteration_limit)
  Lower bound:  142.3 $/stage
  Upper bound:  143.1 +/- 1.2 $/stage
  Gap:          0.6%
  Cuts:         384 active / 387 generated
  LP solves:    512

Simulation complete (100 scenarios)
  Completed: 100  Failed: 0

Output written to my_first_study/results/

Exact numerical values vary across runs because scenario sampling is stochastic. The values shown here are representative of the 1dtoy example; your run will differ slightly.

Training complete in 3.2s (128 iterations, iteration_limit)
  Training ran for 128 iterations (the limit set in config.json) and stopped because the iteration limit was reached, not because a convergence criterion was met.

Lower bound: 142.3 $/stage
  The optimizer’s best proven lower bound on the minimum expected cost per stage. As training progresses this value rises and stabilizes.

Upper bound: 143.1 +/- 1.2 $/stage
  A statistical estimate of the true expected cost, computed from the forward-pass scenarios in the final iteration. The +/- 1.2 is the standard deviation across those scenarios.

Gap: 0.6%
  The relative distance between the lower and upper bounds, expressed as a percentage. A gap of 0.6% means the policy cost is within 0.6% of the best possible. Smaller is better.

Cuts: 384 active / 387 generated
  The number of optimality cuts in the policy pool: 384 are currently active; 3 were deactivated by the cut selection strategy.

LP solves: 512
  Total number of linear programs solved across all stages and iterations.

Simulation complete (100 scenarios)
  The post-training simulation evaluated the trained policy over 100 independently sampled scenarios.

Completed: 100  Failed: 0
  All 100 scenarios completed without solver errors.

Output written to my_first_study/results/
  Root path of the output directory.

Lower bound vs. upper bound. The lower bound is the optimizer’s proven best estimate of the minimum achievable cost. The upper bound is the average cost observed when running the current policy over sampled scenarios. When the gap is small, the policy is near-optimal. When the gap is large, running more iterations will typically narrow it further.
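The printed gap can be reproduced from the two bounds. A sketch that normalizes by the upper bound, which is one common convention and matches the numbers above; Cobre's exact normalization is not documented on this page:

```python
def gap_percent(lower, upper):
    """Relative distance between the bounds, as a percentage of the
    upper bound (one common convention; treat it as an assumption)."""
    return 100.0 * (upper - lower) / upper

# The summary above: lower 142.3, upper 143.1 -> gap ~0.6%
print(round(gap_percent(142.3, 143.1), 1))  # 0.6
```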

Termination reasons. The parenthetical after the iteration count explains why training stopped:

  • iteration_limit — the maximum iteration count was reached (the 1dtoy default).
  • converged at iter N — a convergence criterion was met at iteration N and training stopped early. This appears when you configure a bound_stalling or similar rule in config.json.

Theory reference: For the mathematical definition of lower and upper bounds, optimality gap, and stopping criteria, see Convergence in the methodology reference.


Output Directory Structure

All artifacts are written under the results directory you specified with --output. The 1dtoy run produces:

my_first_study/results/
  training/
    _manifest.json          Completion manifest: status, iteration count, convergence, cut stats
    metadata.json           Run metadata: configuration snapshot, problem dimensions
    convergence.parquet     Per-iteration convergence metrics (lower bound, upper bound, gap)
    dictionaries/
      codes.json            Integer-to-string code mappings for entity categories
      state_dictionary.json State variable definitions and units
      entities.csv          Entity registry (id, name, type)
      variables.csv         LP variable registry
      bounds.parquet        LP variable bound definitions
    timing/
      iterations.parquet    Per-iteration wall-clock timing broken down by phase
  policy/
    cuts/
      stage_000.bin         FlatBuffers-encoded optimality cuts for stage 0
      stage_001.bin         ... stage 1
      stage_002.bin         ... stage 2
      stage_003.bin         ... stage 3
    basis/
      stage_000.bin         LP basis checkpoints for warm-starting
      stage_001.bin
      stage_002.bin
      stage_003.bin
    metadata.json           Policy metadata: stage count, cut counts per stage
  simulation/
    _manifest.json          Completion manifest: scenario counts
    buses/
      scenario_id=0000/data.parquet
      scenario_id=0001/data.parquet
      ...                   One partition per scenario
    costs/
      scenario_id=0000/data.parquet
      ...
    hydros/
      scenario_id=0000/data.parquet
      ...
    thermals/
      scenario_id=0000/data.parquet
      ...
    inflow_lags/            Inflow lag state data used to initialize scenario chains

The three top-level subdirectories have distinct roles:

  • training/ — everything produced during the training loop: convergence history, timing, and the dictionaries needed to interpret LP variable indices.
  • policy/ — the trained policy checkpoint. These binary files encode the optimality cuts built during training. They can be used to resume or extend a study.
  • simulation/ — the dispatch results from evaluating the trained policy over 100 simulation scenarios.

Training Results

Reading training/_manifest.json

The training manifest is the canonical summary of what happened during training. The 1dtoy run produces:

{
  "version": "2.0.0",
  "status": "complete",
  "started_at": null,
  "completed_at": null,
  "iterations": {
    "max_iterations": null,
    "completed": 128,
    "converged_at": null
  },
  "convergence": {
    "achieved": false,
    "final_gap_percent": 0.0,
    "termination_reason": "iteration_limit"
  },
  "cuts": {
    "total_generated": 387,
    "total_active": 384,
    "peak_active": 384
  },
  "checksum": null,
  "mpi_info": {
    "world_size": 1,
    "ranks_participated": 1
  }
}

Field-by-field explanation:

status
  "complete" when the training run finished normally; "failed" if a solver error aborted it.
iterations.completed
  Number of training iterations that were executed.
iterations.converged_at
  If training stopped early due to a convergence criterion, the iteration number where it stopped. null for an iteration-limit stop.
convergence.achieved
  true if a convergence stopping rule was satisfied, false if the iteration limit was reached first.
convergence.final_gap_percent
  The gap between lower and upper bounds at the end of training, as a percentage. The 0.0 here reflects that the 1dtoy case converged very tightly within its 128-iteration budget.
convergence.termination_reason
  Machine-readable reason for stopping. Common values: "iteration_limit", "bound_stalling".
cuts.total_generated
  Total optimality cuts created across all stages over the entire training run.
cuts.total_active
  Cuts still active in the pool at the end of training (not deactivated by the cut selection strategy).
cuts.peak_active
  Maximum number of active cuts at any point during training.
mpi_info.world_size
  Number of MPI ranks involved in the run; 1 for single-process runs.

What “converged” means in practice. A converged run (convergence.achieved: true) means a stopping rule determined that continuing would not meaningfully improve the policy. For the 1dtoy case, the gap reaches near zero within the 128-iteration budget even without an explicit convergence rule, which is why final_gap_percent is 0.0 despite achieved being false — the run hit its iteration limit at a point where the policy was already very tight.

For larger studies, configure a bound_stalling or gap_threshold stopping rule in config.json to stop automatically when the gap stabilizes, rather than running a fixed number of iterations.


Simulation Results

Hive-Partitioned Layout

The simulation output uses Hive partitioning: results are split into one data.parquet file per scenario, stored in a directory named scenario_id=NNNN/. This layout is natively understood by Polars, Pandas (via PyArrow), R’s arrow package, and DuckDB — they can read the entire simulation/costs/ directory as a single table and filter by scenario_id at the storage layer without loading all data into memory.

The four entity categories are:

  • buses/: power balance results (load, generation injections, deficit, and excess) at each bus per stage and block.
  • hydros/: hydro dispatch (turbined flow, spillage, reservoir storage levels, inflows, and generation) per plant per stage and block.
  • thermals/: thermal dispatch (generation output per unit per cost segment) per stage and block.
  • costs/: objective cost breakdown (total, thermal, hydro, and penalty costs, plus the discount factor) per stage.

Results are in Parquet format. To read them, use any columnar data tool:

# Polars — reads all 100 scenarios at once
import polars as pl
df = pl.read_parquet("my_first_study/results/simulation/costs/")
print(df.head())

# Pandas + PyArrow
import pandas as pd
df = pd.read_parquet("my_first_study/results/simulation/costs/")
print(df.head())

-- DuckDB — filter to a single scenario
SELECT * FROM read_parquet('my_first_study/results/simulation/costs/**/*.parquet')
WHERE scenario_id = 0;

# R with arrow
library(arrow)
ds <- open_dataset("my_first_study/results/simulation/costs/")
dplyr::collect(dplyr::filter(ds, scenario_id == 0))

Querying Results with cobre report

cobre report reads the JSON manifests and prints a structured JSON summary to stdout. Use it with jq to extract specific metrics in scripts or CI pipelines.

# Print the full report
cobre report my_first_study/results

The output has this top-level shape:

{
  "output_directory": "/abs/path/to/results",
  "status": "complete",
  "training": { "iterations": {}, "convergence": {}, "cuts": {} },
  "simulation": { "scenarios": {} },
  "metadata": { "run_info": {}, "configuration_snapshot": {} }
}

Practical jq queries

# Extract the final convergence gap
cobre report my_first_study/results | jq '.training.convergence.final_gap_percent'

# Check how many iterations ran
cobre report my_first_study/results | jq '.training.iterations.completed'

# Check simulation scenario counts
cobre report my_first_study/results | jq '.simulation.scenarios'

# Use the status in a CI script: exit non-zero if training failed
status=$(cobre report my_first_study/results | jq -r '.status')
if [ "$status" != "complete" ]; then
  echo "Run did not complete successfully: $status" >&2
  exit 1
fi

# Check convergence was achieved (returns true or false)
cobre report my_first_study/results | jq '.training.convergence.achieved'

For the complete cobre report documentation and all available JSON fields, see CLI Reference.

For a detailed description of every field in every output file, see Output Format Reference.


What’s Next

You have now seen how to run a study and interpret its output. The next page collects pointers to everything you need to go further:

Next Steps

You have completed the Cobre tutorial: you installed the tool, ran a complete study with the 1dtoy template, inspected the case files, and interpreted the output. This page points you to the resources that go deeper.


Configuration

The config.json file controls every aspect of how training and simulation are run: stopping rules, forward pass counts, simulation scenario counts, and more. The configuration guide documents every field with examples.


System Modeling

The tutorial uses a minimal single-bus, one-hydro, two-thermal system. Real studies model transmission networks, cascaded hydro plants, and many thermal units. The system modeling guides explain every entity type and its parameters.


CLI Reference

All subcommands (run, validate, report, version), their flags, exit codes, and environment variables are documented in the CLI reference.


Methodology and Theory

The methodology reference describes the mathematical foundations of the solver: the stochastic optimization formulation, the PAR(p) scenario model, the cut management strategy, and the convergence theory. This is the right place to start if you want to understand what the solver is doing, not just how to use it.


API Documentation

The Rust API for all workspace crates is generated from inline doc comments. To build and open it locally:

cargo doc --workspace --no-deps --open

This opens the documentation for all 10 workspace crates in your browser. The cobre-core crate documents the entity model; cobre-sddp documents the training loop and cut management types.


Contributing

If you find a bug, want to add a feature, or want to improve the documentation, the contributing guide explains how to set up the development environment, run the test suite, and submit a pull request.

  • Contributing — development setup, branch conventions, and PR process

Community and Support

Getting Started

Cobre is a power system analysis toolkit built around a production-grade SDDP solver for long-term hydrothermal dispatch planning. It reads a self-contained case directory of JSON and Parquet input files, trains a stochastic dispatch policy, simulates that policy over independent scenarios, and writes Hive-partitioned Parquet output ready for analysis.

This section of the User Guide is the reference path through the software. If you prefer a hands-on walkthrough starting from a working example, the Tutorial is the better starting point.


What You Need

No Rust toolchain or C compiler required. The pre-built binary is statically linked and runs on the following platforms out of the box:

  • macOS (Apple Silicon): aarch64-apple-darwin
  • macOS (Intel): x86_64-apple-darwin
  • Linux (x86-64): x86_64-unknown-linux-gnu
  • Linux (ARM64): aarch64-unknown-linux-gnu
  • Windows (x86-64): x86_64-pc-windows-msvc

To build from source

  • Rust toolchain 1.85+ (stable), installed via rustup
  • C compiler (GCC or Clang), required for the HiGHS LP solver
  • CMake 3.15+, required for the HiGHS build system

Next Steps

Install Cobre

Installation covers all three installation methods: pre-built binary (the fastest path), cargo install from crates.io, and building from source for contributors or unsupported platforms.

Run Your First Study

Your First Study walks through the end-to-end workflow for running a study on a case directory you already have: validate inputs, run the solver, and inspect results with cobre report.


For Hands-On Learners

The Tutorial section provides a step-by-step learning path that starts by installing Cobre, then scaffolds a complete example study using the built-in 1dtoy template, explains the anatomy of a case directory, and shows how to read the output files.

If you have not used Cobre before, starting with the Tutorial and returning to the User Guide as a reference is the recommended approach.


Installation

Cobre v0.1.0 is a statically linked binary available for five platforms. Choose the method that best fits your environment.


Pre-built binary (recommended): fastest for end users. No Rust toolchain or C compiler required.

Linux and macOS

curl --proto '=https' --tlsv1.2 -LsSf https://github.com/cobre-rs/cobre/releases/latest/download/cobre-cli-installer.sh | sh

The installer places the cobre binary in $CARGO_HOME/bin (typically ~/.cargo/bin). Add that directory to your PATH if it is not already present.

Windows (PowerShell)

powershell -ExecutionPolicy Bypass -c "irm https://github.com/cobre-rs/cobre/releases/latest/download/cobre-cli-installer.ps1 | iex"

Supported Platforms

  • macOS (Apple Silicon): aarch64-apple-darwin
  • macOS (Intel): x86_64-apple-darwin
  • Linux (x86-64): x86_64-unknown-linux-gnu
  • Linux (ARM64): aarch64-unknown-linux-gnu
  • Windows (x86-64): x86_64-pc-windows-msvc

You can also download individual archives directly from the GitHub Releases page.

Verify the Installation

cobre version

Expected output (exact versions and arch will vary):

cobre   v0.1.0
solver: HiGHS
comm:   local
zstd:   enabled
arch:   x86_64-linux
build:  release (lto=thin)

From crates.io

cargo install cobre-cli

Requires Rust 1.85+ and build prerequisites (see Build from Source below). Installs to $CARGO_HOME/bin.


Build from Source

For contributors or unsupported platforms.

Prerequisites

  • Rust toolchain 1.85+ (stable), installed via rustup
  • C compiler (any recent GCC or Clang), required for the HiGHS LP solver
  • CMake 3.15+, required for the HiGHS build system
  • Git (any version), required for submodule initialization

Steps

# Clone the repository
git clone https://github.com/cobre-rs/cobre.git
cd cobre

# Initialize HiGHS submodule (required for the solver backend)
git submodule update --init --recursive

# Build the release binary
cargo build --release -p cobre-cli

The binary is written to target/release/cobre. Optionally install to $CARGO_HOME/bin:

cargo install --path crates/cobre-cli

Verify:

./target/release/cobre version
cargo test --workspace --all-features

Next Steps

Your First Study

This page walks through running a complete SDDP study with Cobre from a case directory you already have. It covers the three-step workflow — validate, run, report — explains what each output file contains at a high level, and points you to the right reference pages for deeper analysis.

If you do not yet have a case directory, the Quickstart tutorial shows how to scaffold one from a built-in template using cobre init --template 1dtoy.


Prepare Your Case Directory

A case directory is a self-contained folder that holds all input files for a single power system study. The minimum required structure is:

my_study/
  config.json
  penalties.json
  stages.json
  initial_conditions.json
  system/
    buses.json
    hydros.json
    thermals.json
    lines.json

config.json controls the training algorithm (number of forward passes, stopping rules) and simulation settings. stages.json defines the planning horizon, policy graph type, and time blocks. The system/ files define the physical elements of the power system. scenarios/ holds optional Parquet files with PAR(p) statistics for stochastic inflow and load generation.

For the complete schema of every file, see Case Directory Format.


Validate the Inputs

Before running the solver, validate the case directory to catch input errors early:

cobre validate /path/to/my_study

The validation pipeline runs five layers in sequence: schema correctness, cross-reference consistency (e.g., every plant references a valid bus), physical feasibility (e.g., capacity bounds are non-negative), stochastic consistency (e.g., PAR(p) statistics are well-defined), and solver feasibility (e.g., the LP is bounded). Each layer must pass before the next runs.

On success, Cobre prints entity counts and exits with code 0:

Valid case: 3 buses, 12 hydros, 8 thermals, 4 lines
  buses: 3
  hydros: 12
  thermals: 8
  lines: 4

On failure, each error is printed with an error: prefix and Cobre exits with code 1. Fix all reported errors before proceeding — cobre run runs the same validation pipeline and will exit with code 1 on any validation failure.


Run the Study

cobre run /path/to/my_study

By default, results are written to <CASE_DIR>/output/. To specify a different location:

cobre run /path/to/my_study --output /path/to/results

The run proceeds through four lifecycle stages:

  1. Load — reads all input files and runs the validation pipeline.
  2. Train — iterates the SDDP forward/backward pass loop until the configured stopping rules are satisfied (gap threshold, iteration limit, or bound stalling).
  3. Simulate — evaluates the trained policy over independent out-of-sample scenarios. Skip this stage with --skip-simulation.
  4. Write — writes all output files: Hive-partitioned Parquet for tabular results, JSON manifests, and a FlatBuffers policy checkpoint.

When stdout is a terminal, a progress bar tracks training iterations. A post-run summary is printed to stderr when all stages complete:

Training complete in 12.4s (128 iterations, converged at iter 94)
  Lower bound:  3812.6 $/stage
  Upper bound:  3836.1 +/- 14.2 $/stage
  Gap:          0.6%
  Cuts:         94 active / 94 generated
  LP solves:    4992

Simulation complete (100 scenarios)
  Completed: 100  Failed: 0

Output written to /path/to/my_study/output/

Use --quiet to suppress the banner and progress output in batch scripts, or --verbose to enable debug-level logging when diagnosing solver issues.


Inspect the Results

Use cobre report to get a machine-readable summary of a completed run without loading any Parquet files:

cobre report /path/to/my_study/output

cobre report reads the JSON manifest files written by cobre run and prints a JSON summary to stdout:

{
  "output_directory": "/path/to/my_study/output",
  "status": "complete",
  "training": { "iterations": {}, "convergence": {}, "cuts": {} },
  "simulation": { "scenarios": {} },
  "metadata": { "run_info": {}, "configuration_snapshot": {} }
}

This is suitable for piping to jq for scripted checks:

# Extract the final convergence gap
cobre report /path/to/my_study/output | jq '.training.convergence.final_gap_percent'

# Check whether convergence was achieved
cobre report /path/to/my_study/output | jq '.training.convergence.achieved'

Output file layout

The results directory contains:

output/
  policy/
    cuts/
      stage_000.bin  ...  stage_NNN.bin   # FlatBuffers Benders cuts
    basis/
      stage_000.bin  ...  stage_NNN.bin   # LP warm-start bases
    metadata.json
  training/
    _manifest.json                        # Convergence summary
    convergence.parquet                   # Per-iteration bounds and gap
  simulation/
    _manifest.json                        # Simulation summary
    costs/                                # Stage costs per scenario
    hydros/                               # Hydro dispatch per scenario
    thermals/                             # Thermal dispatch per scenario
    buses/                                # Bus balance per scenario

The key metric to check first is the convergence gap in training/_manifest.json. A gap below 1% is typically very good; 1-5% is acceptable for long-horizon planning; above 5% warrants investigation (consider increasing the iteration limit or forward pass count in config.json).

For a detailed walkthrough of every output file and how to load and analyze the Parquet data in Python or R, see Interpreting Results.


Next Steps

System Modeling

A Cobre case describes a power system as a collection of entities. Each entity represents a physical component — a bus, a generator, a transmission line — or a contractual obligation. Together, they form the complete model that the solver turns into a sequence of LP sub-problems, one per stage per scenario trajectory.

The fundamental organizing principle is simple: every generator and every load connects to a bus. A bus is an electrical node at which the power balance constraint must hold. At each stage and each load block, the LP enforces that the total power injected into a bus equals the total power withdrawn from it. When the constraint cannot be satisfied by physical generation alone, deficit slack variables absorb the gap at a penalty cost, ensuring the LP always has a feasible solution.

Entities are grouped by type and stored in a System object. The System is built from the case directory by load_case, which runs a five-layer validation pipeline before handing the model to the solver. Within the System, all entity collections are kept in canonical ID-sorted order. This ordering is an invariant: it guarantees that simulation results are bit-for-bit identical regardless of the order entities appear in the input files.


Entity Types

Cobre models seven entity types. Four are fully implemented and contribute LP variables and constraints. Three are registered stubs that appear in the entity model but do not yet contribute LP variables in the current release.

  • Bus (full, system/buses.json): electrical node with a power balance constraint per stage per block. See Network Topology.
  • Line (full, system/lines.json): transmission interconnection between two buses with flow limits and losses. See Network Topology.
  • Hydro (full, system/hydros.json): reservoir-turbine-spillway system with cascade linkage. See Hydro Plants.
  • Thermal (full, system/thermals.json): dispatchable generator with a piecewise-linear cost curve. See Thermal Units.
  • Contract (stub, system/contracts.json): energy purchase or sale obligation. Exists in the registry; no LP variables in this release.
  • Pumping Station (stub, system/pumping_stations.json): pumped-storage or water-transfer station. Exists in the registry; no LP variables in this release.
  • Non-Controllable (stub, system/non_controllable.json): variable renewable source (wind, solar, run-of-river). Exists in the registry; no LP variables in this release.

The three stub types are registered in the entity model from Phase 1 so that LP construction code can iterate over all seven types consistently. Adding LP contributions for stub entities is planned for future releases.


How Entities Connect

The network is bus-centric. Every entity that produces or consumes power is attached to a bus via a bus_id field:

   Hydro ──┐
           │ inject
  Thermal ─┤
           ├──> Bus <──── Line ────> Bus
  NCS ─────┘
                │
               load
                │
           Contract
         Pumping Station

At each stage and load block, the LP enforces the bus balance constraint:

  sum(generation at bus) + sum(imports from lines) + deficit
    = load_demand + sum(exports to lines) + excess

Deficit and excess slack variables absorb imbalance at a penalty cost, ensuring the LP is always feasible. When the deficit penalty is high enough relative to the cost of available generation, the solver will prefer to generate rather than incur deficit.
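For any candidate dispatch you can evaluate the balance equation directly. A minimal sketch with illustrative numbers (not tied to any particular case):

```python
def bus_balance_residual(generation, imports, deficit, load, exports, excess):
    """Left side minus right side of the bus balance constraint above;
    zero means the dispatch satisfies the balance at this bus."""
    lhs = sum(generation) + sum(imports) + deficit
    rhs = load + sum(exports) + excess
    return lhs - rhs

# 40 MW hydro + 15 MW thermal meeting a 55 MW load, no lines, no slack:
assert bus_balance_residual([40.0, 15.0], [], 0.0, 55.0, [], 0.0) == 0.0

# If only 40 MW of generation is available, a 15 MW deficit restores balance:
assert bus_balance_residual([40.0], [], 15.0, 55.0, [], 0.0) == 0.0
```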

Cascade topology governs hydro plant interactions. A hydro plant with a non-null downstream_id sends all of its outflow — turbined flow plus spillage — into the downstream plant’s reservoir at the same stage. The cascade forms a directed forest: multiple upstream plants may flow into a single downstream plant, but no cycles are allowed. Water balance is computed in topological order — upstream plants first, downstream plants last — in a single pass per stage.
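The upstream-first ordering can be sketched with the standard library's topological sorter (illustrative; this is not Cobre's internal traversal):

```python
from graphlib import TopologicalSorter

def cascade_order(downstream):
    """Return plant ids upstream-first, so each plant's releases are
    already known when its downstream plant's water balance is computed.
    `downstream` maps plant id -> downstream plant id (or None)."""
    ts = TopologicalSorter()
    for plant, down in downstream.items():
        ts.add(plant)            # ensure tailwater plants appear too
        if down is not None:
            ts.add(down, plant)  # plant must precede its downstream neighbor
    return list(ts.static_order())  # raises CycleError on a cyclic cascade

# Two upstream plants (0 and 1) feeding plant 2 form a directed forest:
order = cascade_order({0: 2, 1: 2, 2: None})
assert order.index(0) < order.index(2) and order.index(1) < order.index(2)
```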


Declaration-Order Invariance

The order in which entities appear in the JSON input files does not affect results. Cobre reads all entities from their files, then sorts each collection by entity ID before building the System. Every function that processes entity collections operates on this canonical sorted order.

This invariant has a practical consequence: you can rearrange entries in buses.json, hydros.json, or any other entity file without changing the simulation output. You can also add new entities with lower IDs than existing ones without disturbing results for the existing entities.


Penalties and Soft Constraints

LP solvers require feasible problems. Physical constraints — minimum outflow, minimum turbined flow, reservoir bounds — can become infeasible under extreme stochastic scenarios (very low inflow, very high load). Cobre handles this by making nearly every physical constraint soft: instead of a hard infeasibility, the solver pays a penalty cost to violate the constraint by a small amount.

Penalties are set at three levels, resolved from most specific to most general:

  1. Stage-level override — penalty files for individual stages, when present
  2. Entity-level override — a penalties block inside the entity’s JSON object
  3. Global default — the top-level penalties.json file in the case directory

This three-tier cascade gives you precise control: you can set a strict global spillage penalty and then relax it for a specific plant that is known to spill frequently in wet years. For details on the penalty fields for each entity type, see the Configuration guide and the Case Format Reference.

The bus deficit segments are the most important penalty to configure correctly. A deficit cost that is too low makes the solver prefer deficit over building generation capacity; a cost that is too high (or an unbounded segment that is absent) can cause numerical instability. The final deficit segment must always have depth_mw: null (unbounded) to guarantee LP feasibility.
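The unbounded-final-segment rule can be checked in isolation. A sketch that assumes deficit segments are a list of objects each carrying a depth_mw field, as described above; the cost values are illustrative, not Cobre defaults:

```python
def check_deficit_segments(segments):
    """The final deficit segment must be unbounded (depth_mw null/None)
    so the LP always stays feasible, per the rule above."""
    if not segments:
        return ["no deficit segments defined"]
    if segments[-1]["depth_mw"] is not None:
        return ["final deficit segment must have depth_mw: null"]
    return []

# A bounded first tier plus an unbounded final tier passes the check:
segments = [
    {"depth_mw": 100.0, "cost_per_mwh": 1000.0},
    {"depth_mw": None, "cost_per_mwh": 5000.0},
]
assert check_deficit_segments(segments) == []
```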


Entity Lifecycle

Entities can enter service or be decommissioned at specified stages using entry_stage_id and exit_stage_id fields:

  • entry_stage_id (integer or null): stage index at which the entity enters service (inclusive). null means available from stage 0.
  • exit_stage_id (integer or null): stage index at which the entity is decommissioned (inclusive). null means never decommissioned.

These fields are available on Hydro, Thermal, and Line entities. When a plant has entry_stage_id: 12, the LP does not include any variables for that plant in stages 0 through 11. From stage 12 onward, the plant appears in every sub-problem as normal.

Lifecycle fields are useful for planning studies that span commissioning or retirement events: new thermal plants coming online mid-horizon, or aging hydro units being decommissioned. Each lifecycle event is validated to ensure that entry_stage_id falls within the stage range defined in stages.json.
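In terms of the LP build, the effect of the two fields reduces to a per-stage membership test. A hypothetical sketch (not the cobre-core API), which assumes the exit stage is the first stage in which the entity is already decommissioned:

```rust
// Whether an entity's variables appear in the stage-`stage` sub-problem.
// Assumption: `entry` is the first in-service stage and `exit` is the
// first decommissioned stage. Both are illustrative interpretations.
fn in_service(stage: u32, entry: Option<u32>, exit: Option<u32>) -> bool {
    entry.map_or(true, |e| stage >= e) && exit.map_or(true, |x| stage < x)
}
```

With entry_stage_id: 12, stages 0 through 11 exclude the plant, matching the example above.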


Hydro Plants

Hydroelectric power plants are the central dispatchable resource in Cobre’s system model. Unlike thermal units, which simply convert fuel into electricity at a cost, hydro plants manage a reservoir — a state variable that persists between stages and couples the dispatch decisions of today to the feasibility of tomorrow. This intertemporal coupling is precisely why hydrothermal scheduling requires stochastic dynamic programming rather than a simple merit-order dispatch.

A hydro plant in Cobre is composed of three physical components: a reservoir that stores water between stages, a turbine that converts water flow into electrical generation, and a spillway that releases excess water without producing power. Each stage’s LP sub-problem contains one water balance constraint per plant: inflow plus beginning storage equals turbined flow plus spillage plus ending storage. The solver decides how much to turbine and how much to store, trading off present-stage generation against future-stage optionality.
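In volume terms, the constraint can be sketched as follows (illustrative, not the cobre-sddp formulation code; flows are assumed to have already been converted from m³/s to hm³ over the stage duration):

```rust
// Residual of the per-plant, per-stage water balance, all values in hm³.
// A feasible LP solution drives this residual to zero.
fn water_balance_residual(
    storage_begin: f64,
    inflow: f64,
    turbined: f64,
    spilled: f64,
    storage_end: f64,
) -> f64 {
    (storage_begin + inflow) - (turbined + spilled + storage_end)
}
```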

Plants can be linked into a cascade via the downstream_id field. When plant A has downstream_id pointing to plant B, all water released from A (turbined flow plus spillage) enters B’s reservoir at the same stage. Cascade topology is validated to be acyclic — no chain of downstream references may loop back to an earlier plant.

For a step-by-step introduction to writing hydros.json, see Building a System and Anatomy of a Case. This page provides the complete field reference with all optional fields documented.

Theory reference: For the mathematical formulation of hydro modeling and the SDDP algorithm that drives dispatch decisions, see SDDP Theory in the methodology reference.


JSON Schema

Hydro plants are defined in system/hydros.json. The top-level object has a single key "hydros" containing an array of plant objects. The following example shows all fields — required and optional — for a single plant:

{
  "hydros": [
    {
      "id": 1,
      "name": "UHE Tucuruí",
      "bus_id": 0,
      "downstream_id": null,
      "entry_stage_id": null,
      "exit_stage_id": null,
      "reservoir": {
        "min_storage_hm3": 50.0,
        "max_storage_hm3": 45000.0
      },
      "outflow": {
        "min_outflow_m3s": 1000.0,
        "max_outflow_m3s": 100000.0
      },
      "generation": {
        "model": "constant_productivity",
        "productivity_mw_per_m3s": 0.8765,
        "min_turbined_m3s": 500.0,
        "max_turbined_m3s": 22500.0,
        "min_generation_mw": 0.0,
        "max_generation_mw": 8370.0
      },
      "tailrace": {
        "type": "polynomial",
        "coefficients": [5.0, 0.001]
      },
      "hydraulic_losses": {
        "type": "factor",
        "value": 0.03
      },
      "efficiency": {
        "type": "constant",
        "value": 0.93
      },
      "evaporation_coefficients_mm": [
        80.0, 75.0, 70.0, 65.0, 60.0, 55.0, 60.0, 65.0, 70.0, 75.0, 80.0, 85.0
      ],
      "diversion": {
        "downstream_id": 2,
        "max_flow_m3s": 200.0
      },
      "filling": {
        "start_stage_id": 48,
        "filling_inflow_m3s": 100.0
      },
      "penalties": {
        "spillage_cost": 0.01,
        "diversion_cost": 0.1,
        "fpha_turbined_cost": 0.05,
        "storage_violation_below_cost": 10000.0,
        "filling_target_violation_cost": 50000.0,
        "turbined_violation_below_cost": 500.0,
        "outflow_violation_below_cost": 500.0,
        "outflow_violation_above_cost": 500.0,
        "generation_violation_below_cost": 1000.0,
        "evaporation_violation_cost": 5000.0,
        "water_withdrawal_violation_cost": 1000.0
      }
    }
  ]
}

The 1dtoy template uses a minimal hydro definition that omits all optional fields. Only id, name, bus_id, downstream_id, reservoir, outflow, and generation are required. All other top-level keys (tailrace, hydraulic_losses, efficiency, evaporation_coefficients_mm, diversion, filling, penalties) are optional and default to off when absent.


Core Fields

These fields appear at the top level of each hydro plant object.

| Field | Type | Required | Description |
|---|---|---|---|
| id | integer | Yes | Unique non-negative integer identifier. Must be unique across all hydro plants. Referenced by initial_conditions.json and by other plants via downstream_id. |
| name | string | Yes | Human-readable plant name. Used in output files, validation messages, and log output. |
| bus_id | integer | Yes | Identifier of the electrical bus to which this plant’s generation is injected. Must match an id in buses.json. |
| downstream_id | integer or null | Yes | Identifier of the plant that receives this plant’s outflow. null means the plant is at the bottom of its cascade: outflow leaves the system. |
| entry_stage_id | integer or null | No | Stage index at which the plant enters service (inclusive). null means the plant is available from stage 0. |
| exit_stage_id | integer or null | No | Stage index at which the plant is decommissioned (inclusive). null means the plant is never decommissioned. |

Reservoir

The reservoir block defines the operational storage bounds for the plant. Storage is tracked in hm³ (cubic hectometres; 1 hm³ = 10⁶ m³). The beginning-of-stage storage is the state variable that links consecutive stages in the LP.

"reservoir": {
  "min_storage_hm3": 0.0,
  "max_storage_hm3": 1000.0
}

| Field | Type | Description |
|---|---|---|
| min_storage_hm3 | number | Minimum operational storage (dead volume). Water below this level cannot reach the turbine intakes. For plants that can empty completely, use 0.0. |
| max_storage_hm3 | number | Maximum operational storage (flood control level). When the reservoir reaches this level, all excess inflow must be spilled. Must be strictly greater than min_storage_hm3. |

Setting min_storage_hm3 to the dead volume of your reservoir is important for correctly computing the usable storage range. A reservoir with 500 hm³ total physical capacity but 100 hm³ below the turbine intakes should be modeled as min_storage_hm3: 100.0, max_storage_hm3: 500.0.


Outflow Constraints

The outflow block constrains total outflow from the plant. Total outflow equals turbined flow plus spillage. These constraints are enforced by soft penalties when they cannot be satisfied due to extreme scenario conditions.

"outflow": {
  "min_outflow_m3s": 0.0,
  "max_outflow_m3s": 50.0
}

| Field | Type | Description |
|---|---|---|
| min_outflow_m3s | number | Minimum total outflow required at all times [m³/s]. Set to the ecological flow requirement or minimum riparian right. Use 0.0 if there is no minimum requirement. |
| max_outflow_m3s | number or null | Maximum total outflow [m³/s]. Models the physical capacity of the river channel below the dam. null means no upper bound on outflow. |

Minimum outflow is a lower bound on the sum of turbined flow and spillage, enforced as a soft constraint. When the solver cannot meet this bound (for example, because the reservoir is nearly empty and inflow is very low), a violation slack variable is added to the LP at the cost specified by outflow_violation_below_cost in the penalties block.


Generation Models

The generation block configures the turbine model (internally stored as the generation_model field on the Hydro struct). All variants share the core turbine bounds (min_turbined_m3s, max_turbined_m3s) and generation bounds (min_generation_mw, max_generation_mw). The model key selects which production function converts flow to power.

"generation": {
  "model": "constant_productivity",
  "productivity_mw_per_m3s": 1.0,
  "min_turbined_m3s": 0.0,
  "max_turbined_m3s": 50.0,
  "min_generation_mw": 0.0,
  "max_generation_mw": 50.0
}

| Field | Type | Description |
|---|---|---|
| model | string | Production function variant. See the model table below. |
| productivity_mw_per_m3s | number | Power output per unit of turbined flow [MW/(m³/s)]. Used by constant_productivity and linearized_head. |
| min_turbined_m3s | number | Minimum turbined flow [m³/s]. Non-zero values model a minimum stable turbine operation. |
| max_turbined_m3s | number | Maximum turbined flow (installed turbine capacity) [m³/s]. |
| min_generation_mw | number | Minimum electrical generation [MW]. |
| max_generation_mw | number | Maximum electrical generation (installed capacity) [MW]. |

Available Production Function Models

| Model | model value | Status | Description |
|---|---|---|---|
| Constant productivity | "constant_productivity" | Available | power = productivity * turbined_flow. Independent of reservoir head. The only model supported in the current release. |
| Linearized head | "linearized_head" | Not yet available | Head-dependent productivity linearized around an operating point at each stage. Will be documented when released. |
| FPHA | "fpha" | Not yet available | Full production function with head-area-productivity tables. Requires forebay and tailrace elevation tables. Will be documented when released. |

For the 1dtoy example and for most initial studies, constant_productivity is the correct choice. The productivity_mw_per_m3s factor encodes the plant’s average efficiency and net head. For a plant with 80 m net head and 90% efficiency, the theoretical productivity is approximately 9.81 * 80 * 0.90 / 1000 ≈ 0.706 MW/(m³/s).
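The back-of-envelope figure is the specific weight of water (9.81 kN/m³) times net head times efficiency, scaled from kW to MW. A hypothetical helper, not part of the Cobre API:

```rust
// Theoretical constant productivity [MW per m³/s]:
// 9.81 kN/m³ (specific weight of water) * net head [m] * efficiency
// gives kW per m³/s; divide by 1000 to convert to MW.
fn productivity_mw_per_m3s(net_head_m: f64, efficiency: f64) -> f64 {
    9.81 * net_head_m * efficiency / 1000.0
}
```

productivity_mw_per_m3s(80.0, 0.90) reproduces the ≈ 0.706 MW/(m³/s) figure above.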


Cascade Topology

The downstream_id field creates a directed chain of hydro plants. Water released from an upstream plant — whether turbined or spilled — enters the downstream plant’s reservoir in the same stage.

To model a three-plant cascade where plant 0 flows into plant 1, which flows into plant 2:

{ "id": 0, "downstream_id": 1, ... }
{ "id": 1, "downstream_id": 2, ... }
{ "id": 2, "downstream_id": null, ... }

Cobre validates that the downstream graph is acyclic: no chain of downstream_id references may return to a plant already in the chain. A cycle would make the water balance equation unsolvable. The validator reports the cycle as a topology error with the full chain of plant IDs.

Plants with downstream_id: null are tailwater plants: their outflow leaves the basin. Each connected component of the cascade graph must have exactly one tailwater plant (the chain’s end node). A component with no tailwater plant necessarily contains a cycle, which the validator rejects.
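The acyclicity check can be sketched as a chain walk over the downstream links (illustrative, not the actual cobre-core validator):

```rust
use std::collections::HashMap;

// Follow each plant's downstream chain; revisiting a plant already on the
// current chain means a cycle, reported as the chain of plant IDs.
fn find_cycle(downstream: &HashMap<u32, Option<u32>>) -> Option<Vec<u32>> {
    for &start in downstream.keys() {
        let mut chain = vec![start];
        let mut cur = start;
        while let Some(&Some(next)) = downstream.get(&cur) {
            if chain.contains(&next) {
                chain.push(next);
                return Some(chain); // e.g. [0, 1, 0]
            }
            chain.push(next);
            cur = next;
        }
    }
    None
}
```

The three-plant cascade above yields None; pointing plant 2 back at plant 0 would return the offending chain of IDs.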


Advanced Fields

The following fields enable higher-fidelity physical modeling. They are all optional. For most system planning studies, these fields can be omitted; they become relevant when calibrating a model against historical dispatch data or when the head variation at a plant is significant.

Tailrace Model

The tailrace block models the downstream water level as a function of total outflow. The tailrace elevation affects the net hydraulic head and is used by the linearized_head and fpha generation models. When absent, tailrace elevation is treated as zero.

Two variants are supported:

Polynomial — height = a₀ + a₁·Q + a₂·Q² + …

"tailrace": {
  "type": "polynomial",
  "coefficients": [5.0, 0.001]
}

coefficients is an array of polynomial coefficients in ascending power order. coefficients[0] is the constant term (height at zero outflow in metres), coefficients[1] is the coefficient for Q¹, and so on.

Piecewise — linearly interpolated between (outflow, height) breakpoints.

"tailrace": {
  "type": "piecewise",
  "points": [
    { "outflow_m3s": 0.0, "height_m": 3.0 },
    { "outflow_m3s": 5000.0, "height_m": 4.5 },
    { "outflow_m3s": 15000.0, "height_m": 6.2 }
  ]
}

Points must be sorted in ascending outflow_m3s order. The solver interpolates linearly between adjacent points.
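Both variants reduce to a one-line evaluation. A sketch under the definitions above (hypothetical helpers, not Cobre's internal code; clamping outside the breakpoint range is an assumption, since out-of-range behaviour is not specified here):

```rust
// Polynomial tailrace: coefficients in ascending power order, evaluated
// in Horner form.
fn tailrace_polynomial(coeffs: &[f64], outflow_m3s: f64) -> f64 {
    coeffs.iter().rev().fold(0.0, |acc, &c| acc * outflow_m3s + c)
}

// Piecewise tailrace: linear interpolation between sorted
// (outflow, height) breakpoints, clamped to the end heights outside
// the breakpoint range.
fn tailrace_piecewise(points: &[(f64, f64)], outflow_m3s: f64) -> f64 {
    if outflow_m3s <= points[0].0 {
        return points[0].1;
    }
    for w in points.windows(2) {
        if outflow_m3s <= w[1].0 {
            let t = (outflow_m3s - w[0].0) / (w[1].0 - w[0].0);
            return w[0].1 + t * (w[1].1 - w[0].1);
        }
    }
    points.last().unwrap().1
}
```

With the polynomial example above, an outflow of 1000 m³/s gives 5.0 + 0.001·1000 = 6.0 m.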

Hydraulic Losses

The hydraulic_losses block models head loss in the penstock and draft tube. Hydraulic losses reduce the effective head available at the turbine. When absent, the penstock is modeled as lossless.

Factor — loss as a fraction of net head:

"hydraulic_losses": { "type": "factor", "value": 0.03 }

value is a dimensionless fraction (e.g., 0.03 = 3% of net head).

Constant — fixed head loss regardless of flow:

"hydraulic_losses": { "type": "constant", "value_m": 2.5 }

value_m is the fixed head loss in metres.

Efficiency Model

The efficiency block scales the power output from the hydraulic power available. When absent, 100% efficiency is assumed.

Currently only the "constant" variant is supported:

"efficiency": { "type": "constant", "value": 0.93 }

value is a dimensionless fraction in the range (0, 1]. A value of 0.93 means the turbine converts 93% of available hydraulic power to electrical output.

Evaporation Coefficients

The evaporation_coefficients_mm field models water loss from the reservoir surface due to evaporation. When present, it must be an array of exactly 12 values, one per calendar month:

"evaporation_coefficients_mm": [
  80.0, 75.0, 70.0, 65.0, 60.0, 55.0,
  60.0, 65.0, 70.0, 75.0, 80.0, 85.0
]

Index 0 is January, index 11 is December. Values are in mm/month. The evaporated volume is computed from the surface area of the reservoir at each stage. When absent, no evaporation is modeled.
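As a sanity check on the units: 1 mm of evaporation over 1 km² of surface area is 10³ m³, i.e. 10⁻³ hm³. A hypothetical conversion helper (Cobre derives the surface area internally):

```rust
// Evaporated volume [hm³] for one month, given the monthly coefficient
// [mm] and the reservoir surface area [km²]. Illustrative only.
fn evaporated_volume_hm3(coeff_mm: f64, surface_area_km2: f64) -> f64 {
    coeff_mm * surface_area_km2 / 1000.0
}
```

A January coefficient of 80 mm over a 2500 km² surface evaporates 200 hm³ in that month.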

Diversion Channel

The diversion block models a water diversion channel that routes flow directly from this plant’s reservoir to a downstream plant’s reservoir, bypassing turbines and spillways. When absent, no diversion is modeled.

"diversion": {
  "downstream_id": 2,
  "max_flow_m3s": 200.0
}

| Field | Description |
|---|---|
| downstream_id | Identifier of the plant whose reservoir receives the diverted flow. |
| max_flow_m3s | Maximum diversion flow capacity [m³/s]. |

Filling Configuration

The filling block enables a filling operation mode, where the reservoir is intentionally filled from an external, fixed inflow source (such as a diversion works from an unrelated basin) during a defined stage window. When absent, no filling operation is active.

"filling": {
  "start_stage_id": 48,
  "filling_inflow_m3s": 100.0
}

| Field | Description |
|---|---|
| start_stage_id | Stage index at which filling begins (inclusive). |
| filling_inflow_m3s | Constant inflow applied to the reservoir during filling [m³/s]. |

Penalties

The penalties block inside a hydro plant definition overrides the global defaults from penalties.json for that specific plant. When the block is absent, all penalty values fall back to the global defaults. When it is present, it must contain all 11 fields.

Penalty costs are added to the LP objective when soft constraint violations occur. They do not represent physical costs — they are optimization weights that guide the solver to avoid infeasible or undesirable operating states.

"penalties": {
  "spillage_cost": 0.01,
  "diversion_cost": 0.1,
  "fpha_turbined_cost": 0.05,
  "storage_violation_below_cost": 10000.0,
  "filling_target_violation_cost": 50000.0,
  "turbined_violation_below_cost": 500.0,
  "outflow_violation_below_cost": 500.0,
  "outflow_violation_above_cost": 500.0,
  "generation_violation_below_cost": 1000.0,
  "evaporation_violation_cost": 5000.0,
  "water_withdrawal_violation_cost": 1000.0
}

| Field | Unit | Description |
|---|---|---|
| spillage_cost | $/m³/s | Penalty per m³/s of water spilled. Setting this low (e.g., 0.01) makes spillage the least-cost way to relieve a flood situation. Setting it high penalizes wasted water in water-scarce scenarios. |
| diversion_cost | $/m³/s | Penalty per m³/s of diverted flow exceeding the diversion channel capacity. |
| fpha_turbined_cost | $/MWh | Penalty per MWh of turbined generation in the FPHA approximation. Not used by constant_productivity. |
| storage_violation_below_cost | $/hm³ | Penalty per hm³ of storage below min_storage_hm3. Should be set high (thousands) to make violations a last resort. |
| filling_target_violation_cost | $/hm³ | Penalty per hm³ of storage below the filling target. Only active when a filling block is present. |
| turbined_violation_below_cost | $/m³/s | Penalty per m³/s of turbined flow below min_turbined_m3s. |
| outflow_violation_below_cost | $/m³/s | Penalty per m³/s of total outflow below min_outflow_m3s. Set high to enforce ecological flow requirements. |
| outflow_violation_above_cost | $/m³/s | Penalty per m³/s of total outflow above max_outflow_m3s. Set high to enforce flood channel capacity limits. |
| generation_violation_below_cost | $/MW | Penalty per MW of generation below min_generation_mw. |
| evaporation_violation_cost | $/mm | Penalty per mm of evaporation constraint violation. Only active when evaporation_coefficients_mm is present. |
| water_withdrawal_violation_cost | $/m³/s | Penalty per m³/s of water withdrawal constraint violation. |

Three-Tier Resolution Cascade

Penalty values are resolved from the most specific to the most general source:

  1. Stage-level override (defined in stage-specific penalty files, when present)
  2. Entity-level override (the penalties block inside the plant’s JSON object)
  3. Global default (the hydro section of penalties.json)

The penalties block on a plant replaces the global default for that plant alone. All plants that do not have a penalties block use the global values from penalties.json. The global penalties.json file must always be present and must contain all 11 hydro penalty fields.


Validation Rules

Cobre’s five-layer validation pipeline checks the following conditions on hydro plants. Violations are reported as error messages with the failing plant’s id and the nature of the problem.

| Rule | Error Class | Description |
|---|---|---|
| Bus reference integrity | Reference error | Every bus_id must match an id in buses.json. |
| Downstream reference integrity | Reference error | Every non-null downstream_id must match an id in hydros.json. |
| Cascade acyclicity | Topology error | The directed graph of downstream_id links must be acyclic. |
| Storage bounds ordering | Physical feasibility | min_storage_hm3 must be less than max_storage_hm3. |
| Outflow bounds ordering | Physical feasibility | When max_outflow_m3s is present, it must be greater than or equal to min_outflow_m3s. |
| Turbine bounds ordering | Physical feasibility | min_turbined_m3s must be less than or equal to max_turbined_m3s. |
| Generation bounds consistency | Physical feasibility | min_generation_mw must be less than or equal to max_generation_mw. |
| Initial conditions completeness | Reference error | Every hydro plant must have exactly one entry in initial_conditions.json (either in storage or filling_storage, not both). |
| Evaporation array length | Schema error | When evaporation_coefficients_mm is present, it must have exactly 12 values. |

Thermal Units

Thermal power plants are the dispatchable generation assets that complement hydro in Cobre’s system model. The term “thermal” covers any generator whose output is bounded by installed capacity and whose dispatch incurs an explicit cost per MWh: combustion turbines, combined-cycle plants, coal-fired units, nuclear plants, and diesel generators all map onto the same Cobre Thermal entity type.

Unlike hydro plants, thermal units carry no state between stages. Each stage’s LP sub-problem treats a thermal unit as a simple bounded generation variable with a marginal cost. The solver dispatches thermal units in merit order — from cheapest to most expensive — to meet any residual demand not covered by hydro generation. In a hydrothermal system, the long-run value of stored water is compared against the short-run cost of thermal dispatch at each stage, which is the fundamental trade-off the SDDP algorithm optimizes.

The cost structure of a thermal unit is modeled with a piecewise-linear cost curve (cost_segments). A single-segment plant dispatches all its capacity at a flat cost. A multi-segment plant has increasing marginal costs at higher output levels, reflecting the physical reality that a plant becomes less fuel-efficient as it approaches its rated capacity.

For an introductory walkthrough of writing thermals.json, see Building a System and Anatomy of a Case. This page provides the complete field reference, including multi-segment cost curves and GNL configuration.


JSON Schema

Thermal units are defined in system/thermals.json. The top-level object has a single key "thermals" containing an array of unit objects. The following example shows all fields for a two-segment plant with GNL configuration:

{
  "thermals": [
    {
      "id": 0,
      "name": "UTE1",
      "bus_id": 0,
      "cost_segments": [
        {
          "capacity_mw": 15.0,
          "cost_per_mwh": 5.0
        }
      ],
      "generation": {
        "min_mw": 0.0,
        "max_mw": 15.0
      }
    },
    {
      "id": 1,
      "name": "Angra 1",
      "bus_id": 0,
      "entry_stage_id": null,
      "exit_stage_id": null,
      "cost_segments": [
        {
          "capacity_mw": 300.0,
          "cost_per_mwh": 50.0
        },
        {
          "capacity_mw": 357.0,
          "cost_per_mwh": 80.0
        }
      ],
      "generation": {
        "min_mw": 0.0,
        "max_mw": 657.0
      },
      "gnl_config": {
        "lag_stages": 2
      }
    }
  ]
}

The first plant (UTE1) matches the 1dtoy template format: a single cost segment with no optional fields. The second plant (Angra 1) shows the complete schema with a two-segment cost curve and GNL dispatch anticipation. The fields entry_stage_id, exit_stage_id, and gnl_config are optional and can be omitted.


Core Fields

These fields appear at the top level of each thermal unit object.

| Field | Type | Required | Description |
|---|---|---|---|
| id | integer | Yes | Unique non-negative integer identifier. Must be unique across all thermal units. |
| name | string | Yes | Human-readable plant name. Used in output files, validation messages, and log output. |
| bus_id | integer | Yes | Identifier of the electrical bus to which this unit’s generation is injected. Must match an id in buses.json. |
| entry_stage_id | integer or null | No | Stage index at which the unit enters service (inclusive). null means the unit is available from stage 0. |
| exit_stage_id | integer or null | No | Stage index at which the unit is decommissioned (inclusive). null means the unit is never decommissioned. |

Generation Bounds

The generation block sets the output limits for the unit (stored internally as min_generation_mw and max_generation_mw on the Thermal struct). These are enforced as hard bounds on the generation variable in each stage LP.

"generation": {
  "min_mw": 0.0,
  "max_mw": 657.0
}

| Field | Type | Description |
|---|---|---|
| min_mw | number | Minimum electrical generation (minimum stable load) [MW]. A non-zero value represents a must-run commitment: the solver is required to dispatch at least this much generation whenever the unit is in service. |
| max_mw | number | Maximum electrical generation (installed capacity) [MW]. Must equal the sum of all capacity_mw values in cost_segments. |

A min_mw of 0.0 means the unit can be turned off completely — it is treated as an interruptible resource. A non-zero min_mw (for example, 100.0 for a plant whose turbine must spin continuously for mechanical reasons) means the LP must always dispatch at least that amount whenever the plant is active.

The max_mw field caps total generation and must equal the sum of all segment capacities in cost_segments. The validator checks this constraint and reports an error if the values do not match.


Cost Segments

The cost_segments array defines the piecewise-linear generation cost curve. Each segment represents a range of generation capacity and its associated marginal cost. Segments are applied cumulatively: the first segment’s capacity_mw MW of output is priced at the first segment’s cost, the next segment’s capacity_mw MW at the second segment’s cost, and so on.

"cost_segments": [
  {
    "capacity_mw": 300.0,
    "cost_per_mwh": 50.0
  },
  {
    "capacity_mw": 357.0,
    "cost_per_mwh": 80.0
  }
]

| Field | Type | Description |
|---|---|---|
| capacity_mw | number | Generation capacity of this segment [MW]. Must be positive. |
| cost_per_mwh | number | Marginal cost in this segment [$/MWh]. |

Single-Segment Plants

Most thermal units in planning studies use a single cost segment, which treats the entire capacity as available at a uniform marginal cost:

"cost_segments": [
  { "capacity_mw": 15.0, "cost_per_mwh": 5.0 }
]

The LP will dispatch this plant at any level between min_mw and max_mw, with the generation cost equal to dispatched_mw * hours_in_block * cost_per_mwh.

Multi-Segment Plants

A multi-segment curve models a plant whose heat rate increases at higher output, which is common for steam turbines and combined-cycle units. For example, a 657 MW plant that is efficient at partial load but increasingly expensive above 300 MW:

"cost_segments": [
  { "capacity_mw": 300.0, "cost_per_mwh": 50.0 },
  { "capacity_mw": 357.0, "cost_per_mwh": 80.0 }
]

The LP sees this as two separate generation variables that are constrained to be dispatched in order: the cheaper 300 MW segment fills first before the solver uses any of the 357 MW higher-cost segment. The total capacity is 300 + 357 = 657 MW, which must equal generation.max_mw.

By convention, segments are listed in ascending cost order; the LP’s cost minimization produces the merit-order dispatch regardless of the ordering in the file.
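Outside the LP, the cumulative fill can be sketched as a simple merit-order walk (hypothetical helper, not Cobre's formulation):

```rust
// Cost rate [$/h] of dispatching `target_mw` across a piecewise-linear
// curve of (capacity_mw, cost_per_mwh) segments, cheapest first.
// Illustrative only.
fn dispatch_cost_per_hour(segments: &[(f64, f64)], target_mw: f64) -> f64 {
    let mut sorted = segments.to_vec();
    sorted.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    let mut remaining = target_mw;
    let mut cost = 0.0;
    for (capacity, cost_per_mwh) in sorted {
        let used = remaining.min(capacity);
        cost += used * cost_per_mwh;
        remaining -= used;
        if remaining <= 0.0 {
            break;
        }
    }
    cost
}
```

Dispatching 400 MW of the Angra 1 curve above fills the 300 MW segment at 50 $/MWh and takes 100 MW of the second segment at 80 $/MWh.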

Capacity Sum Constraint

The sum of all capacity_mw values must equal generation.max_mw. This is validated by Cobre and reported as a physical feasibility error if violated:

physical error: thermal 1 cost_segments capacity sum (657.0 MW) does not match
  max_generation_mw (700.0 MW)

GNL Configuration

The optional gnl_config block enables GNL (Gás Natural Liquefeito, or liquefied natural gas) dispatch anticipation. This models thermal units that require advance scheduling over multiple stages due to commitment lead times — for example, an LNG-fired plant that must be booked several weeks before the dispatch occurs.

"gnl_config": {
  "lag_stages": 2
}

| Field | Type | Description |
|---|---|---|
| lag_stages | integer | Number of stages of dispatch anticipation. A value of 2 means the generation commitment for stage t must be decided at stage t - 2. |

When lag_stages is greater than zero, the LP structure couples the commitment decision at an earlier stage to the dispatch variable at a later stage. This is an advanced feature for detailed operational planning studies. For most long-term planning horizons where monthly stages are used and commitment detail is not the focus, the gnl_config field can be omitted.

When the gnl_config block is absent, there is no dispatch anticipation lag — the unit can be committed and dispatched independently in each stage’s LP.


Validation Rules

Cobre’s five-layer validation pipeline checks the following conditions on thermal units. Violations are reported as error messages with the failing unit’s id.

| Rule | Error Class | Description |
|---|---|---|
| Bus reference integrity | Reference error | Every bus_id must match an id in buses.json. |
| Cost segment capacity sum | Physical feasibility | The sum of all capacity_mw values in cost_segments must equal max_mw in the generation block. |
| Generation bounds ordering | Physical feasibility | min_mw must be less than or equal to max_mw. |
| Non-empty cost segments | Schema error | The cost_segments array must contain at least one segment. |
| Positive segment capacity | Physical feasibility | Each segment’s capacity_mw must be strictly positive. |
| GNL lag validity | Physical feasibility | When gnl_config is present, lag_stages must be a non-negative integer. |

Network Topology

The electrical network in Cobre describes how generators and loads are connected and how power can move between regions. At the heart of the network model is the bus: a named node at which power balance must be maintained every stage and every load block. Generators inject power into buses; loads withdraw power from buses; transmission lines transfer power between buses.

The simplest possible model is a single-bus (copper-plate) system: one bus that aggregates all generation and all load into a single node. In a copper-plate model there are no flow limits, no transmission losses, and no geographical differentiation in price or dispatch. The 1dtoy template uses a single-bus configuration. This is the right starting point for system-level capacity planning studies where the internal transmission network is not the focus.

A multi-bus system introduces two or more buses connected by transmission lines. Lines impose flow limits between buses. When a line’s capacity is binding, each bus has its own locational marginal price, and the dispatch in one region cannot freely substitute for a deficit in another. Multi-bus models are appropriate when regional subsystems have constrained interconnections that influence dispatch, investment decisions, or price formation.


Buses

Every generator and every load must be attached to a bus. Buses are defined in system/buses.json under a top-level "buses" array.

JSON Schema

{
  "buses": [
    {
      "id": 0,
      "name": "SIN",
      "deficit_segments": [
        {
          "depth_mw": null,
          "cost": 1000.0
        }
      ]
    }
  ]
}

This is the complete buses.json from the 1dtoy example: one bus with a single unbounded deficit segment at 1000 $/MWh. The excess_cost field is optional and comes from the global penalties.json when not specified per-bus.

Core Fields

| Field | Type | Required | Description |
|---|---|---|---|
| id | integer | Yes | Unique non-negative integer identifier. Must be unique across all buses. |
| name | string | Yes | Human-readable bus name. Used in output files, validation messages, and log output. |
| deficit_segments | array | No | Piecewise-linear deficit cost curve. Overrides the global defaults from penalties.json for this bus. See Deficit Modeling. |
| excess_cost | number | No | Penalty per MWh of surplus generation absorbed by this bus ($/MWh). Overrides the global default from penalties.json. |

Bus Balance Constraint

For every bus b, every stage t, and every load block k, the LP enforces:

  generation_injected(b, t, k)
  + imports_from_lines(b, t, k)
  + deficit(b, t, k)
  = load_demand(b, t, k)
  + exports_to_lines(b, t, k)
  + excess(b, t, k)

deficit and excess are non-negative slack variables added to the LP objective at their respective penalty costs. The deficit slack makes the problem feasible when there is not enough generation to meet demand. The excess slack absorbs surplus generation when more power is produced than can be consumed or transmitted away.


Deficit Modeling

Deficit represents unserved load — demand that the solver cannot cover with available generation. The deficit cost is the Value of Lost Load (VoLL) from the solver’s perspective: the penalty the LP pays per MWh of unserved demand.

Deficit Segments

Rather than a single flat VoLL, Cobre models deficit costs as a piecewise-linear curve: a sequence of segments with increasing costs. The segments are cumulative. The first segment covers the first depth_mw MW of deficit at the lowest cost, the second segment covers the next depth_mw MW at a higher cost, and so on.

"deficit_segments": [
  { "depth_mw": 500.0, "cost": 1000.0 },
  { "depth_mw": null,  "cost": 5000.0 }
]

In this two-segment example, the first 500 MW of deficit costs 1000 $/MWh. Any deficit above 500 MW costs 5000 $/MWh. The final segment must have depth_mw: null (unbounded), which guarantees the LP can always find a feasible solution regardless of the generation shortfall.

| Field | Type | Description |
|---|---|---|
| depth_mw | number or null | MW of deficit covered by this segment. null for the final unbounded segment. |
| cost | number | Penalty cost per MWh of deficit in this segment [$/MWh]. Must be positive. Segments should be in ascending cost order. |
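The cumulative walk through the segments can be sketched as follows (illustrative helper, not the LP formulation; a None depth is the unbounded tail):

```rust
// Deficit cost rate [$/h] for a given shortfall, walked through
// cumulative (depth_mw, cost) segments. Illustrative only.
fn deficit_cost(segments: &[(Option<f64>, f64)], shortfall_mw: f64) -> f64 {
    let mut remaining = shortfall_mw;
    let mut cost = 0.0;
    for &(depth_mw, cost_per_mwh) in segments {
        let used = match depth_mw {
            Some(d) => remaining.min(d),
            None => remaining, // unbounded final segment absorbs the rest
        };
        cost += used * cost_per_mwh;
        remaining -= used;
        if remaining <= 0.0 {
            break;
        }
    }
    cost
}
```

With the two-segment example above, a 700 MW shortfall costs 500·1000 + 200·5000 = 1,500,000 $/h.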

Three-Tier Penalty Resolution

Deficit segment values are resolved from the most specific to the most general source:

  1. Stage-level override — penalty files for individual stages, when present
  2. Bus-level override — the deficit_segments array inside the bus’s JSON object
  3. Global default — the bus.deficit_segments section of penalties.json

When deficit_segments is omitted from a bus definition, Cobre uses the global default from penalties.json. This makes it easy to set a system-wide VoLL and then override it for specific buses with different reliability requirements.

Choosing Deficit Costs

A typical two-tier configuration uses a moderate cost for the first tier (to allow partial deficit in extreme scenarios without distorting the optimality cuts too much) and a very high cost for the unbounded final tier (to make full deficit a last resort). Values of 1000–5000 $/MWh for the first tier and 5000–20000 $/MWh for the final tier are common in practice.

Setting the deficit cost too low relative to thermal generation costs will cause the solver to prefer deficit over building reserves, which misrepresents the cost of unserved energy. Setting it too high can cause numerical conditioning issues in the LP; in practice, values above 100 000 $/MWh are rarely necessary.


Lines

Transmission lines connect pairs of buses and impose flow limits on power transfer between them. Lines are defined in system/lines.json under a top-level "lines" array. A single-bus system has an empty lines array.

JSON Schema

The following example shows a two-bus system with a single connecting line:

{
  "lines": [
    {
      "id": 0,
      "name": "North-South Interconnection",
      "source_bus_id": 0,
      "target_bus_id": 1,
      "entry_stage_id": null,
      "exit_stage_id": null,
      "direct_capacity_mw": 1000.0,
      "reverse_capacity_mw": 800.0,
      "losses_percent": 2.5,
      "exchange_cost": 1.0
    }
  ]
}

This line allows up to 1000 MW to flow from bus 0 to bus 1, and up to 800 MW in the reverse direction. A 2.5% transmission loss is applied to all flow. The exchange_cost is a regularization penalty, not a physical cost.

Core Fields

| Field | Type | Required | Description |
|---|---|---|---|
| id | integer | Yes | Unique non-negative integer identifier. Must be unique across all lines. |
| name | string | Yes | Human-readable line name. Used in output files, validation messages, and log output. |
| source_bus_id | integer | Yes | Bus ID at the source end. Defines the "direct" flow direction. Must match an id in buses.json. |
| target_bus_id | integer | Yes | Bus ID at the target end. Must match an id in buses.json. Must differ from source_bus_id. |
| entry_stage_id | integer or null | No | Stage at which the line enters service (inclusive). null means available from stage 0. |
| exit_stage_id | integer or null | No | Stage at which the line is decommissioned (inclusive). null means never decommissioned. |
| direct_capacity_mw | number | Yes | Maximum flow from source to target [MW]. Hard upper bound on the flow variable. |
| reverse_capacity_mw | number | Yes | Maximum flow from target to source [MW]. Hard upper bound on the reverse flow variable. |
| losses_percent | number | Yes | Transmission losses as a percentage of transmitted power (e.g., 2.5 means 2.5%). Set to 0.0 for lossless transfer. |
| exchange_cost | number | Yes | Regularization penalty per MWh of flow [$/MWh]. Not a physical cost — see note below. |

Exchange Cost Note

The exchange_cost is not a tariff or a physical transmission cost — it is a regularization penalty added to the LP objective to give the solver a strict preference between equivalent dispatch solutions. Without any exchange cost, the solver is indifferent between using or not using a lossless, uncongested line, which can cause oscillations between equivalent solutions across iterations.

A small exchange cost (0.5–2.0 $/MWh) breaks this degeneracy without meaningfully distorting the economic dispatch. The global default is set in penalties.json under line.exchange_cost. Per-line overrides are not currently supported; the global value applies to all lines.


Transmission Losses

When losses_percent is non-zero, the power arriving at the target bus is less than the power leaving the source bus. If bus A sends F MW to bus B over a line with 2.5% losses, then:

  • Bus A’s balance sees an outflow of F MW
  • Bus B’s balance sees an inflow of F * (1 - 0.025) = 0.975 * F MW

The lost power (0.025 * F MW) does not appear anywhere in the network — it represents heat dissipated in the conductor. From the LP’s perspective, losses increase the effective cost of transferring power: the source bus must generate more to deliver the same amount at the target bus.

Setting losses_percent: 0.0 models a lossless (superconductive) connection. This is appropriate for short, high-voltage DC links or for cases where transmission losses are not a modeling concern.
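
The arithmetic is a one-liner; the sketch below (not Cobre code) just restates the bullet points:

```python
def delivered_mw(flow_mw, losses_percent):
    # Power arriving at the target bus after conductor losses.
    return flow_mw * (1.0 - losses_percent / 100.0)

# Sending 400 MW over a 2.5%-loss line delivers roughly 390 MW.
print(delivered_mw(400.0, 2.5))
```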


Single-Bus vs Multi-Bus

When to use a single-bus model

A single bus (copper-plate) is appropriate when:

  • You are building an initial case and want to isolate dispatch economics from network effects
  • Transmission constraints are not binding in the scenarios you are studying
  • The system is geographically compact with ample interconnection capacity
  • You are validating the stochastic model before adding network complexity

The 1dtoy template is a single-bus case. All generators and loads connect to bus 0 (SIN), and lines.json contains an empty array.

When to use a multi-bus model

A multi-bus model is appropriate when:

  • Different regions have distinct generation mixes and load profiles
  • Transmission capacity is a binding constraint that affects dispatch or pricing
  • You need locational marginal prices for investment decisions or contract pricing
  • You are modeling a system where curtailment of cheap generation (wind in one region, hydro in another) is caused by transmission congestion

Adding a second bus

To extend the 1dtoy template to two buses, add a second bus to buses.json:

{
  "buses": [
    { "id": 0, "name": "North" },
    { "id": 1, "name": "South" }
  ]
}

Then add a line to lines.json:

{
  "lines": [
    {
      "id": 0,
      "name": "North-South",
      "source_bus_id": 0,
      "target_bus_id": 1,
      "direct_capacity_mw": 500.0,
      "reverse_capacity_mw": 500.0,
      "losses_percent": 1.0,
      "exchange_cost": 1.0
    }
  ]
}

Assign each generator and load to the appropriate bus by setting its bus_id. When you run cobre validate, the validator will confirm that all bus_id references resolve to existing buses.


Validation Rules

Cobre’s five-layer validation pipeline checks the following conditions for buses and lines. Violations are reported as error messages with the failing entity’s id.

| Rule | Error Class | Description |
|---|---|---|
| Bus reference integrity | Reference error | Every bus_id on any entity (hydro, thermal, contract, line, etc.) must match an id in buses.json. |
| Line source bus existence | Reference error | source_bus_id on each line must match an id in buses.json. |
| Line target bus existence | Reference error | target_bus_id on each line must match an id in buses.json. |
| No self-loops | Physical feasibility | source_bus_id and target_bus_id must differ on every line. A line from a bus to itself is not meaningful. |
| Deficit segment ordering | Physical feasibility | Deficit segments must be listed with ascending costs. The final segment must have depth_mw: null. |
| Unbounded final segment | Physical feasibility | The last entry in every deficit_segments array must have depth_mw: null to guarantee LP feasibility. |
| Non-negative capacity | Physical feasibility | direct_capacity_mw and reverse_capacity_mw must be non-negative. |
| Non-negative losses | Physical feasibility | losses_percent must be in the range [0, 100). |

When a bus ID referenced by a generator does not exist in buses.json, the validator reports the error as:

reference error: thermal 2 references bus 99 which does not exist

Fix the bus_id or add the missing bus and re-run cobre validate until the exit code is 0.
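
The reference check itself is simple set membership. A hypothetical recreation of the thermal-to-bus rule (message format copied from the example above; not Cobre's actual validator code):

```python
def check_bus_references(buses, thermals):
    # Every thermal's bus_id must resolve to a declared bus id.
    bus_ids = {bus["id"] for bus in buses}
    return [
        f"reference error: thermal {t['id']} references bus {t['bus_id']} which does not exist"
        for t in thermals
        if t["bus_id"] not in bus_ids
    ]

errors = check_bus_references(
    buses=[{"id": 0}, {"id": 1}],
    thermals=[{"id": 2, "bus_id": 99}],
)
print(errors[0])
```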


Stochastic Modeling

Hydrothermal dispatch is inherently uncertain. Reservoir inflows depend on rainfall and snowmelt that cannot be known in advance, and electrical load varies in ways that are predictable in aggregate but noisy at any given moment. A dispatch policy that ignores uncertainty will systematically under-prepare for dry periods and over-commit thermal capacity in wet years.

Cobre addresses this by treating inflows and loads as stochastic processes. During training, the solver samples many scenario trajectories and builds a policy that performs well across the distribution of possible futures — not just for a single forecast. The stochastic layer is responsible for generating those scenario trajectories in a statistically sound, reproducible way.

The stochastic models are driven by historical statistics provided by the user in the scenarios/ directory of the case. If no scenarios/ directory is present, Cobre falls back to white-noise generation using only the stage definitions in stages.json. For any study with real hydro plants, providing historical inflow statistics is strongly recommended.


The scenarios/ Directory

The scenarios/ directory sits alongside the other input files in the case directory:

my_study/
  config.json
  stages.json
  ...
  scenarios/
    inflow_seasonal_stats.parquet
    load_seasonal_stats.parquet
    inflow_ar_coefficients.parquet    (only when ar_order > 0)

The directory is optional. When it is absent, Cobre generates independent standard-normal noise at each stage for each hydro plant and scales it by a default standard deviation — effectively treating all uncertainty as white noise. This is sufficient for verifying a case loads correctly, but is not representative of real inflow dynamics.

When scenarios/ is present, Cobre reads the Parquet files and fits a Periodic Autoregressive (PAR(p)) model for each hydro plant and each bus. The fitted model generates correlated, seasonally-varying inflow and load trajectories that reflect the historical statistics you supply.


Inflow Statistics

inflow_seasonal_stats.parquet provides the seasonal distribution of historical inflows for every (hydro plant, stage) pair.

Schema

| Column | Type | Nullable | Description |
|---|---|---|---|
| hydro_id | INT32 | No | Hydro plant identifier (matches id in hydros.json) |
| stage_id | INT32 | No | Stage identifier (matches id in stages.json) |
| mean_m3s | DOUBLE | No | Seasonal mean inflow in m³/s |
| std_m3s | DOUBLE | No | Seasonal standard deviation in m³/s (must be >= 0) |
| ar_order | INT32 | No | Number of AR lags in the PAR(p) model (0 = white noise) |

The file must contain exactly one row per (hydro_id, stage_id) pair. Every hydro plant defined in hydros.json must have a row for every stage defined in stages.json. The validator will reject the case if any combination is missing.

For the 1dtoy example, the file has 4 rows — one for each of the four monthly stages — for the single hydro plant UHE1 (hydro_id = 0).

Inspecting the file

# Polars
import polars as pl
df = pl.read_parquet("scenarios/inflow_seasonal_stats.parquet")
print(df)

# Pandas
import pandas as pd
df = pd.read_parquet("scenarios/inflow_seasonal_stats.parquet")
print(df)

-- DuckDB
SELECT * FROM read_parquet('scenarios/inflow_seasonal_stats.parquet');

# R with arrow
library(arrow)
df <- read_parquet("scenarios/inflow_seasonal_stats.parquet")
print(df)

Load Statistics

load_seasonal_stats.parquet provides the seasonal distribution of electrical demand at each bus. It drives the stochastic load model used during training and simulation.

Schema

| Column | Type | Nullable | Description |
|---|---|---|---|
| bus_id | INT32 | No | Bus identifier (matches id in buses.json) |
| stage_id | INT32 | No | Stage identifier (matches id in stages.json) |
| mean_mw | DOUBLE | No | Seasonal mean load in MW |
| std_mw | DOUBLE | No | Seasonal standard deviation in MW (must be >= 0) |
| ar_order | INT32 | No | Number of AR lags in the PAR(p) model (0 = white noise) |

One row per (bus_id, stage_id) pair is required. Every bus in buses.json must have a row for every stage. The load mean and standard deviation determine both the expected demand level and how much it varies across scenarios in each stage.


The PAR(p) Model

PAR(p) stands for Periodic Autoregressive model of order p. It is the standard model for hydro inflow time series in long-term hydrothermal planning because inflows have two key properties the model captures well: seasonal patterns (wet seasons and dry seasons recur predictably each year) and autocorrelation (a wet month tends to be followed by another wet month, and vice versa).

What ar_order controls

The ar_order column in the seasonal statistics files sets the number of autoregressive lags for each (entity, stage) pair.

ar_order = 0 — white noise. The inflow at each stage is drawn independently from a normal distribution with the specified mean and standard deviation. There is no memory between stages: knowing last month’s inflow tells you nothing about this month’s. This is the simplest setting and appropriate when you lack historical data to fit AR coefficients, or when the inflow series shows very little autocorrelation.

ar_order > 0 — periodic autoregressive. The inflow at each stage depends on the inflows at the preceding p stages, weighted by coefficients that reflect the seasonal autocorrelation structure. A wet period is followed by another wet period with the probability implied by the coefficients. Higher AR orders capture longer-range dependencies: ar_order = 1 captures month-to-month persistence, ar_order = 2 adds two-month memory, and so on. Most hydro inflow series are well-described by ar_order = 1 or ar_order = 2.
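
The recursion can be pictured with a standardized periodic AR(1) sketch (illustrative only; Cobre's fitted model and coefficient form may differ):

```python
import numpy as np

def simulate_par1(means, stds, phi, n_stages, rng):
    """One trajectory from a standardized periodic AR(1):
    z_t = phi_s * z_{t-1} + sqrt(1 - phi_s**2) * eps_t,
    inflow_t = mean_s + std_s * z_t, where season s cycles each year."""
    z = 0.0
    trajectory = []
    for t in range(n_stages):
        s = t % len(means)
        eps = rng.standard_normal()
        z = phi[s] * z + np.sqrt(1.0 - phi[s] ** 2) * eps
        trajectory.append(means[s] + stds[s] * z)
    return trajectory

rng = np.random.default_rng(42)
traj = simulate_par1([120.0, 80.0, 50.0, 90.0], [20.0, 12.0, 8.0, 15.0], [0.6] * 4, 12, rng)
```

With phi = 0 for every season this collapses to the white-noise case: each stage is an independent draw around its seasonal mean.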

AR coefficients file

When any stage in inflow_seasonal_stats.parquet has ar_order > 0, Cobre also requires an inflow_ar_coefficients.parquet file in the scenarios/ directory. This file contains the fitted AR coefficients in standardized form (as produced by the Yule-Walker equations). The schema and the fitting procedure are documented in the Case Format Reference.

The 1dtoy example uses ar_order = 0 for all stages, so no coefficients file is needed.

When to use higher AR orders

In general:

  • Use ar_order = 0 when historical data is short or when you want to establish a baseline with the simplest possible model.
  • Use ar_order = 1 for most real hydro systems. Monthly inflows have strong one-month autocorrelation, and a first-order model captures the bulk of it.
  • Use ar_order = 2 or higher when the inflow series shows multi-month persistence (common in systems with large upstream catchments or snowmelt storage). Validate with autocorrelation plots of your historical data.
  • Setting ar_order > 0 with std_m3s = 0 is a validation error — the model requires non-zero variance to be identifiable.

For the theoretical derivation of the PAR(p) model, see Stochastic Modeling and PAR(p) Autoregressive Models in the methodology reference.


Correlation

Hydro plants that share a watershed tend to have correlated inflows: when the upstream basin receives heavy rainfall, all plants along the river benefit simultaneously. Ignoring this correlation can cause the optimizer to underestimate the risk of a system-wide dry spell.

Default behavior: independent noise

When no correlation configuration is provided, Cobre treats each hydro plant’s inflow as independent of all others. Each plant draws its own noise realization at each stage without any coupling. This is the correct setting for the 1dtoy example, which has only one hydro plant.

Configuring spatial correlation

For multi-plant systems, Cobre supports Cholesky-based spatial correlation. A correlation model is specified in correlation.json in the case directory and defines named correlation groups, each with a symmetric positive-definite correlation matrix.

{
  "method": "cholesky",
  "profiles": {
    "default": {
      "groups": [
        {
          "name": "basin_south",
          "entities": [
            { "type": "inflow", "id": 0 },
            { "type": "inflow", "id": 1 }
          ],
          "matrix": [
            [1.0, 0.7],
            [0.7, 1.0]
          ]
        }
      ]
    }
  }
}

Entities not listed in any group retain independent noise. Multiple profiles can be defined and scheduled to activate for specific stages (for example, using a wet-season correlation structure in January through March and a dry-season structure for the remaining months). Detailed correlation configuration documentation will be added with future multi-plant example cases.
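
The Cholesky mechanism itself is standard: factor the correlation matrix and multiply independent draws by the factor. A numpy sketch using the 0.7 matrix above (not Cobre internals):

```python
import numpy as np

corr = np.array([[1.0, 0.7], [0.7, 1.0]])
L = np.linalg.cholesky(corr)            # lower-triangular factor, corr = L @ L.T
rng = np.random.default_rng(7)
eps = rng.standard_normal((10_000, 2))  # independent unit-normal draws per plant
correlated = eps @ L.T                  # rows now carry pairwise correlation near 0.7

print(np.corrcoef(correlated.T)[0, 1])  # sample estimate close to 0.7
```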


Scenario Count and Seeds

num_scenarios in stages.json

Each stage in stages.json has a num_scenarios field that controls how many scenario branches are pre-generated for the opening scenario tree used during the backward pass. A larger value gives the backward pass more diverse inflow realizations to evaluate cuts against, at the cost of a proportionally larger opening tree in memory. For the 1dtoy example this is set to 10. Production studies typically use 50 to 200.

forward_passes in config.json

The forward_passes field in config.json controls how many scenario trajectories are sampled during each training iteration’s forward pass. This is distinct from num_scenarios: the forward pass draws new trajectories on each iteration using a deterministic per-iteration seed, while num_scenarios controls the pre-generated backward-pass tree.

The seed field

The seed field in the training section of config.json is the base seed for all stochastic generation in the run:

{
  "training": {
    "forward_passes": 50,
    "seed": 42,
    "stopping_rules": [{ "type": "iteration_limit", "limit": 200 }]
  }
}

The default value is 42 when seed is omitted. When a seed is provided, every run with the same case directory and the same seed produces bitwise-identical scenarios, training trajectories, and simulation results. This reproducibility is guaranteed regardless of the number of MPI ranks, because each rank derives its scenario seeds independently from the base seed using a deterministic hash — no inter-rank coordination is required.

To get a non-reproducible run (different scenarios each time), set "seed": null in config.json. Cobre will then derive the base seed from OS entropy at startup.
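
The no-coordination scheme can be pictured as hashing the base seed together with the scenario coordinates; the sketch below is a hypothetical illustration, not Cobre's actual hash function:

```python
import hashlib

def derive_seed(base_seed, stage_id, scenario_idx):
    # Deterministic: the same inputs give the same seed on every rank,
    # so no inter-rank coordination is needed.
    key = f"{base_seed}:{stage_id}:{scenario_idx}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:8], "little")

print(derive_seed(42, 0, 3) == derive_seed(42, 0, 3))  # True: fully reproducible
```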


Inflow Non-Negativity

Normal distributions used in PAR(p) models have unbounded support: even with a positive mean, there is a non-zero probability of drawing a negative noise realisation that, after applying the AR dynamics, produces a negative inflow value. Negative inflow has no physical meaning and, if uncorrected, would violate water balance constraints in the LP.

Method in v0.1.0: penalty

Cobre v0.1.0 uses the penalty method to handle negative inflow realisations. A high-cost slack variable is added to each water balance row. When the LP solver encounters a scenario where the inflow would be negative, it draws on this virtual inflow at the penalty cost rather than violating the balance constraint. The penalty cost is configurable via the inflow_non_negativity field in the case configuration; the default keeps it high enough that the slack is used only when necessary.

In practice, the penalty is rarely activated in well-specified studies. It acts as a backstop for low-probability tail realisations.
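
Conceptually, the slack acts as virtual inflow purchased at the penalty cost. A toy illustration (a hypothetical helper, not the LP formulation itself):

```python
def apply_inflow_penalty(raw_inflow_m3s, penalty_cost):
    """Floor a negative inflow realisation at zero via a penalized slack.

    Returns the effective inflow and the penalty charged for the virtual water.
    """
    slack = max(0.0, -raw_inflow_m3s)
    return raw_inflow_m3s + slack, slack * penalty_cost

print(apply_inflow_penalty(-10.0, 5000.0))  # (0.0, 50000.0): slack backstops the tail draw
print(apply_inflow_penalty(25.0, 5000.0))   # (25.0, 0.0): penalty inactive for normal draws
```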

Truncation methods: planned for a future release

Two additional methods from the literature — truncation (modifying LP row bounds based on external AR evaluation) and truncation with penalty (combining bounded slack with modified bounds) — are planned for a future release. These require evaluating the full inflow value a_h as a scalar before LP patching, which is a non-trivial architectural change in v0.1.0.

For the mathematical theory behind all three methods, see the Inflow Non-Negativity page in the methodology reference, or Oliveira et al. (2022), Energies 15(3):1115.


See Also

  • Anatomy of a Case — introductory walkthrough of the scenarios/ directory and Parquet schemas
  • Configuration — full documentation of config.json fields including seed and forward_passes
  • cobre-stochastic — internal architecture of the stochastic crate: PAR preprocessing, Cholesky correlation, opening tree, and seed derivation

Running Studies

End-to-end workflow for running an SDDP study with cobre run, interpreting output, and inspecting results.


Preparing a Case Directory

A case directory is a folder containing all input data files required by Cobre. The minimum required structure is:

my_study/
  config.json
  penalties.json
  stages.json
  initial_conditions.json
  system/
    buses.json
    hydros.json
    thermals.json
    lines.json

All eight files are required. Before running, validate the input:

cobre validate /path/to/my_study

Successful validation prints entity counts and exits with code 0. Fix any reported errors before proceeding. See Case Directory Format for the full schema.


Running cobre run

cobre run /path/to/my_study

By default, results are written to <CASE_DIR>/output/. To specify a different location:

cobre run /path/to/my_study --output /path/to/results

Lifecycle Stages

  1. Load — reads input files, runs 5-layer validation (exits code 1 on validation failure, 2 on I/O error)
  2. Train — builds the SDDP policy by iterating forward/backward passes; stops when stopping rules are met
  3. Simulate — (optional) evaluates the policy over independent scenarios; requires simulation.enabled = true
  4. Write — writes Hive-partitioned Parquet (tabular), JSON manifests/metadata, and FlatBuffers output

Terminal Output

When stdout is a terminal, a banner shows the version and solver backend. Suppress with --no-banner (keeps progress bars) or --quiet (suppresses all except errors).

Progress Bars

During training, a progress bar shows current iteration count. In --quiet mode, no progress bars are printed. Errors are always written to stderr.

Summary

After all stages complete, a run summary is printed to stderr with:

  • Training: iteration count, convergence status, bounds, gap, cuts, solves, time
  • Simulation (when enabled): scenarios requested, completed, failed
  • Output directory: absolute path to results

Checking Results

Use cobre report to inspect the results:

cobre report /path/to/my_study/output

Reads manifest files and prints JSON to stdout (suitable for piping to jq):

cobre report /path/to/my_study/output | jq '.training.convergence.final_gap_percent'

Exits with code 0 on success or 2 if the results directory does not exist.


Common Workflows

Training Only

cobre run /path/to/my_study --skip-simulation

Trains the policy without simulation.

Quiet Mode for Scripts

cobre run /path/to/my_study --quiet
exit_code=$?
if [ $exit_code -ne 0 ]; then
  echo "Study failed with exit code $exit_code" >&2
fi

Suppresses banner and progress output, suitable for batch scripts.

Checking Exit Codes

| Exit Code | Meaning | Action |
|---|---|---|
| 0 | Success | Results are available in the output directory |
| 1 | Validation error | Fix the input data and re-run cobre validate |
| 2 | I/O error | Check file paths and permissions |
| 3 | Solver error | Check constraint bounds in the case data |
| 4 | Internal error | Check environment; report at the issue tracker |

See CLI Reference for the full exit code table.

Configuration

All runtime parameters for cobre run are controlled by config.json in the case directory. This page documents every section and field.


Minimal Config

{
  "version": "2.0.0",
  "training": {
    "forward_passes": 50,
    "stopping_rules": [{ "type": "iteration_limit", "limit": 100 }]
  }
}

All other sections are optional with defaults documented below.


training

Controls the SDDP training phase.

Mandatory Fields

| Field | Type | Description |
|---|---|---|
| forward_passes | integer | Number of scenario trajectories per iteration. Larger values reduce variance in each iteration's cut but increase cost per iteration. |
| stopping_rules | array | At least one stopping rule (see below). The rule set must contain at least one iteration_limit rule. |

Optional Fields

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | true | Set to false to skip training and proceed directly to simulation (requires a pre-trained policy). |
| seed | integer | 42 | Random seed for reproducible scenario generation. |
| stopping_mode | "any" or "all" | "any" | How multiple stopping rules combine: "any" stops when the first rule is satisfied; "all" requires all rules to be satisfied simultaneously. |

Stopping Rules

Each entry in stopping_rules is a JSON object with a "type" discriminator.

iteration_limit

Stop after a fixed number of training iterations.

{ "type": "iteration_limit", "limit": 200 }

| Field | Type | Description |
|---|---|---|
| limit | integer | Maximum number of SDDP iterations to run. |

time_limit

Stop after a wall-clock time budget is exhausted.

{ "type": "time_limit", "seconds": 3600.0 }

| Field | Type | Description |
|---|---|---|
| seconds | float | Maximum training time in seconds. |

bound_stalling

Stop when the relative improvement in the lower bound falls below a threshold.

{ "type": "bound_stalling", "iterations": 20, "tolerance": 0.0001 }

| Field | Type | Description |
|---|---|---|
| iterations | integer | Window size: the number of past iterations over which to compute the relative improvement. |
| tolerance | float | Relative improvement threshold. Training stops when the improvement over the window is below this value. |
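
A sketch of the stalling test (illustrative; the parameter names match the rule above, but the exact formula Cobre uses may differ in detail):

```python
def bound_stalled(lower_bounds, iterations, tolerance):
    # Compare the latest lower bound against the one `iterations` ago.
    if len(lower_bounds) <= iterations:
        return False  # not enough history yet
    old = lower_bounds[-1 - iterations]
    new = lower_bounds[-1]
    return abs(new - old) / max(abs(old), 1e-12) < tolerance

flat = [100.0] * 30
rising = [100.0 + i for i in range(30)]
print(bound_stalled(flat, 20, 0.0001))    # True: no improvement over the window
print(bound_stalled(rising, 20, 0.0001))  # False: bound still improving
```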

stopping_mode

When multiple stopping rules are listed, stopping_mode controls how they combine:

  • "any" (default): stop when any one rule is satisfied.
  • "all": stop only when every rule is satisfied simultaneously.

{
  "training": {
    "forward_passes": 50,
    "stopping_mode": "all",
    "stopping_rules": [
      { "type": "iteration_limit", "limit": 500 },
      { "type": "bound_stalling", "iterations": 20, "tolerance": 0.0001 }
    ]
  }
}
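
The combination logic reduces to any/all over the per-rule results — a one-line sketch:

```python
def should_stop(rule_satisfied, stopping_mode):
    # rule_satisfied: list of booleans, one per configured stopping rule.
    return any(rule_satisfied) if stopping_mode == "any" else all(rule_satisfied)

print(should_stop([True, False], "any"))  # True: first satisfied rule stops training
print(should_stop([True, False], "all"))  # False: "all" waits for every rule
```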

simulation

Controls the optional post-training simulation phase.

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | false | Enable the simulation phase after training. |
| num_scenarios | integer | 2000 | Number of independent Monte Carlo simulation scenarios to evaluate. |
| policy_type | "outer" | "outer" | Policy representation for simulation. "outer" uses the cut pool (Benders cuts). |

When simulation.enabled is false or num_scenarios is 0, the simulation phase is skipped regardless of the --skip-simulation flag.

Example:

{
  "simulation": {
    "enabled": true,
    "num_scenarios": 1000
  }
}

policy

Controls policy persistence (checkpoint saving and warm-start loading).

| Field | Type | Default | Description |
|---|---|---|---|
| path | string | "./policy" | Directory where policy data (cuts, states) is stored. |
| mode | "fresh", "warm_start", or "resume" | "fresh" | Initialization mode. "fresh" starts from scratch; "warm_start" loads cuts from a previous run; "resume" continues an interrupted run. |
| validate_compatibility | boolean | true | When loading a policy, verify that entity counts, stage counts, and cut dimensions match the current system. |

exports

Controls which outputs are written to the results directory.

| Field | Type | Default | Description |
|---|---|---|---|
| training | boolean | true | Write training convergence data (Parquet). |
| cuts | boolean | true | Write the cut pool (FlatBuffers). |
| states | boolean | true | Write visited state vectors (Parquet). |
| vertices | boolean | true | Write inner approximation vertices when applicable (Parquet). |
| simulation | boolean | true | Write per-entity simulation results (Parquet). |
| forward_detail | boolean | false | Write per-scenario forward-pass detail (large; disabled by default). |
| backward_detail | boolean | false | Write per-scenario backward-pass detail (large; disabled by default). |
| compression | "zstd", "lz4", or "none" | null | Output Parquet compression algorithm. null uses the crate default (zstd). |

Full Example

{
  "$schema": "https://cobre.dev/schemas/v2/config.schema.json",
  "version": "2.0.0",
  "training": {
    "seed": 42,
    "forward_passes": 50,
    "stopping_rules": [
      { "type": "iteration_limit", "limit": 200 },
      { "type": "bound_stalling", "iterations": 20, "tolerance": 0.0001 }
    ],
    "stopping_mode": "any"
  },
  "simulation": {
    "enabled": true,
    "num_scenarios": 2000
  },
  "policy": {
    "path": "./policy",
    "mode": "fresh"
  },
  "exports": {
    "training": true,
    "cuts": true,
    "states": true,
    "simulation": true,
    "compression": "zstd"
  }
}

CLI Reference

Synopsis

cobre [--color <WHEN>] <SUBCOMMAND> [OPTIONS]

Global Options

| Option | Type | Default | Description |
|---|---|---|---|
| --color <WHEN> | auto, always, or never | auto | Control ANSI color output on stderr. always forces color on — useful under mpiexec which pipes stderr through a non-TTY. Also honoured via COBRE_COLOR. |

Subcommands

| Subcommand | Synopsis | Description |
|---|---|---|
| run | cobre run <CASE_DIR> [OPTIONS] | Load, train, simulate, and write results |
| validate | cobre validate <CASE_DIR> | Validate a case directory and print a diagnostic report |
| report | cobre report <RESULTS_DIR> | Query results from a completed run and print JSON to stdout |
| version | cobre version | Print version, solver backend, and build information |

cobre run

Executes the full solve lifecycle for a case directory:

  1. Load — reads all input files and runs the 5-layer validation pipeline
  2. Train — trains an SDDP policy using the configured stopping rules
  3. Simulate — (optional) evaluates the trained policy over simulation scenarios
  4. Write — writes all output files to the results directory

Arguments

| Argument | Type | Description |
|---|---|---|
| <CASE_DIR> | Path | Path to the case directory containing input data files and config.json |

Options

| Option | Type | Default | Description |
|---|---|---|---|
| --output <DIR> | Path | <CASE_DIR>/output/ | Output directory for results |
| --threads <N> | integer | 1 | Number of worker threads per MPI rank. Each thread solves its own LP instances; scenarios are distributed across threads. Resolves: --threads > COBRE_THREADS > 1. |
| --skip-simulation | flag | off | Train only; skip the post-training simulation phase |
| --quiet | flag | off | Suppress the banner and progress bars. Errors still go to stderr |
| --no-banner | flag | off | Suppress the startup banner but keep progress bars |
| --verbose | flag | off | Enable debug-level logging for cobre_cli; info-level for library crates |

Examples

# Run a study with default output location
cobre run /data/cases/hydro_study

# Write results to a custom directory
cobre run /data/cases/hydro_study --output /data/results/run_001

# Train only, no simulation
cobre run /data/cases/hydro_study --skip-simulation

# Use 4 worker threads per MPI rank
cobre run /data/cases/hydro_study --threads 4

# Run without any terminal decorations (useful in scripts)
cobre run /data/cases/hydro_study --quiet

# Force color output when running under mpiexec
cobre --color always run /data/cases/hydro_study

# Enable verbose logging to diagnose solver issues
cobre run /data/cases/hydro_study --verbose

cobre validate

Runs the 5-layer validation pipeline and prints a diagnostic report to stdout.

On success, prints entity counts:

Valid case: 3 buses, 12 hydros, 8 thermals, 4 lines
  buses: 3
  hydros: 12
  thermals: 8
  lines: 4

On failure, prints each error prefixed with error: and exits with code 1:

Validation Error Demo

Arguments

| Argument | Type | Description |
|---|---|---|
| <CASE_DIR> | Path | Path to the case directory to validate |

Options

None.

Examples

# Validate a case directory before running
cobre validate /data/cases/hydro_study

# Use in a script: only proceed if validation passes
cobre validate /data/cases/hydro_study && cobre run /data/cases/hydro_study

cobre report

Reads the JSON manifests written by cobre run and prints a JSON summary to stdout.

The output has the following top-level shape:

{
  "output_directory": "/abs/path/to/results",
  "status": "complete",
  "training": { "iterations": {}, "convergence": {}, "cuts": {} },
  "simulation": { "scenarios": {} },
  "metadata": { "run_info": {}, "configuration_snapshot": {} }
}

simulation and metadata are null when the corresponding files are absent (e.g., when --skip-simulation was used).

Arguments

| Argument | Type | Description |
|---|---|---|
| <RESULTS_DIR> | Path | Path to the results directory produced by cobre run |

Options

None.

Examples

# Print the full report to the terminal
cobre report /data/cases/hydro_study/output

# Extract the convergence gap using jq
cobre report /data/cases/hydro_study/output | jq '.training.convergence.final_gap_percent'

# Check the run status in a script
status=$(cobre report /data/cases/hydro_study/output | jq -r '.status')
if [ "$status" = "complete" ]; then
  echo "Training converged"
fi

cobre version

Prints the binary version, active solver and communication backends, compression support, host architecture, and build profile.

Output Format

cobre   v0.1.0
solver: HiGHS
comm:   local
zstd:   enabled
arch:   x86_64-linux
build:  release (lto=thin)

| Line | Description |
|---|---|
| cobre v{version} | Binary version from Cargo.toml |
| solver: HiGHS | Active LP solver backend (HiGHS in all standard builds) |
| comm: local or comm: mpi | Communication backend (mpi only when compiled with the mpi feature) |
| zstd: enabled | Output compression support |
| arch: {arch}-{os} | Host CPU architecture and operating system |
| build: release or build: debug | Cargo build profile |

Arguments

None.

Options

None.


Exit Codes

All subcommands follow the same exit code convention.

| Code | Category | Cause |
|---|---|---|
| 0 | Success | The command completed without errors |
| 1 | Validation | Case directory failed the validation pipeline — schema errors, cross-reference errors, semantic constraint violations, or policy compatibility mismatches |
| 2 | I/O | File not found, permission denied, disk full, or write failure during loading or output |
| 3 | Solver | LP infeasible subproblem or numerical solver failure during training or simulation |
| 4 | Internal | Communication failure, unexpected channel closure, or other software/environment problem |

Codes 1–2 indicate user-correctable input problems; codes 3–4 indicate case or environment problems. Error messages are printed to stderr with an error: prefix and hint lines. See Error Codes for a detailed catalog.


Environment Variables

| Variable | Description |
|---|---|
| COBRE_COMM_BACKEND | Override the communication backend at runtime. Set to local to force the local backend even when the binary was compiled with mpi support. |
| COBRE_THREADS | Number of worker threads per MPI rank for cobre run. Overridden by the --threads flag. Must be a positive integer. |
| COBRE_COLOR | Override color output when --color auto is in effect. Set to always or never. Ignored if --color always or --color never is given explicitly. |
| FORCE_COLOR | Force color output on (any non-empty value). Checked after COBRE_COLOR. See force-color.org. |
| NO_COLOR | Disable colored terminal output. Respected by the banner and error formatters. Set to any non-empty value. |
| RUST_LOG | Control the tracing subscriber log level using standard env_logger syntax (e.g., RUST_LOG=debug, RUST_LOG=cobre_sddp=trace). Takes effect when --verbose is also passed. |

Interpreting Results

The Understanding Results tutorial explains what each output file contains and how to read it. This page goes one level deeper: it provides practical analysis patterns for answering domain questions from the data. It assumes you have already completed the tutorial and are comfortable loading Parquet files in your preferred tool.

The focus is on convergence diagnostics and simulation analysis. By the end of this page you will know how to assess whether a run converged, how to extract generation and cost statistics across scenarios, and how to identify common problems from the output data.


Convergence Diagnostics

Reading the gap from training/_manifest.json

The manifest is the first place to check after any run. The key fields for convergence assessment are:

{
  "convergence": {
    "achieved": false,
    "final_gap_percent": 0.6,
    "termination_reason": "iteration_limit"
  },
  "iterations": {
    "completed": 128,
    "converged_at": null
  }
}

| Field | What to look for |
|---|---|
| convergence.achieved | true means a stopping rule declared convergence; false means the run exhausted its iteration budget. |
| convergence.final_gap_percent | The gap between lower and upper bounds at termination. Smaller is better. See guidelines below. |
| convergence.termination_reason | "iteration_limit" is the most common; "bound_stalling" means the gap stopped shrinking. |
| iterations.converged_at | Non-null only when achieved is true. Tells you how many iterations the run actually needed. |

Gap guidelines. There is no universal threshold — acceptable gap depends on the decision being made and the study’s time horizon. As rough guidance:

  • Below 1%: typically very good. The policy cost is within 1% of the theoretical optimum.
  • 1% to 5%: acceptable for long-horizon planning studies where model uncertainty is already large.
  • Above 5%: warrants investigation. The policy may be significantly suboptimal.
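To make these thresholds concrete, here is a minimal Python sketch of a relative gap computation. It assumes the conventional (upper − lower) / lower definition; the exact formula Cobre uses is not stated on this page, so treat this as illustrative:

```python
def gap_percent(lower_bound, upper_bound_mean):
    """Relative gap between the SDDP bounds, as a percentage.

    Returns None when the lower bound is non-positive, mirroring the
    null gap_percent values in training/convergence.parquet.
    """
    if lower_bound <= 0:
        return None
    return 100.0 * (upper_bound_mean - lower_bound) / lower_bound

# A 0.6% gap, as in the manifest excerpt above
print(round(gap_percent(100.0, 100.6), 6))  # 0.6
```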

What to do if the gap is large:

  1. Increase limit in the iteration_limit stopping rule.
  2. Increase forward_passes in config.json to reduce noise in the upper bound estimate per iteration.
  3. Check training/convergence.parquet (see next section) to see whether the gap is still decreasing or has plateaued.
  4. Check for solver infeasibilities: if simulation/_manifest.json shows failed scenarios, the policy may be encountering numerically difficult stages.

Reading Convergence History

training/convergence.parquet contains one row per training iteration with the full convergence history. Its schema:

| Column | Type | Description |
|---|---|---|
| iteration | INT32 | Iteration number (0-indexed) |
| lower_bound | FLOAT64 | Optimizer’s proven lower bound on the expected cost |
| upper_bound_mean | FLOAT64 | Statistical upper bound estimate (mean over forward passes) |
| upper_bound_std | FLOAT64 | Standard deviation of the upper bound estimate |
| gap_percent | FLOAT64 | Relative gap as a percentage (null when lower_bound <= 0) |
| cuts_added | INT32 | Cuts added to the pool in this iteration |
| cuts_removed | INT32 | Cuts removed by the cut selection strategy |
| cuts_active | INT64 | Total active cuts across all stages after this iteration |
| lp_solves | INT64 | Cumulative LP solves up to this iteration |

Python (Polars)

import polars as pl
import matplotlib.pyplot as plt

df = pl.read_parquet("results/training/convergence.parquet")

# Plot convergence bounds over iterations
plt.figure(figsize=(10, 4))
plt.plot(df["iteration"], df["lower_bound"], label="Lower bound")
plt.plot(df["iteration"], df["upper_bound_mean"], label="Upper bound (mean)")
plt.fill_between(
    df["iteration"].to_list(),
    (df["upper_bound_mean"] - df["upper_bound_std"]).to_list(),
    (df["upper_bound_mean"] + df["upper_bound_std"]).to_list(),
    alpha=0.2,
    label="Upper bound ± 1 std",
)
plt.xlabel("Iteration")
plt.ylabel("Expected cost ($/stage)")
plt.legend()
plt.tight_layout()
plt.show()

# Check final gap
final = df.filter(pl.col("iteration") == df["iteration"].max())
print(final.select(["iteration", "lower_bound", "upper_bound_mean", "gap_percent"]))

R

library(arrow)
library(ggplot2)

df <- read_parquet("results/training/convergence.parquet")

# Plot convergence bounds
ggplot(df, aes(x = iteration)) +
  geom_line(aes(y = lower_bound, color = "Lower bound")) +
  geom_line(aes(y = upper_bound_mean, color = "Upper bound")) +
  geom_ribbon(
    aes(
      ymin = upper_bound_mean - upper_bound_std,
      ymax = upper_bound_mean + upper_bound_std
    ),
    alpha = 0.2
  ) +
  labs(
    x = "Iteration",
    y = "Expected cost ($/stage)",
    color = NULL
  ) +
  theme_minimal()

# Print final gap
tail(df[, c("iteration", "lower_bound", "upper_bound_mean", "gap_percent")], 1)

What to look for in the convergence plot:

  • Both bounds should move toward each other over iterations. The lower bound rises; the upper bound mean falls and its standard deviation narrows.
  • A lower bound that stays flat after the first few iterations suggests the backward pass cuts are not improving: check cuts_added to confirm cuts are being generated.
  • An upper bound that oscillates widely without narrowing suggests the forward_passes count is too low to produce a stable estimate.
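The last two symptoms can be checked programmatically. A plain-Python sketch, assuming the lower_bound and cuts_added columns have been pulled out of convergence.parquet as lists; the function name and thresholds are illustrative, not part of Cobre:

```python
def diagnose_stall(lower_bound, cuts_added, window=10, rel_tol=1e-4):
    """Return (plateaued, no_cut_iterations).

    plateaued: the lower bound improved by less than rel_tol (relative)
    over the last `window` iterations.
    no_cut_iterations: iterations where cuts_added == 0.
    """
    plateaued = False
    if len(lower_bound) > window:
        start, end = lower_bound[-window - 1], lower_bound[-1]
        plateaued = abs(end - start) / max(abs(start), 1e-12) < rel_tol
    no_cut_iterations = [i for i, n in enumerate(cuts_added) if n == 0]
    return plateaued, no_cut_iterations

# Bound rises quickly, then stalls while no cuts are added
lb = [10.0, 50.0, 80.0, 95.0] + [95.0] * 12
cuts = [40, 38, 35, 30] + [0] * 12
plateaued, stalled = diagnose_stall(lb, cuts)
print(plateaued, stalled[:3])  # True [4, 5, 6]
```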

Analyzing Simulation Results

The simulation output is Hive-partitioned: results are stored in one data.parquet file per scenario under simulation/<category>/scenario_id=NNNN/. Polars, Pandas, R arrow, and DuckDB all support reading the entire directory as a single table and filtering by scenario_id at the storage layer.

Aggregating across scenarios

The most common operation is computing statistics across all scenarios for a given entity or stage.

Python (Polars) — mean and percentiles:

import polars as pl

# Load all hydro results across all scenarios
hydros = pl.read_parquet("results/simulation/hydros/")

# Mean generation per hydro plant per stage, across all scenarios
mean_gen = (
    hydros
    .group_by(["hydro_id", "stage_id"])
    .agg(
        pl.col("generation_mwh").mean().alias("mean_generation_mwh"),
        pl.col("generation_mwh").quantile(0.10).alias("p10_generation_mwh"),
        pl.col("generation_mwh").quantile(0.90).alias("p90_generation_mwh"),
    )
    .sort(["hydro_id", "stage_id"])
)
print(mean_gen)

R:

library(arrow)
library(dplyr)

# Load all hydro results
hydros <- open_dataset("results/simulation/hydros/") |> collect()

# Mean and P10/P90 generation per hydro plant per stage
mean_gen <- hydros |>
  group_by(hydro_id, stage_id) |>
  summarise(
    mean_generation_mwh = mean(generation_mwh),
    p10_generation_mwh  = quantile(generation_mwh, 0.10),
    p90_generation_mwh  = quantile(generation_mwh, 0.90),
    .groups = "drop"
  ) |>
  arrange(hydro_id, stage_id)

print(mean_gen)

Filtering to a single scenario

# Polars — read only scenario 0 (avoids loading all partitions)
costs_s0 = pl.read_parquet(
    "results/simulation/costs/",
    hive_partitioning=True,
).filter(pl.col("scenario_id") == 0)

-- DuckDB
SELECT * FROM read_parquet('results/simulation/costs/**/*.parquet')
WHERE scenario_id = 0
ORDER BY stage_id;

Common Analysis Tasks

(a) Expected generation by hydro plant

import polars as pl

hydros = pl.read_parquet("results/simulation/hydros/")
expected = (
    hydros
    .group_by("hydro_id")
    .agg(pl.col("generation_mwh").mean().alias("mean_annual_generation_mwh"))
    .sort("hydro_id")
)
print(expected)

(b) Expected thermal generation cost

thermals = pl.read_parquet("results/simulation/thermals/")
thermal_cost = (
    thermals
    .group_by("thermal_id")
    .agg(pl.col("generation_cost").mean().alias("mean_total_cost"))
    .sort("thermal_id")
)
print(thermal_cost)

In R:

library(arrow)
library(dplyr)

thermals <- open_dataset("results/simulation/thermals/") |> collect()

thermal_cost <- thermals |>
  group_by(thermal_id) |>
  summarise(mean_total_cost = mean(generation_cost), .groups = "drop") |>
  arrange(thermal_id)

print(thermal_cost)

(c) Deficit probability per bus

A scenario has a deficit at a given stage if deficit_mwh > 0 for any bus in that stage. The deficit probability is the fraction of scenarios where this occurs.

buses = pl.read_parquet("results/simulation/buses/")

deficit_prob = (
    buses
    .group_by(["bus_id", "stage_id"])
    .agg(
        (pl.col("deficit_mwh") > 0).mean().alias("deficit_probability")
    )
    .sort(["bus_id", "stage_id"])
)
print(deficit_prob)

(d) Water value (shadow price) from hydro output

The water_value_per_hm3 column in simulation/hydros/ records the shadow price of reservoir storage at each stage — the marginal value of having one additional hm³ of stored water. This is the water value, a key output of the SDDP policy.

hydros = pl.read_parquet("results/simulation/hydros/")
water_value = (
    hydros
    .group_by(["hydro_id", "stage_id"])
    .agg(pl.col("water_value_per_hm3").mean().alias("mean_water_value"))
    .sort(["hydro_id", "stage_id"])
)
print(water_value)

A high water value at a given stage means the reservoir is scarce relative to expected future demand — the solver is conserving water for later stages. A water value near zero means the reservoir is abundant and water has little marginal value at that point in time.


Using cobre report

cobre report provides a quick machine-readable summary without loading any Parquet files:

cobre report results/

Use it in scripts or CI pipelines to extract a specific metric without writing a data loading script:

# Check the final gap in a CI pipeline
gap=$(cobre report results/ | jq '.training.convergence.final_gap_percent')
echo "Final gap: ${gap}%"

For all available cobre report fields and flags, see CLI Reference.


Troubleshooting

Gap not converging

The gap stays large after many iterations, or the lower bound rises very slowly.

Possible causes:

  • Too few iterations. The most common cause. Increase the iteration_limit.
  • Too few forward passes. A forward_passes count of 1 (as in the 1dtoy tutorial) gives high variance in the upper bound estimate. Increase to 10 or more for a stable gap reading.
  • Numerically difficult stages. Check training/convergence.parquet for iterations where cuts_added is zero — this can indicate stages where the backward pass is not generating improving cuts.
  • Policy horizon issues. Verify stages.json has the correct stage ordering and that policy_graph.type is set correctly.

Unexpected deficit

Simulation scenarios show non-zero deficit_mwh in simulation/buses/ but the system should have enough capacity.

Possible causes:

  • Insufficient thermal capacity. Compare total load (load_mw summed across buses) against total thermal capacity. If load exceeds generation capacity in some scenarios, deficit is unavoidable.
  • Hydro reservoir ran dry. Check storage_final_hm3 in simulation/hydros/. If it hits zero in early stages, subsequent stages have no hydro generation and may resort to deficit.
  • Very low deficit penalty. If deficit_segments in penalties.json are priced below thermal generation cost, the solver will prefer deficit over generation. Increase the deficit cost.
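The first check can be scripted. A hypothetical Python sketch, assuming per-stage total load and installed capacities have already been extracted from the case inputs; all names here are illustrative:

```python
def stages_with_shortfall(total_load_mw, thermal_capacity_mw, hydro_capacity_mw):
    """Stage IDs where total load exceeds total dispatchable capacity,
    making some deficit unavoidable regardless of the policy."""
    total_capacity = thermal_capacity_mw + hydro_capacity_mw
    return [stage for stage, load in sorted(total_load_mw.items())
            if load > total_capacity]

# Example: stage 3 peaks above the 1200 MW of installed capacity
loads = {1: 900.0, 2: 1100.0, 3: 1250.0}
print(stages_with_shortfall(loads, 800.0, 400.0))  # [3]
```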

Zero generation from a plant

A thermal or hydro plant shows zero generation in all scenarios.

Possible causes:

  • Plant is more expensive than deficit. Check the plant’s cost against the bus deficit penalty. If the cost exceeds the penalty, deficit is cheaper and the solver avoids dispatching the plant.
  • Bus connectivity. Verify the plant’s bus_id matches a bus that actually has load. A plant connected to a zero-load bus will never be dispatched.
  • Hydro: reservoir constraints too tight. If min_storage_hm3 is close to the initial storage level, the solver cannot turbine water without risking a storage violation. Review initial_conditions.json and storage bounds in hydros.json.

Crate Overview

Cobre is organized as a Rust workspace with 11 crates. Each crate has a single responsibility and well-defined boundaries.

cobre/crates/
├── cobre-core/         Entity model (buses, hydros, thermals, lines)
├── cobre-io/           JSON/Parquet input, FlatBuffers/Parquet output
├── cobre-stochastic/   PAR(p) models, scenario generation
├── cobre-solver/       LP solver abstraction (HiGHS backend)
├── cobre-comm/         Communication abstraction (MPI, TCP, shm, local)
├── cobre-sddp/         SDDP training loop, simulation, cut management
├── cobre-cli/          Binary: run/validate/report/compare/serve
├── cobre-mcp/          Binary: MCP server for AI agent integration
├── cobre-python/       cdylib: PyO3 Python bindings
└── cobre-tui/          Library: ratatui terminal UI

Dependency Graph

The diagram below shows the primary dependency relationships between workspace crates. Arrows point from dependency to dependent (i.e., an arrow from cobre-core to cobre-io means cobre-io depends on cobre-core).

graph TD
    core[cobre-core]
    io[cobre-io]
    solver[cobre-solver]
    comm[cobre-comm]
    stochastic[cobre-stochastic]
    sddp[cobre-sddp]
    cli[cobre-cli]
    ferrompi[ferrompi]

    core --> io
    core --> stochastic
    core --> solver
    core --> comm
    ferrompi --> comm
    io --> sddp
    solver --> sddp
    comm --> sddp
    stochastic --> sddp
    sddp --> cli

For the full dependency graph and crate responsibilities, see the methodology reference.

cobre-core

experimental

cobre-core is the shared data model for the Cobre ecosystem. It defines the fundamental entity types used across all crates: buses, transmission lines, hydro plants, thermal units, energy contracts, pumping stations, and non-controllable sources. Every other Cobre crate consumes cobre-core types by shared reference; no crate other than cobre-io constructs System values.

The crate has no solver, optimizer, or I/O dependencies. It holds pure data structures, the System container that groups them, derived topology graphs, penalty resolution utilities, temporal types, scenario pipeline types, initial conditions, generic constraints, and pre-resolved penalty/bound tables.

Module overview

| Module | Purpose |
|---|---|
| entities | Entity types: Bus, Line, Hydro, Thermal, and stub types |
| entity_id | EntityId newtype wrapper |
| error | ValidationError enum |
| generic_constraint | User-defined linear constraints over LP variables |
| initial_conditions | Reservoir storage levels at study start |
| penalty | Global defaults, entity overrides, and resolution functions |
| resolved | Pre-resolved penalty/bound tables with O(1) lookup |
| scenario | PAR model parameters, load statistics, and correlation model |
| system | System container and SystemBuilder |
| temporal | Stages, blocks, seasons, and the policy graph |
| topology | CascadeTopology and NetworkTopology derived structures |

Design principles

Clarity-first representation. cobre-core stores entities in the form most readable to a human engineer: nested JSON concepts are flattened into named fields with explicit unit suffixes, optional sub-models appear as Option<Enum> variants, and every f64 field carries a unit in its name and doc comment. Performance-adapted views (packed arrays, LP variable indices) live in downstream solver crates, not here.

Validate at construction. The SystemBuilder catches invalid states during construction – duplicate IDs, broken cross-references, cascade cycles, and invalid filling configurations – so the rest of the system receives a structurally sound System with no need for defensive checks at solve time.

Declaration-order invariance. Entity collections are stored in canonical ID-sorted order. Any System built from the same entities produces bit-for-bit identical results regardless of the order in which entities were supplied to SystemBuilder. Integration tests verify this property explicitly.

Thread-safe and immutable after construction. System is Send + Sync. After SystemBuilder::build() returns Ok, the System is immutable and can be shared across threads without synchronization.

Entity types

Fully modeled entities

These four entity types contribute LP variables and constraints in optimization and simulation procedures.

Bus

An electrical network node where power balance is maintained.

| Field | Type | Description |
|---|---|---|
| id | EntityId | Unique bus identifier |
| name | String | Human-readable name |
| deficit_segments | Vec<DeficitSegment> | Pre-resolved piecewise-linear deficit cost curve |
| excess_cost | f64 | Cost per MWh for surplus generation absorption |

DeficitSegment has two fields: depth_mw: Option<f64> (the MW capacity of the segment; None for the final unbounded segment) and cost_per_mwh: f64 (the marginal cost in that segment). Segments are ordered by ascending cost. The final segment always has depth_mw = None to ensure LP feasibility.
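To make the segment semantics concrete, here is a Python sketch that evaluates such a curve. It is illustrative only (not the crate's API) and ignores block duration, so the result is a cost per hour of sustained deficit:

```python
def deficit_cost_per_hour(deficit_mw, segments):
    """Evaluate a piecewise-linear deficit cost curve.

    segments: list of (depth_mw, cost_per_mwh) in ascending cost order;
    the final segment has depth None, so any deficit level is feasible.
    """
    remaining, cost = deficit_mw, 0.0
    for depth_mw, cost_per_mwh in segments:
        served = remaining if depth_mw is None else min(remaining, depth_mw)
        cost += served * cost_per_mwh
        remaining -= served
        if remaining <= 0:
            break
    return cost

# 150 MW of deficit: 100 MW in the first segment, 50 MW in the unbounded one
curve = [(100.0, 500.0), (None, 2000.0)]
print(deficit_cost_per_hour(150.0, curve))  # 150000.0
```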

Line

A transmission interconnection between two buses.

| Field | Type | Description |
|---|---|---|
| id | EntityId | Unique line identifier |
| name | String | Human-readable name |
| source_bus_id | EntityId | Source bus for the direct flow direction |
| target_bus_id | EntityId | Target bus for the direct flow direction |
| entry_stage_id | Option<i32> | Stage when line enters service; None = always |
| exit_stage_id | Option<i32> | Stage when line is retired; None = never |
| direct_capacity_mw | f64 | Maximum MW flow from source to target |
| reverse_capacity_mw | f64 | Maximum MW flow from target to source |
| losses_percent | f64 | Transmission losses as a percentage |
| exchange_cost | f64 | Regularization cost per MWh exchanged |

Line flow is a hard constraint; the exchange_cost is a regularization term, not a violation penalty.

Thermal

A thermal power plant with a piecewise-linear generation cost curve.

| Field | Type | Description |
|---|---|---|
| id | EntityId | Unique thermal plant identifier |
| name | String | Human-readable name |
| bus_id | EntityId | Bus receiving this plant’s generation |
| entry_stage_id | Option<i32> | Stage when plant enters service; None = always |
| exit_stage_id | Option<i32> | Stage when plant is retired; None = never |
| cost_segments | Vec<ThermalCostSegment> | Piecewise-linear cost curve, ascending cost order |
| min_generation_mw | f64 | Minimum stable load |
| max_generation_mw | f64 | Installed capacity |
| gnl_config | Option<GnlConfig> | GNL dispatch anticipation; None = no lag |
ThermalCostSegment holds capacity_mw: f64 and cost_per_mwh: f64. GnlConfig holds lag_stages: i32 (number of stages of dispatch anticipation for liquefied natural gas units that require advance scheduling).

Hydro

The most complex entity type: a hydroelectric plant with a reservoir, turbines, and optional cascade connectivity. It has 22 fields.

Identity and connectivity:

| Field | Type | Description |
|---|---|---|
| id | EntityId | Unique plant identifier |
| name | String | Human-readable name |
| bus_id | EntityId | Bus receiving this plant’s electrical generation |
| downstream_id | Option<EntityId> | Downstream plant in cascade; None = terminal node |
| entry_stage_id | Option<i32> | Stage when plant enters service; None = always |
| exit_stage_id | Option<i32> | Stage when plant is retired; None = never |

Reservoir and outflow:

| Field | Type | Description |
|---|---|---|
| min_storage_hm3 | f64 | Minimum operational storage (dead volume) |
| max_storage_hm3 | f64 | Maximum operational storage (flood control level) |
| min_outflow_m3s | f64 | Minimum total outflow at all times |
| max_outflow_m3s | Option<f64> | Maximum total outflow; None = no upper bound |

Turbine:

| Field | Type | Description |
|---|---|---|
| generation_model | HydroGenerationModel | Production function variant |
| min_turbined_m3s | f64 | Minimum turbined flow |
| max_turbined_m3s | f64 | Maximum turbined flow (installed turbine capacity) |
| min_generation_mw | f64 | Minimum electrical generation |
| max_generation_mw | f64 | Maximum electrical generation (installed capacity) |

Optional hydraulic sub-models:

| Field | Type | Description |
|---|---|---|
| tailrace | Option<TailraceModel> | Downstream water level model; None = zero |
| hydraulic_losses | Option<HydraulicLossesModel> | Penstock loss model; None = lossless |
| efficiency | Option<EfficiencyModel> | Turbine efficiency model; None = 100% |
| evaporation_coefficients_mm | Option<[f64; 12]> | Monthly evaporation [mm/month]; None = no evaporation |
| diversion | Option<DiversionChannel> | Diversion channel; None = no diversion |
| filling | Option<FillingConfig> | Filling operation config; None = no filling |

Penalties:

| Field | Type | Description |
|---|---|---|
| penalties | HydroPenalties | Pre-resolved penalty costs from the global-entity cascade |

Stub entities

These three entity types are data-complete but do not contribute LP variables or constraints in the minimal viable implementation. Their type definitions exist in the registry so analysis code can iterate over all entity types uniformly.

PumpingStation

Transfers water between hydro reservoirs while consuming electrical power. Fields: id, name, bus_id, source_hydro_id, destination_hydro_id, entry_stage_id, exit_stage_id, consumption_mw_per_m3s, min_flow_m3s, max_flow_m3s.

EnergyContract

A bilateral energy agreement with an entity outside the modeled system. Fields: id, name, bus_id, contract_type (ContractType::Import or ContractType::Export), entry_stage_id, exit_stage_id, price_per_mwh, min_mw, max_mw. Negative price_per_mwh represents export revenue.

NonControllableSource

Intermittent generation (wind, solar, run-of-river) that cannot be dispatched. Fields: id, name, bus_id, entry_stage_id, exit_stage_id, max_generation_mw, curtailment_cost (pre-resolved).

Supporting types

Enums

| Enum | Variants | Purpose |
|---|---|---|
| HydroGenerationModel | ConstantProductivity { productivity_mw_per_m3s }, LinearizedHead { productivity_mw_per_m3s }, Fpha | Production function for turbine power computation |
| TailraceModel | Polynomial { coefficients: Vec<f64> }, Piecewise { points: Vec<TailracePoint> } | Downstream water level as a function of total outflow |
| HydraulicLossesModel | Factor { value }, Constant { value_m } | Head loss in penstock and draft tube |
| EfficiencyModel | Constant { value } | Turbine-generator efficiency |
| ContractType | Import, Export | Energy flow direction for bilateral contracts |

ConstantProductivity is used universally and is the minimal viable model. LinearizedHead is for high-fidelity analyses where head-dependent terms matter. Fpha is the full production function with head-area-productivity tables for detailed modeling.

Structs

| Struct | Fields | Purpose |
|---|---|---|
| TailracePoint | outflow_m3s: f64, height_m: f64 | One breakpoint on a piecewise tailrace curve |
| DeficitSegment | depth_mw: Option<f64>, cost_per_mwh: f64 | One segment of a piecewise deficit cost curve |
| ThermalCostSegment | capacity_mw: f64, cost_per_mwh: f64 | One segment of a thermal generation cost curve |
| GnlConfig | lag_stages: i32 | Dispatch anticipation lag for GNL thermal units |
| DiversionChannel | downstream_id: EntityId, max_flow_m3s: f64 | Water diversion bypassing turbines and spillways |
| FillingConfig | start_stage_id: i32, filling_inflow_m3s: f64 | Reservoir filling operation from a fixed inflow source |
| HydroPenalties | 11 f64 fields (see Penalty resolution section) | Pre-resolved penalty costs for one hydro plant |

EntityId

EntityId is a newtype wrapper around i32:

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct EntityId(pub i32);

Why i32, not String. All JSON entity schemas use integer IDs. Integer keys are cheaper to hash, compare, and copy than strings. EntityId appears in every lookup index and cross-reference field, so this is a high-frequency type. If a future input format requires string IDs, the newtype boundary isolates the change to EntityId’s internal representation and its From/Into impls.

Why no Ord. Entity ordering is always by inner i32 value (canonical ID order), but the spec deliberately omits Ord to prevent accidental use of lexicographic ordering in contexts that expect ID-based ordering. Sort sites use sort_by_key(|e| e.id.0) explicitly, making the intent visible at each call site.

Construction and conversion:

use cobre_core::EntityId;

let id: EntityId = EntityId::from(42);
let raw: i32 = i32::from(id);
assert_eq!(id.to_string(), "42");

System and SystemBuilder

System is the top-level in-memory representation of a validated, resolved case. It is produced by SystemBuilder (directly in tests) and by cobre-io::load_case() in production. It is consumed read-only by downstream solver and analysis crates.

use cobre_core::{Bus, DeficitSegment, EntityId, SystemBuilder};

let system = SystemBuilder::new()
    .buses(vec![Bus {
        id: EntityId(1),
        name: "Main Bus".to_string(),
        deficit_segments: vec![],
        excess_cost: 0.0,
    }])
    .build()
    .expect("valid system");

assert_eq!(system.n_buses(), 1);
assert!(system.bus(EntityId(1)).is_some());

Validation in SystemBuilder::build()

SystemBuilder::build() runs four validation phases in order:

  1. Duplicate check. Each of the 7 entity collections is scanned for duplicate EntityId values. All collections are checked before returning. If any duplicates are found, build() returns early with the error list.

  2. Cross-reference validation. Every foreign-key field is verified against the appropriate collection index. Checked fields include bus_id on hydros, thermals, pumping stations, energy contracts, and non-controllable sources; source_bus_id and target_bus_id on lines; downstream_id and diversion.downstream_id on hydros; and source_hydro_id and destination_hydro_id on pumping stations. All broken references across all entity types are collected; build() returns early after this phase if any are found.

  3. Cascade topology and cycle detection. CascadeTopology is built from the validated hydro downstream_id fields. If the topological sort (Kahn’s algorithm) does not reach all hydros, the unvisited hydros form a cycle. Their IDs are reported in a ValidationError::CascadeCycle error. Filling configurations are also validated in this phase.

  4. Filling config validation. Each hydro with a FillingConfig must have a positive filling_inflow_m3s and a non-None entry_stage_id. Violations produce ValidationError::InvalidFillingConfig errors.

If all phases pass, build() constructs NetworkTopology, builds O(1) lookup indices for all 7 collections, and returns the immutable System.

The build() signature collects and returns all errors found across all collections rather than short-circuiting on the first failure:

pub fn build(self) -> Result<System, Vec<ValidationError>>

Canonical ordering

Before building indices, SystemBuilder::build() sorts every entity collection by entity.id.0. The resulting System stores entities in this canonical order. All accessor methods (buses(), hydros(), etc.) return slices in canonical order. This guarantees declaration-order invariance: two System values built from the same entities in different input orders are structurally identical.
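The invariance property can be illustrated outside Rust. A toy Python sketch of the same canonicalization step (sorting by integer ID before any index is built; the dict-based entities are illustrative):

```python
def canonical_order(entities):
    """Sort entities by integer ID, mirroring the canonical ordering step."""
    return sorted(entities, key=lambda e: e["id"])

a = canonical_order([{"id": 3}, {"id": 1}, {"id": 2}])
b = canonical_order([{"id": 2}, {"id": 3}, {"id": 1}])
print(a == b)  # True: declaration order does not affect the result
```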

Topology

CascadeTopology

CascadeTopology represents the directed forest of hydro plant cascade relationships. It is built from the downstream_id fields of all hydro plants and stored on System.

let cascade = system.cascade();

// Downstream plant for a given hydro (None if terminal).
let ds: Option<EntityId> = cascade.downstream(EntityId(1));

// All upstream plants for a given hydro (empty slice if headwater).
let upstream: &[EntityId] = cascade.upstream(EntityId(3));

// Topological ordering: every upstream plant appears before its downstream.
let order: &[EntityId] = cascade.topological_order();

cascade.is_headwater(EntityId(1)); // true if no upstream plants
cascade.is_terminal(EntityId(3));  // true if no downstream plant

The topological order is computed using Kahn’s algorithm with a sorted ready queue, ensuring determinism: within the same topological level, hydros appear in ascending ID order.
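A self-contained Python sketch of this technique (Kahn's algorithm with a min-heap ready queue) shows why the result is deterministic; this is illustrative, not the crate's code:

```python
import heapq

def deterministic_topo_order(downstream):
    """Topologically order hydro IDs from a downstream map.

    downstream: {hydro_id: downstream_id or None}. Every upstream plant
    appears before its downstream plant; ties within a level are broken
    by ascending ID via a min-heap ready queue, so the result is the
    same for any input ordering.
    """
    indegree = {h: 0 for h in downstream}
    for ds in downstream.values():
        if ds is not None:
            indegree[ds] += 1
    ready = [h for h, deg in indegree.items() if deg == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        h = heapq.heappop(ready)
        order.append(h)
        ds = downstream[h]
        if ds is not None:
            indegree[ds] -= 1
            if indegree[ds] == 0:
                heapq.heappush(ready, ds)
    if len(order) != len(downstream):
        raise ValueError("cascade cycle among unvisited hydros")
    return order

# Two headwaters (1, 2) feeding terminal plant 3
print(deterministic_topo_order({1: 3, 2: 3, 3: None}))  # [1, 2, 3]
```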

NetworkTopology

NetworkTopology provides O(1) lookups for bus-line incidence and bus-to-entity maps. It is built from all entity collections and stored on System.

let network = system.network();

// Lines connected to a bus.
let connections: &[BusLineConnection] = network.bus_lines(EntityId(1));
// BusLineConnection has `line_id: EntityId` and `is_source: bool`.

// Generators connected to a bus.
let generators: &BusGenerators = network.bus_generators(EntityId(1));
// BusGenerators has `hydro_ids`, `thermal_ids`, `ncs_ids` (all Vec<EntityId>).

// Load entities connected to a bus.
let loads: &BusLoads = network.bus_loads(EntityId(1));
// BusLoads has `contract_ids` and `pumping_station_ids` (both Vec<EntityId>).

All ID lists in BusGenerators and BusLoads are in canonical ascending-ID order for determinism.

Penalty resolution

Penalty values are resolved from a three-tier cascade: global defaults, entity-level overrides, and stage-level overrides. The first two tiers are implemented in Phase 1. Stage-varying overrides are deferred to Phase 2.

GlobalPenaltyDefaults holds system-wide fallback values for all penalty fields:

pub struct GlobalPenaltyDefaults {
    pub bus_deficit_segments: Vec<DeficitSegment>,
    pub bus_excess_cost: f64,
    pub line_exchange_cost: f64,
    pub hydro: HydroPenalties,
    pub ncs_curtailment_cost: f64,
}

The five resolution functions each accept an optional entity-level override and the global defaults, returning the resolved value:

// Returns entity segments if present, else global defaults.
let segments = resolve_bus_deficit_segments(&entity_override, &global);

// Returns entity value if Some, else global default.
let cost    = resolve_bus_excess_cost(entity_override, &global);
let cost    = resolve_line_exchange_cost(entity_override, &global);
let cost    = resolve_ncs_curtailment_cost(entity_override, &global);

// Resolves all 11 hydro penalty fields field-by-field.
let hydro_p = resolve_hydro_penalties(&entity_overrides, &global);
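The resolution pattern is a simple fallback. A minimal Python sketch of the two implemented tiers, dict-based and purely illustrative of the semantics (not the crate's types):

```python
def resolve(entity_override, global_default):
    """Entity-level override wins; otherwise fall back to the global default."""
    return global_default if entity_override is None else entity_override

def resolve_hydro_penalties(overrides, global_defaults):
    """Resolve every hydro penalty field one by one."""
    return {field: resolve(overrides.get(field), default)
            for field, default in global_defaults.items()}

defaults = {"spillage_cost": 0.001, "diversion_cost": 0.5}
print(resolve_hydro_penalties({"spillage_cost": 0.01}, defaults))
# {'spillage_cost': 0.01, 'diversion_cost': 0.5}
```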

HydroPenalties holds 11 pre-resolved f64 fields:

| Field | Unit | Description |
|---|---|---|
| spillage_cost | $/m³/s | Penalty per m³/s of spillage |
| diversion_cost | $/m³/s | Penalty per m³/s exceeding diversion channel limit |
| fpha_turbined_cost | $/MWh | Regularization cost for FPHA turbined flow |
| storage_violation_below_cost | $/hm³ | Penalty per hm³ of storage below minimum |
| filling_target_violation_cost | $/hm³ | Penalty per hm³ below filling target |
| turbined_violation_below_cost | $/m³/s | Penalty per m³/s of turbined flow below minimum |
| outflow_violation_below_cost | $/m³/s | Penalty per m³/s of total outflow below minimum |
| outflow_violation_above_cost | $/m³/s | Penalty per m³/s of total outflow above maximum |
| generation_violation_below_cost | $/MW | Penalty per MW of generation below minimum |
| evaporation_violation_cost | $/mm | Penalty per mm of evaporation constraint violation |
| water_withdrawal_violation_cost | $/m³/s | Penalty per m³/s of water withdrawal violation |

The optional HydroPenaltyOverrides struct mirrors HydroPenalties with all fields as Option<f64>. It is an intermediate type used during case loading; the resolved HydroPenalties (with no Options) is what is stored on each Hydro entity.

Validation errors

ValidationError is the error type returned by SystemBuilder::build():

| Variant | Meaning |
|---|---|
| DuplicateId | Two entities in the same collection share an EntityId |
| InvalidReference | A cross-reference field points to an ID that does not exist |
| CascadeCycle | The hydro downstream_id graph contains a cycle |
| InvalidFillingConfig | A hydro’s filling configuration has non-positive inflow or no entry_stage_id |
| DisconnectedBus | A bus has no lines, generators, or loads (reserved for Phase 2 validation) |
| InvalidPenalty | An entity-level penalty value is invalid (e.g., negative cost) |

All variants implement Display and the standard Error trait. The error message includes the entity type, the offending ID, and (for reference errors) the field name and the missing referenced ID.

```rust
use cobre_core::{EntityId, ValidationError};

let err = ValidationError::InvalidReference {
    source_entity_type: "Hydro",
    source_id: EntityId(3),
    field_name: "bus_id",
    referenced_id: EntityId(99),
    expected_type: "Bus",
};
// "Hydro with id 3 has invalid cross-reference in field 'bus_id': referenced Bus id 99 does not exist"
println!("{err}");
```

Temporal model

The temporal module defines the time structure of a multi-stage stochastic optimization problem. These types are loaded from stages.json by cobre-io and stored on System.

There are 13 types in total: 5 enums and 8 structs.

Enums

| Enum | Variants | Purpose |
|---|---|---|
| `BlockMode` | `Parallel`, `Chronological` | How blocks within a stage relate in the LP |
| `SeasonCycleType` | `Monthly`, `Weekly`, `Custom` | How season IDs map to calendar periods |
| `NoiseMethod` | `Saa`, `Lhs`, `QmcSobol`, `QmcHalton`, `Selective` | Opening tree noise generation algorithm |
| `PolicyGraphType` | `FiniteHorizon`, `Cyclic` | Whether the study horizon is acyclic or infinite-periodic |
| `StageRiskConfig` | `Expectation`, `CVaR { alpha, lambda }` | Per-stage risk measure configuration |

BlockMode::Parallel is the default: blocks are independent sub-periods solved simultaneously, with water balance aggregated across all blocks in the stage. BlockMode::Chronological enables intra-stage storage dynamics (daily cycling).

PolicyGraphType::FiniteHorizon is the minimal viable solver choice: an acyclic stage chain with zero terminal value. Cyclic requires a positive annual_discount_rate for convergence.

Block

A load block within a stage, representing a sub-period with uniform demand and generation characteristics.

| Field | Type | Description |
|---|---|---|
| `index` | `usize` | 0-based index within the parent stage (0, 1, …, n-1) |
| `name` | `String` | Human-readable block label (e.g., “PEAK”, “OFF-PEAK”) |
| `duration_hours` | `f64` | Duration of this block in hours; must be positive |

The block weight (fraction of stage duration) is derived on demand as duration_hours / sum(all block hours in stage) and is not stored.
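Because the weight is derived rather than stored, it is always consistent with the block durations. A minimal sketch of the derivation (function name illustrative):

```rust
// Weight of one block = its duration divided by the stage's total hours.
fn block_weight(duration_hours: f64, all_block_hours: &[f64]) -> f64 {
    let total: f64 = all_block_hours.iter().sum();
    duration_hours / total
}

fn main() {
    // A 744-hour stage split into PEAK (200 h) and OFF-PEAK (544 h).
    let blocks = [200.0, 544.0];
    let w_peak = block_weight(blocks[0], &blocks);
    let w_off = block_weight(blocks[1], &blocks);
    assert!((w_peak - 200.0 / 744.0).abs() < 1e-12);
    assert!((w_peak + w_off - 1.0).abs() < 1e-12); // weights sum to 1
}
```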

StageStateConfig

Flags controlling which variables carry state between stages.

| Field | Type | Default | Description |
|---|---|---|---|
| `storage` | `bool` | `true` | Whether reservoir storage volumes are state variables |
| `inflow_lags` | `bool` | `false` | Whether past inflow realizations (AR lags) are state variables |

inflow_lags must be true when the PAR model order p > 0 and inflow lag cuts are enabled.

ScenarioSourceConfig

Per-stage scenario generation configuration.

| Field | Type | Description |
|---|---|---|
| `branching_factor` | `usize` | Number of noise realizations per stage; must be positive |
| `noise_method` | `NoiseMethod` | Algorithm for generating noise vectors in the opening tree |

branching_factor is the per-stage branching factor for both the opening tree and the forward pass. noise_method is orthogonal to SamplingScheme (which selects the forward-pass noise source); it governs how the backward-pass opening tree is produced.

Stage

A single stage in the multi-stage stochastic problem, partitioning the study horizon into decision periods.

| Field | Type | Description |
|---|---|---|
| `index` | `usize` | 0-based array position after canonical sort |
| `id` | `i32` | Domain-level identifier from stages.json; negative = pre-study |
| `start_date` | `NaiveDate` | Stage start date (inclusive), ISO 8601 |
| `end_date` | `NaiveDate` | Stage end date (exclusive), ISO 8601 |
| `season_id` | `Option<usize>` | Index into `SeasonMap::seasons`; `None` = no seasonal structure |
| `blocks` | `Vec<Block>` | Ordered load blocks; sum of `duration_hours` = stage duration |
| `block_mode` | `BlockMode` | Parallel or chronological block formulation |
| `state_config` | `StageStateConfig` | State variable flags |
| `risk_config` | `StageRiskConfig` | Risk measure for this stage |
| `scenario_config` | `ScenarioSourceConfig` | Branching factor and noise method |

Pre-study stages (negative id) carry only id, start_date, end_date, and season_id. Their blocks, risk_config, and scenario_config fields are unused.

```rust
use chrono::NaiveDate;
use cobre_core::temporal::{
    Block, BlockMode, NoiseMethod, ScenarioSourceConfig, Stage,
    StageRiskConfig, StageStateConfig,
};

let stage = Stage {
    index: 0,
    id: 1,
    start_date: NaiveDate::from_ymd_opt(2024, 1, 1).unwrap(),
    end_date:   NaiveDate::from_ymd_opt(2024, 2, 1).unwrap(),
    season_id:  Some(0),
    blocks: vec![Block {
        index: 0,
        name: "SINGLE".to_string(),
        duration_hours: 744.0,
    }],
    block_mode: BlockMode::Parallel,
    state_config: StageStateConfig { storage: true, inflow_lags: false },
    risk_config: StageRiskConfig::Expectation,
    scenario_config: ScenarioSourceConfig {
        branching_factor: 50,
        noise_method: NoiseMethod::Saa,
    },
};
```

SeasonDefinition and SeasonMap

Season definitions map season IDs to calendar periods for PAR model coefficient lookup and inflow history aggregation.

SeasonDefinition fields:

| Field | Type | Description |
|---|---|---|
| `id` | `usize` | 0-based season index (0-11 for monthly, 0-51 for weekly) |
| `label` | `String` | Human-readable label (e.g., “January”, “Wet Season”) |
| `month_start` | `u32` | Calendar month where the season starts (1-12) |
| `day_start` | `Option<u32>` | Calendar day start; only used for `Custom` cycle type |
| `month_end` | `Option<u32>` | Calendar month end; only used for `Custom` cycle type |
| `day_end` | `Option<u32>` | Calendar day end; only used for `Custom` cycle type |

SeasonMap groups the definitions with a cycle type:

| Field | Type | Description |
|---|---|---|
| `cycle_type` | `SeasonCycleType` | `Monthly` (12 seasons), `Weekly` (52 seasons), or `Custom` |
| `seasons` | `Vec<SeasonDefinition>` | Season entries sorted by `id` |

Transition and PolicyGraph

Transition represents a directed edge in the policy graph:

| Field | Type | Description |
|---|---|---|
| `source_id` | `i32` | Source stage ID |
| `target_id` | `i32` | Target stage ID |
| `probability` | `f64` | Transition probability; outgoing probabilities must sum to 1.0 |
| `annual_discount_rate_override` | `Option<f64>` | Per-transition rate override; `None` = use global rate |

PolicyGraph is the top-level clarity-first representation of the stage graph loaded from stages.json:

| Field | Type | Description |
|---|---|---|
| `graph_type` | `PolicyGraphType` | `FiniteHorizon` (acyclic) or `Cyclic` (infinite periodic) |
| `annual_discount_rate` | `f64` | Global discount rate; 0.0 = no discounting |
| `transitions` | `Vec<Transition>` | Stage transitions forming a linear chain or DAG |
| `season_map` | `Option<SeasonMap>` | Season definitions; `None` when no seasonal structure is needed |

For finite horizon, transitions form a linear chain. For cyclic horizon, at least one transition has source_id >= target_id (a back-edge) and the annual_discount_rate must be positive for convergence.

```rust
use cobre_core::temporal::{PolicyGraph, PolicyGraphType, Transition};

let graph = PolicyGraph {
    graph_type: PolicyGraphType::FiniteHorizon,
    annual_discount_rate: 0.06,
    transitions: vec![
        Transition { source_id: 1, target_id: 2, probability: 1.0,
                     annual_discount_rate_override: None },
        Transition { source_id: 2, target_id: 3, probability: 1.0,
                     annual_discount_rate_override: Some(0.08) },
    ],
    season_map: None,
};
assert_eq!(graph.graph_type, PolicyGraphType::FiniteHorizon);
```

The solver-level HorizonMode enum in cobre-sddp is built from a PolicyGraph at initialization time; it precomputes transition maps, cycle detection, and discount factors for efficient runtime dispatch. The PolicyGraph in cobre-core is the user-facing clarity-first representation.
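The back-edge rule for cyclic graphs is simple to state in code. A self-contained sketch (the free function is illustrative; cobre-sddp's actual cycle detection lives inside `HorizonMode` construction):

```rust
// A directed edge in the policy graph, mirroring the Transition fields above.
struct Transition {
    source_id: i32,
    target_id: i32,
}

// Cyclic horizon: at least one transition points backward (or to itself).
fn has_back_edge(transitions: &[Transition]) -> bool {
    transitions.iter().any(|t| t.source_id >= t.target_id)
}

fn main() {
    // Finite horizon: a strictly forward linear chain.
    let chain = [
        Transition { source_id: 1, target_id: 2 },
        Transition { source_id: 2, target_id: 3 },
    ];
    assert!(!has_back_edge(&chain));

    // Annual cycle: the last stage wraps back to the first.
    let cyclic = [
        Transition { source_id: 1, target_id: 2 },
        Transition { source_id: 2, target_id: 1 },
    ];
    assert!(has_back_edge(&cyclic));
}
```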

Scenario pipeline types

The scenario module holds clarity-first data containers for the raw scenario pipeline parameters loaded from input files. These are raw input-facing types; performance-adapted views (pre-computed LP arrays, Cholesky-decomposed matrices) belong in downstream crates (cobre-stochastic, cobre-sddp).

SamplingScheme and ScenarioSource

SamplingScheme selects the forward-pass noise source:

| Variant | Description |
|---|---|
| `InSample` | Forward pass reuses the opening tree generated for the backward pass |
| `External` | Forward pass draws from an externally supplied scenario file |
| `Historical` | Forward pass replays historical inflow realizations |

InSample is the default and the minimal viable solver choice.

ScenarioSource is the top-level scenario configuration loaded from stages.json:

| Field | Type | Description |
|---|---|---|
| `sampling_scheme` | `SamplingScheme` | Noise source for the forward pass |
| `seed` | `Option<i64>` | Random seed for reproducible generation; `None` = OS entropy |
| `selection_mode` | `Option<ExternalSelectionMode>` | Only used when `sampling_scheme` is `External` |

ExternalSelectionMode has two variants: Random (draw uniformly at random) and Sequential (replay in file order, cycling when the end is reached).

InflowModel

Raw PAR(p) model parameters for a single (hydro, stage) pair, loaded from inflow_seasonal_stats.parquet and inflow_ar_coefficients.parquet.

| Field | Type | Description |
|---|---|---|
| `hydro_id` | `EntityId` | Hydro plant this model belongs to |
| `stage_id` | `i32` | Stage ID this model applies to |
| `mean_m3s` | `f64` | Seasonal mean inflow μ [m³/s] |
| `std_m3s` | `f64` | Seasonal standard deviation σ [m³/s] |
| `ar_order` | `usize` | AR model order p; zero means white-noise inflow |
| `ar_coefficients` | `Vec<f64>` | AR lag coefficients [ψ₁, ψ₂, …, ψₚ]; length = `ar_order` |

```rust
use cobre_core::{EntityId, scenario::InflowModel};

let model = InflowModel {
    hydro_id: EntityId(1),
    stage_id: 3,
    mean_m3s: 150.0,
    std_m3s: 30.0,
    ar_order: 2,
    ar_coefficients: vec![0.45, 0.22],
};
assert_eq!(model.ar_order, 2);
assert_eq!(model.ar_coefficients.len(), 2);
```

System holds a Vec<InflowModel> sorted by (hydro_id, stage_id) for declaration-order invariance.

LoadModel

Raw load seasonal statistics for a single (bus, stage) pair, loaded from load_seasonal_stats.parquet.

| Field | Type | Description |
|---|---|---|
| `bus_id` | `EntityId` | Bus this load model belongs to |
| `stage_id` | `i32` | Stage ID this model applies to |
| `mean_mw` | `f64` | Seasonal mean load demand [MW] |
| `std_mw` | `f64` | Seasonal standard deviation of load demand [MW] |

Load typically has no AR structure, so no lag coefficients are stored. System holds a Vec<LoadModel> sorted by (bus_id, stage_id).

CorrelationModel

CorrelationModel is the top-level correlation configuration loaded from correlation.json. It holds named profiles and an optional stage-to-profile schedule.

The type hierarchy is:

CorrelationModel
  └── profiles: BTreeMap<String, CorrelationProfile>
        └── groups: Vec<CorrelationGroup>
              ├── entities: Vec<CorrelationEntity>
              └── matrix: Vec<Vec<f64>>   (symmetric, row-major)

CorrelationEntity carries entity_type: String (currently always "inflow") and id: EntityId. Using String rather than an enum preserves forward compatibility when additional stochastic variable types are added.

profiles uses BTreeMap rather than HashMap to preserve deterministic iteration order (declaration-order invariance). Cholesky decomposition of the correlation matrices is NOT performed here; that belongs to cobre-stochastic.

```rust
use std::collections::BTreeMap;
use cobre_core::{EntityId, scenario::{
    CorrelationEntity, CorrelationGroup, CorrelationModel, CorrelationProfile,
}};

let mut profiles = BTreeMap::new();
profiles.insert("default".to_string(), CorrelationProfile {
    groups: vec![CorrelationGroup {
        name: "All".to_string(),
        entities: vec![
            CorrelationEntity { entity_type: "inflow".to_string(), id: EntityId(1) },
            CorrelationEntity { entity_type: "inflow".to_string(), id: EntityId(2) },
        ],
        matrix: vec![vec![1.0, 0.8], vec![0.8, 1.0]],
    }],
});

let model = CorrelationModel {
    method: "cholesky".to_string(),
    profiles,
    schedule: vec![],
};
assert!(model.profiles.contains_key("default"));
```

When schedule is empty, a single profile (typically named "default") applies to all stages. When schedule is non-empty, each entry maps a stage index to an active profile name.
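The profile-selection rule can be sketched in a few lines. The `(stage, profile)` tuple shape of the schedule entries is an assumption for illustration; only the fallback semantics match the text above:

```rust
// Returns the name of the profile active for a given stage index.
// An empty schedule, or a stage with no entry, falls back to "default".
fn active_profile<'a>(schedule: &'a [(usize, String)], stage_idx: usize) -> &'a str {
    schedule
        .iter()
        .find(|(s, _)| *s == stage_idx)
        .map(|(_, name)| name.as_str())
        .unwrap_or("default")
}

fn main() {
    // Stage 12 switches to a hypothetical "wet_season" profile.
    let schedule = vec![(12usize, "wet_season".to_string())];
    assert_eq!(active_profile(&schedule, 12), "wet_season");
    assert_eq!(active_profile(&schedule, 0), "default");
    assert_eq!(active_profile(&[], 5), "default"); // empty schedule
}
```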

Initial conditions and constraints

InitialConditions

InitialConditions holds the reservoir storage levels at the start of the study. It is loaded from initial_conditions.json by cobre-io and stored on System.

Two arrays are kept separate because filling hydros can have an initial volume below dead storage (min_storage_hm3), which is not a valid operating level for regular hydros:

| Field | Type | Description |
|---|---|---|
| `storage` | `Vec<HydroStorage>` | Initial storage for operating hydros [hm³] |
| `filling_storage` | `Vec<HydroStorage>` | Initial storage for filling hydros [hm³]; below dead volume |

HydroStorage carries hydro_id: EntityId and value_hm3: f64. A hydro must appear in exactly one of the two arrays. Both arrays are sorted by hydro_id after loading for declaration-order invariance.

```rust
use cobre_core::{EntityId, InitialConditions, HydroStorage};

let ic = InitialConditions {
    storage: vec![
        HydroStorage { hydro_id: EntityId(0), value_hm3: 15_000.0 },
        HydroStorage { hydro_id: EntityId(1), value_hm3:  8_500.0 },
    ],
    filling_storage: vec![
        HydroStorage { hydro_id: EntityId(10), value_hm3: 200.0 },
    ],
};

assert_eq!(ic.storage.len(), 2);
assert_eq!(ic.filling_storage.len(), 1);
```

GenericConstraint

GenericConstraint represents a user-defined linear constraint over LP variables, loaded from generic_constraints.json and stored in System::generic_constraints. The expression parser (string to ConstraintExpression) and referential validation live in cobre-io, not here.

| Field | Type | Description |
|---|---|---|
| `id` | `EntityId` | Unique constraint identifier |
| `name` | `String` | Short name used in reports and log output |
| `description` | `Option<String>` | Optional human-readable description |
| `expression` | `ConstraintExpression` | Parsed left-hand-side linear expression |
| `sense` | `ConstraintSense` | Comparison sense: `GreaterEqual`, `LessEqual`, `Equal` |
| `slack` | `SlackConfig` | Slack variable configuration |

ConstraintExpression holds a Vec<LinearTerm>. Each LinearTerm has a coefficient: f64 and a variable: VariableRef.

VariableRef

VariableRef is an enum with 19 variants covering all LP variable types defined in the data model. Each variant names the variable type and carries the entity ID. For block-specific variables, block_id is None to sum over all blocks or Some(i) to reference block i specifically.

| Category | Variants |
|---|---|
| Hydro | `HydroStorage`, `HydroTurbined`, `HydroSpillage`, `HydroDiversion`, `HydroOutflow`, `HydroGeneration`, `HydroEvaporation`, `HydroWithdrawal` |
| Thermal | `ThermalGeneration` |
| Line | `LineDirect`, `LineReverse` |
| Bus | `BusDeficit`, `BusExcess` |
| Pumping | `PumpingFlow`, `PumpingPower` |
| Contract | `ContractImport`, `ContractExport` |
| NCS | `NonControllableGeneration`, `NonControllableCurtailment` |

HydroStorage, HydroEvaporation, and HydroWithdrawal are stage-level variables (no block_id). All other hydro variables and all thermal, line, bus, pumping, contract, and NCS variables are block-specific (block_id field present).

SlackConfig

Controls whether a soft constraint with a penalty cost is added to the LP:

| Field | Type | Description |
|---|---|---|
| `enabled` | `bool` | If `true`, adds a slack variable allowing constraint violation |
| `penalty` | `Option<f64>` | Penalty per unit of violation; must be `Some(positive)` if `enabled` |

```rust
use cobre_core::{
    EntityId, GenericConstraint, ConstraintExpression, ConstraintSense,
    LinearTerm, SlackConfig, VariableRef,
};

let expr = ConstraintExpression {
    terms: vec![
        LinearTerm {
            coefficient: 1.0,
            variable: VariableRef::HydroGeneration {
                hydro_id: EntityId(10),
                block_id: None,   // sum over all blocks
            },
        },
        LinearTerm {
            coefficient: 1.0,
            variable: VariableRef::HydroGeneration {
                hydro_id: EntityId(11),
                block_id: None,
            },
        },
    ],
};

let gc = GenericConstraint {
    id: EntityId(0),
    name: "min_hydro_total".to_string(),
    description: Some("Minimum total hydro generation".to_string()),
    expression: expr,
    sense: ConstraintSense::GreaterEqual,
    slack: SlackConfig { enabled: true, penalty: Some(5_000.0) },
};

assert_eq!(gc.expression.terms.len(), 2);
```

Resolved penalties and bounds

The resolved module holds pre-resolved penalty and bound tables that provide O(1) lookup for LP builders and solvers.

Design: flat Vec with 2D indexing

During input loading, the three-tier cascade (global defaults -> entity overrides -> stage overrides) is evaluated once by cobre-io. The results are stored in flat Vec<T> arrays with manual 2D indexing:

data[entity_idx * n_stages + stage_idx]

This layout gives cache-friendly sequential access when iterating over stages for a fixed entity (the common inner loop pattern in LP construction). No re-evaluation of the cascade is ever required at solve time; every penalty or bound lookup is a single array index operation.
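The layout and accessors can be sketched generically. The `Table` type below is illustrative (the real tables are `ResolvedPenalties` and `ResolvedBounds`); only the indexing formula matches the text:

```rust
// Flat Vec with manual 2D indexing: data[entity_idx * n_stages + stage_idx].
struct Table<T> {
    n_stages: usize,
    data: Vec<T>, // length = n_entities * n_stages
}

impl<T: Copy> Table<T> {
    fn new(n_entities: usize, n_stages: usize, fill: T) -> Self {
        Table { n_stages, data: vec![fill; n_entities * n_stages] }
    }

    // O(1) read: a single multiply-add index computation.
    fn get(&self, entity_idx: usize, stage_idx: usize) -> T {
        self.data[entity_idx * self.n_stages + stage_idx]
    }

    // Mutable access, as used by cobre-io during case loading.
    fn set(&mut self, entity_idx: usize, stage_idx: usize, value: T) {
        self.data[entity_idx * self.n_stages + stage_idx] = value;
    }
}

fn main() {
    let mut t = Table::new(3, 5, 0.0f64);
    t.set(1, 3, 42.0);
    assert_eq!(t.get(1, 3), 42.0);
    // Iterating stages 0..5 for entity 1 touches data[5..10] sequentially,
    // which is the cache-friendly inner loop of LP construction.
}
```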

ResolvedPenalties

ResolvedPenalties holds per-(entity, stage) penalty values for all four entity types that carry stage-varying penalties: hydros, buses, lines, and non-controllable sources.

Per-(entity, stage) penalty structs:

| Struct | Fields | Description |
|---|---|---|
| `HydroStagePenalties` | 11 `f64` fields | All hydro penalty costs for one (hydro, stage) pair |
| `BusStagePenalties` | `excess_cost: f64` | Bus excess cost for one (bus, stage) pair |
| `LineStagePenalties` | `exchange_cost: f64` | Line flow regularization cost for one (line, stage) pair |
| `NcsStagePenalties` | `curtailment_cost: f64` | NCS curtailment cost for one (ncs, stage) pair |

Bus deficit segments are NOT stage-varying. The piecewise-linear deficit structure is fixed at the entity or global level, so BusStagePenalties contains only excess_cost.

All four per-stage penalty structs implement Copy, so they can be passed by value on hot paths.

```rust
use cobre_core::resolved::{
    BusStagePenalties, HydroStagePenalties, LineStagePenalties,
    NcsStagePenalties, ResolvedPenalties,
};

// Allocate a 3-hydro, 2-bus, 1-line, 1-ncs table for 5 stages.
let table = ResolvedPenalties::new(
    3, 2, 1, 1, 5,
    HydroStagePenalties { spillage_cost: 0.01, diversion_cost: 0.02,
                          fpha_turbined_cost: 0.03,
                          storage_violation_below_cost: 1000.0,
                          filling_target_violation_cost: 5000.0,
                          turbined_violation_below_cost: 500.0,
                          outflow_violation_below_cost: 500.0,
                          outflow_violation_above_cost: 500.0,
                          generation_violation_below_cost: 500.0,
                          evaporation_violation_cost: 500.0,
                          water_withdrawal_violation_cost: 500.0 },
    BusStagePenalties { excess_cost: 100.0 },
    LineStagePenalties { exchange_cost: 5.0 },
    NcsStagePenalties { curtailment_cost: 50.0 },
);

// O(1) lookup: hydro 1, stage 3
let p = table.hydro_penalties(1, 3);
assert!((p.spillage_cost - 0.01).abs() < f64::EPSILON);
```

ResolvedBounds

ResolvedBounds holds per-(entity, stage) bound values for five entity types: hydros, thermals, lines, pumping stations, and energy contracts.

Per-(entity, stage) bound structs:

| Struct | Fields | Description |
|---|---|---|
| `HydroStageBounds` | 11 fields (see table below) | All hydro bounds for one (hydro, stage) pair |
| `ThermalStageBounds` | `min_generation_mw`, `max_generation_mw` | Thermal generation bounds [MW] |
| `LineStageBounds` | `direct_mw`, `reverse_mw` | Transmission capacity bounds [MW] |
| `PumpingStageBounds` | `min_flow_m3s`, `max_flow_m3s` | Pumping flow bounds [m³/s] |
| `ContractStageBounds` | `min_mw`, `max_mw`, `price_per_mwh` | Contract bounds [MW] and effective price |

HydroStageBounds has 11 fields:

| Field | Unit | Description |
|---|---|---|
| `min_storage_hm3` | hm³ | Dead volume (soft lower bound) |
| `max_storage_hm3` | hm³ | Physical reservoir capacity (hard upper bound) |
| `min_turbined_m3s` | m³/s | Minimum turbined flow (soft lower bound) |
| `max_turbined_m3s` | m³/s | Maximum turbined flow (hard upper bound) |
| `min_outflow_m3s` | m³/s | Environmental flow requirement (soft lower bound) |
| `max_outflow_m3s` | m³/s | Flood-control limit (soft upper bound); `None` = unbounded |
| `min_generation_mw` | MW | Minimum electrical generation (soft lower bound) |
| `max_generation_mw` | MW | Maximum electrical generation (hard upper bound) |
| `max_diversion_m3s` | m³/s | Diversion channel capacity (hard upper bound); `None` = no diversion |
| `filling_inflow_m3s` | m³/s | Filling inflow retained during filling stages; default 0.0 |
| `water_withdrawal_m3s` | m³/s | Water withdrawal per stage; positive = removed, negative = added |

```rust
use cobre_core::resolved::{
    ContractStageBounds, HydroStageBounds, LineStageBounds,
    PumpingStageBounds, ResolvedBounds, ThermalStageBounds,
};

// Allocate a table for 2 hydros, 1 thermal, 1 line, 0 pumping, 0 contracts, 3 stages.
let table = ResolvedBounds::new(
    2, 1, 1, 0, 0, 3,
    HydroStageBounds { min_storage_hm3: 10.0, max_storage_hm3: 200.0,
                       min_turbined_m3s: 0.0,  max_turbined_m3s: 500.0,
                       min_outflow_m3s: 5.0,   max_outflow_m3s: None,
                       min_generation_mw: 0.0, max_generation_mw: 100.0,
                       max_diversion_m3s: None,
                       filling_inflow_m3s: 0.0, water_withdrawal_m3s: 0.0 },
    ThermalStageBounds { min_generation_mw: 50.0, max_generation_mw: 400.0 },
    LineStageBounds { direct_mw: 1000.0, reverse_mw: 800.0 },
    PumpingStageBounds { min_flow_m3s: 0.0, max_flow_m3s: 0.0 },
    ContractStageBounds { min_mw: 0.0, max_mw: 0.0, price_per_mwh: 0.0 },
);

// O(1) lookup: hydro 0, stage 2
let b = table.hydro_bounds(0, 2);
assert!((b.max_storage_hm3 - 200.0).abs() < f64::EPSILON);
assert!(b.max_outflow_m3s.is_none());
```

Both tables expose _mut accessor variants (e.g., hydro_penalties_mut, hydro_bounds_mut) that return &mut T for in-place updates during case loading. These are used exclusively by cobre-io; all other crates use the immutable read accessors.

Serde feature flag

cobre-core ships with an optional serde feature that enables serde::Serialize and serde::Deserialize for all public types. The feature is disabled by default to keep the minimal build free of serialization dependencies.

When to enable

| Use case | Enable? |
|---|---|
| Reading cobre-core as a pure data model library | No |
| Building cobre-io (JSON input loading) | Yes |
| MPI broadcast via postcard in cobre-comm | Yes |
| Checkpoint serialization in cobre-sddp | Yes |
| Python bindings in cobre-python | Yes |
| Writing tests that inspect values as JSON | Yes |

Enabling the feature

```toml
# Cargo.toml
[dependencies]
cobre-core = { version = "0.x", features = ["serde"] }
```

Or from the command line:

```sh
cargo build --features cobre-core/serde
```

Enabling serde also activates chrono/serde, which is required because Stage carries NaiveDate fields that must be serializable for JSON input loading and MPI broadcast.

How it works

Every public type in cobre-core carries a #[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))] attribute. When the feature is inactive, the derive is omitted entirely and the serde dependency is not compiled. There is no runtime cost and no API surface change when the feature is disabled.

All downstream Cobre crates that perform serialization declare cobre-core/serde as a required dependency. The workspace ensures that only one copy of cobre-core is compiled, with the feature union of all crates that request it.

Public API summary

System exposes four categories of methods:

- **Collection accessors** (return `&[T]` in canonical ID order): `buses()`, `lines()`, `hydros()`, `thermals()`, `pumping_stations()`, `contracts()`, `non_controllable_sources()`
- **Count queries** (return `usize`): `n_buses()`, `n_lines()`, `n_hydros()`, `n_thermals()`, `n_pumping_stations()`, `n_contracts()`, `n_non_controllable_sources()`
- **Entity lookup by ID** (return `Option<&T>`): `bus(id)`, `line(id)`, `hydro(id)`, `thermal(id)`, `pumping_station(id)`, `contract(id)`, `non_controllable_source(id)` – each is O(1) via a `HashMap<EntityId, usize>` index into the canonical collection.
- **Topology accessors** (return references to derived structures): `cascade()` returns `&CascadeTopology`, `network()` returns `&NetworkTopology`.
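The O(1) lookup pattern behind the ID accessors can be sketched with a toy registry. The types here are illustrative, not the cobre-core internals; only the Vec-plus-HashMap-index design matches the description:

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct EntityId(u32);

struct Bus {
    id: EntityId,
    name: String,
}

struct Registry {
    buses: Vec<Bus>,                 // canonical ID-sorted order
    index: HashMap<EntityId, usize>, // id -> position in `buses`
}

impl Registry {
    fn new(mut buses: Vec<Bus>) -> Self {
        // Canonical sort makes slice order deterministic regardless of input order.
        buses.sort_by_key(|b| b.id.0);
        let index = buses.iter().enumerate().map(|(i, b)| (b.id, i)).collect();
        Registry { buses, index }
    }

    // O(1): hash lookup, then direct slice index.
    fn bus(&self, id: EntityId) -> Option<&Bus> {
        self.index.get(&id).map(|&i| &self.buses[i])
    }
}

fn main() {
    let reg = Registry::new(vec![
        Bus { id: EntityId(2), name: "NORTH".into() },
        Bus { id: EntityId(1), name: "SOUTH".into() },
    ]);
    assert_eq!(reg.bus(EntityId(1)).unwrap().name, "SOUTH");
    assert!(reg.bus(EntityId(99)).is_none()); // unknown ID -> None
}
```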

For full method signatures and rustdoc, run:

```sh
cargo doc --workspace --no-deps --open
```

For the theoretical underpinning of the entity model, generation models, and penalty system, see the methodology reference.

cobre-io

experimental

cobre-io is the case directory loader for the Cobre ecosystem. It provides the load_case function, which reads a case directory from disk and produces a fully-validated `cobre_core::System` ready for use by downstream solver and analysis crates.

The crate owns the entire input path: JSON and Parquet parsing, five layers of validation, three-tier penalty and bound resolution, and scenario model assembly. No other crate reads input files. Every crate downstream of cobre-io receives a structurally sound System with all foreign keys resolved and all domain rules verified.

Module overview

| Module | Purpose |
|---|---|
| `config` | `Config` struct and `parse_config` — reads config.json |
| `system` | Entity parsers for buses, lines, hydros, thermals, and stub types |
| `extensions` | Hydro production model extensions (FPHA hyperplanes, geometry tables) |
| `scenarios` | Inflow and load statistical model loading and assembly |
| `constraints` | Stage-varying bound and penalty override loading from Parquet |
| `penalties` | Global penalty defaults parser (penalties.json) |
| `stages` | Stage sequence and policy graph loading (stages.json) |
| `initial_conditions` | Reservoir initial storage loading |
| `validation` | Five-layer validation pipeline and `ValidationContext` |
| `resolution` | Three-tier penalty and bound resolution into O(1) lookup tables |
| `pipeline` | Orchestrator that wires all layers into a single `load_case` call |
| `report` | Structured validation report generation |
| `broadcast` | `System` serialization and deserialization for MPI broadcast |
| `output` | Output result types for simulation and training data |

load_case

```rust
pub fn load_case(path: &Path) -> Result<System, LoadError>
```

Loads a power system case directory and returns a fully-validated System.

path must point to the case root directory. That directory must contain config.json, penalties.json, stages.json, initial_conditions.json, the system/ subdirectory, the scenarios/ subdirectory, and the constraints/ subdirectory. See Case directory structure for the full layout.

load_case executes the following sequence:

  1. Layer 1 — Structural validation. Checks that all required files exist on disk and records which optional files are present. Missing required files produce `LoadError::ConstraintError` entries. Missing optional files are silently noted in the file manifest without error.
  2. Layer 2 — Schema validation. Parses every present file, verifies required fields, types, and value ranges. Returns `LoadError::IoError` for read failures and `LoadError::ParseError` for malformed JSON or invalid Parquet. Schema violations produce `LoadError::ConstraintError` entries.
  3. Layer 3 — Referential integrity. Verifies that every cross-entity ID reference resolves to a known entity. Dangling foreign keys produce `LoadError::ConstraintError` entries.
  4. Layer 4 — Dimensional consistency. Checks that optional per-entity files provide coverage for every entity that needs them (for example, that inflow statistical parameters exist for every hydro plant). Coverage gaps produce `LoadError::ConstraintError` entries.
  5. Layer 5 — Semantic validation. Enforces domain business rules: acyclic hydro cascade topology, penalty ordering (lower tiers may not exceed upper), PAR model stationarity, stage count consistency. Violations produce `LoadError::ConstraintError` entries.
  6. Resolution. After all five layers pass, three-tier penalty and bound resolution is performed. The result is pre-resolved lookup tables embedded in the System for O(1) solver access.
  7. Scenario assembly. Inflow and load statistical models are assembled from the parsed seasonal statistics and autoregressive coefficients.
  8. System construction. SystemBuilder::build() is called with the fully resolved data. Any remaining structural violations (duplicate IDs, broken cascade) surface as a final `LoadError::ConstraintError`.

All validation diagnostics across Layers 1 through 5 are collected by ValidationContext before failing. When load_case returns an error, the error message contains every problem found, not just the first one.
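The collect-then-fail pattern can be sketched with a toy context. The types below are illustrative stand-ins for `ValidationContext` and the error formatting, not the actual cobre-io API:

```rust
// Accumulates diagnostics across all validation layers; the load only
// fails once, with every problem in the message.
#[derive(Default)]
struct Context {
    errors: Vec<String>,
}

impl Context {
    fn record(&mut self, msg: impl Into<String>) {
        self.errors.push(msg.into());
    }

    // Called after all layers have run: Ok if clean, else every diagnostic.
    fn finish(self) -> Result<(), String> {
        if self.errors.is_empty() {
            Ok(())
        } else {
            Err(self.errors.join("\n"))
        }
    }
}

fn main() {
    let mut ctx = Context::default();
    ctx.record("missing file: penalties.json");         // Layer 1 finding
    ctx.record("hydro 3: bus_id 99 does not exist");    // Layer 3 finding
    let err = ctx.finish().unwrap_err();
    // Both diagnostics survive into the single error message.
    assert!(err.contains("penalties.json"));
    assert!(err.contains("bus_id 99"));
}
```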

Minimal example

```rust
use cobre_io::load_case;
use std::path::Path;

let system = load_case(Path::new("path/to/my_case"))?;
println!("Loaded {} buses, {} hydros", system.n_buses(), system.n_hydros());
```

Return type

On success, load_case returns a cobre_core::System — an immutable, Send + Sync container holding all entity registries, topology graphs, pre-resolved penalty and bound tables, scenario models, and the stage sequence. All entity collections are in canonical ID-sorted order.

On failure, load_case returns a LoadError. See Error handling for the full set of variants and when each occurs.

Case directory structure

A valid case directory has the following layout:

my_case/
├── config.json                          # Solver configuration (required)
├── penalties.json                       # Global penalty defaults (required)
├── stages.json                          # Stage sequence and policy graph (required)
├── initial_conditions.json              # Reservoir storage at study start (required)
├── system/
│   ├── buses.json                       # Electrical buses (required)
│   ├── lines.json                       # Transmission lines (required)
│   ├── hydros.json                      # Hydro plants (required)
│   ├── thermals.json                    # Thermal plants (required)
│   ├── non_controllable_sources.json    # Intermittent sources (optional)
│   ├── pumping_stations.json            # Pumping stations (optional)
│   └── energy_contracts.json           # Bilateral contracts (optional)
├── extensions/
│   ├── hydro_geometry.parquet           # Reservoir geometry tables (optional)
│   ├── production_models.json           # FPHA production function configs (optional)
│   └── fpha_hyperplanes.parquet         # FPHA hyperplane coefficients (optional)
├── scenarios/
│   ├── inflow_seasonal_stats.parquet    # PAR model seasonal statistics (required)
│   ├── inflow_ar_coefficients.parquet   # PAR autoregressive coefficients (required)
│   ├── inflow_history.parquet           # Historical inflow series (optional)
│   ├── load_seasonal_stats.parquet      # Load model seasonal statistics (optional)
│   ├── load_factors.parquet             # Load scaling factors (optional)
│   ├── correlation.json                 # Cross-series correlation model (optional)
│   └── external_scenarios.parquet       # Pre-generated external scenarios (optional)
└── constraints/
    ├── hydro_bounds.parquet             # Stage-varying hydro bounds (optional)
    ├── thermal_bounds.parquet           # Stage-varying thermal bounds (optional)
    ├── line_bounds.parquet              # Stage-varying line bounds (optional)
    ├── pumping_bounds.parquet           # Stage-varying pumping bounds (optional)
    ├── contract_bounds.parquet          # Stage-varying contract bounds (optional)
    ├── generic_constraints.json         # User-defined LP constraints (optional)
    ├── generic_constraint_bounds.parquet # Bounds for generic constraints (optional)
    ├── exchange_factors.parquet         # Block exchange factors (optional)
    ├── penalty_overrides_hydro.parquet  # Stage-varying hydro penalty overrides (optional)
    ├── penalty_overrides_bus.parquet    # Stage-varying bus penalty overrides (optional)
    ├── penalty_overrides_line.parquet   # Stage-varying line penalty overrides (optional)
    └── penalty_overrides_ncs.parquet    # Stage-varying NCS penalty overrides (optional)

For the full JSON and Parquet schemas for each file, see the Case Format Reference.

Validation pipeline

The five layers run in sequence. Earlier layers gate later ones: if Layer 1 finds a missing required file, the file is not parsed in Layer 2. All diagnostics across all layers are collected before returning.

Case directory
      │
      ▼
┌─────────────────────────────────────────────────┐
│  Layer 1 — Structural                           │
│  Does each required file exist on disk?         │
│  Records optional-file presence in FileManifest.│
└────────────────────┬────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────┐
│  Layer 2 — Schema                               │
│  Parse JSON and Parquet. Check required fields, │
│  types, and value ranges. Collect schema errors.│
└────────────────────┬────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────┐
│  Layer 3 — Referential integrity                │
│  All cross-entity ID references must resolve.   │
│  (e.g., hydro.bus_id must exist in buses list)  │
└────────────────────┬────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────┐
│  Layer 4 — Dimensional consistency              │
│  Optional per-entity files must cover every     │
│  entity that needs them. (e.g., inflow stats    │
│  must exist for every hydro plant)              │
└────────────────────┬────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────┐
│  Layer 5 — Semantic                             │
│  Domain business rules: acyclic cascade,        │
│  penalty ordering, PAR stationarity, stage      │
│  count consistency, and other invariants.       │
└────────────────────┬────────────────────────────┘
                     │
                     ▼ (all layers pass)
              Resolution + Assembly
              System construction
                     │
                     ▼
              Ok(System)

What each layer checks

Layer 1 (Structural): Verifies that the four root-level required files (config.json, penalties.json, stages.json, initial_conditions.json) and the four required entity files (system/buses.json, system/lines.json, system/hydros.json, system/thermals.json) exist. Optional files are noted in the FileManifest but their absence is not an error. The FileManifest is passed to Layer 2 so that optional-file parsers are only called when the files are present.

Layer 2 (Schema): Parses every file found by Layer 1. For JSON files, deserialization uses serde with strict field requirements — missing required fields and unknown fields surface immediately. For Parquet files, column presence and data types are verified. Post-deserialization checks catch domain range violations (for example, negative capacity values) that serde cannot express. All parse and schema errors are collected by ValidationContext.

Layer 3 (Referential integrity): Checks all cross-entity foreign-key references. Examples: every hydro.bus_id must name a bus in the bus registry; every line.source_bus_id and line.target_bus_id must resolve; every pumping_station.source_hydro_id and destination_hydro_id must resolve; every bound override row’s entity ID must match a known entity. All broken references are collected before returning.
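The collect-all referential check can be sketched as follows. This is an illustrative fragment, not the actual cobre-io code: the struct shape, function name, and message format are hypothetical, but the pattern matches the documented behavior (every broken reference is collected before returning).

```rust
use std::collections::HashSet;

// Hypothetical sketch of a Layer-3 check: every hydro's bus_id must
// resolve against the bus registry. All failures are collected, not
// short-circuited on the first error.
struct Hydro {
    id: String,
    bus_id: String,
}

fn check_hydro_bus_refs(hydros: &[Hydro], bus_ids: &HashSet<String>) -> Vec<String> {
    let mut errors = Vec::new();
    for h in hydros {
        if !bus_ids.contains(&h.bus_id) {
            // Record the broken reference and keep scanning.
            errors.push(format!(
                "Hydro '{}' references non-existent bus '{}'",
                h.id, h.bus_id
            ));
        }
    }
    errors
}

fn main() {
    let buses: HashSet<String> = ["B1".to_string(), "B2".to_string()].into();
    let hydros = vec![
        Hydro { id: "H1".into(), bus_id: "B1".into() },
        Hydro { id: "H2".into(), bus_id: "BUS_99".into() },
    ];
    let errs = check_hydro_bus_refs(&hydros, &buses);
    assert_eq!(errs.len(), 1); // only H2 is broken
    println!("{}", errs[0]);
}
```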

Layer 4 (Dimensional consistency): Verifies cross-file entity coverage. When scenarios/inflow_seasonal_stats.parquet is present, every hydro plant must have at least one row of statistics. When scenarios/inflow_ar_coefficients.parquet is present, the AR order must be consistent with the number of coefficient rows. Other coverage checks ensure that optional per-entity Parquet files do not silently omit entities.

Layer 5 (Semantic): Enforces domain invariants that span multiple files or require reasoning about the system as a whole:

  • Acyclic cascade. The hydro downstream_id graph must be a directed forest (no cycles). A topological sort detects cycles.
  • Penalty ordering. Violation penalty tiers must be ordered: lower-tier penalties may not exceed upper-tier penalties for the same entity.
  • PAR model stationarity. Seasonal inflow statistics must satisfy the stationarity requirements of the PAR(p) model.
  • Stage count consistency. The number of stages must match across stages.json, scenario data, and any stage-varying Parquet files.

Penalty and bound resolution

After all five validation layers pass, load_case resolves the three-tier penalty and bound cascades into flat lookup tables embedded in the System.

Three-tier cascade

Penalty and bound values follow a three-tier precedence cascade:

Tier 1 — Global defaults (penalties.json)
    ↓ overridden by
Tier 2 — Entity-level overrides (system/*.json fields)
    ↓ overridden by
Tier 3 — Stage-varying overrides (constraints/penalty_overrides_*.parquet)

Tier-1 and tier-2 resolution happen during entity parsing (Layer 2). By the time the resolution step runs, each entity struct already holds its tier-2 resolved value in the relevant penalty or bound field.

The resolution step applies tier-3 stage-varying overrides from the optional Parquet files. For each (entity, stage) pair, the resolved value is:

  • The tier-3 override from the Parquet row, if a row exists for that pair.
  • Otherwise, the tier-2 value already stored in the entity struct.

Sparse expansion

Tier-3 overrides are stored sparsely: a Parquet row only needs to exist for stages where the override differs from the entity-level value. The resolution step expands this sparse representation into a dense [n_entities × n_stages] array for O(1) solver lookup at construction time.
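The sparse-to-dense expansion can be sketched as a small fallback loop. This is an illustrative sketch, not the cobre-io implementation: the function name and the HashMap-based sparse representation are assumptions, but the fallback rule (tier-3 row if present, otherwise the entity's tier-2 value) and the dense `[n_entities × n_stages]` layout follow the text above.

```rust
use std::collections::HashMap;

// Hypothetical sketch: expand sparse tier-3 overrides into a dense
// entity-major table. `entity_defaults` holds the tier-2 value already
// resolved on each entity struct during Layer 2 parsing.
fn expand_overrides(
    entity_defaults: &[f64],                  // one tier-2 value per entity
    n_stages: usize,
    overrides: &HashMap<(usize, usize), f64>, // sparse (entity_idx, stage_idx) -> tier-3 value
) -> Vec<f64> {
    let n_entities = entity_defaults.len();
    let mut dense = vec![0.0; n_entities * n_stages];
    for e in 0..n_entities {
        for s in 0..n_stages {
            // Tier-3 wins if a row exists; otherwise fall back to tier-2.
            dense[e * n_stages + s] = *overrides.get(&(e, s)).unwrap_or(&entity_defaults[e]);
        }
    }
    dense
}

fn main() {
    let defaults = [10.0, 20.0]; // tier-2 values for entities 0 and 1
    let mut ov = HashMap::new();
    ov.insert((1, 2), 99.0); // entity 1 overridden at stage 2 only
    let dense = expand_overrides(&defaults, 3, &ov);
    assert_eq!(dense, vec![10.0, 10.0, 10.0, 20.0, 20.0, 99.0]);
}
```

After this expansion, a solver lookup is a single index computation with no branching on tier logic.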

Result

Resolution produces two pre-resolved tables stored on System:

  • ResolvedPenalties — per-(entity, stage) penalty values for buses, hydros, lines, and non-controllable sources.
  • ResolvedBounds — per-(entity, stage) upper and lower bound values for hydros, thermals, lines, pumping stations, and energy contracts.

Both tables use dense flat arrays with positional entity indexing (entity position in the canonical ID-sorted slice becomes its array index).

Config struct

Config is the in-memory representation of config.json. Use parse_config to load it independently of load_case:

fn main() -> Result<(), cobre_io::LoadError> {
use cobre_io::config::parse_config;
use std::path::Path;

let cfg = parse_config(Path::new("my_case/config.json"))?;
println!("forward_passes = {:?}", cfg.training.forward_passes);
Ok(())
}

Config has six sections:

Section                  Type                         Default      Purpose
modeling                 ModelingConfig               {}           Inflow non-negativity treatment method and cost
training                 TrainingConfig               (required)   Iteration count, stopping rules, cut selection
upper_bound_evaluation   UpperBoundEvaluationConfig   {}           Inner approximation upper-bound evaluation settings
policy                   PolicyConfig                 fresh mode   Policy directory path, warm-start / resume mode
simulation               SimulationConfig             disabled     Post-training simulation scenario count and output
exports                  ExportsConfig                all on       Flags controlling which output files are written

Mandatory fields

Two fields in training have no defaults and must be present in config.json. parse_config returns LoadError::SchemaError if either is absent:

  • training.forward_passes — number of scenario trajectories per iteration (integer, >= 1)
  • training.stopping_rules — list of stopping rule entries (must include at least one iteration_limit rule)

Stopping rules

The training.stopping_rules array accepts four rule types, identified by the "type" field:

Type              Required fields                                               Stops when
iteration_limit   limit: u32                                                    Iteration count reaches limit
time_limit        seconds: f64                                                  Wall-clock time exceeds seconds
bound_stalling    iterations: u32, tolerance: f64                               Lower bound improvement falls below tolerance
simulation        replications, period, bound_window, distance_tol, bound_tol   Policy and bound have both stabilized

Multiple rules combine according to training.stopping_mode: "any" (default, OR semantics — stop when any rule triggers) or "all" (AND semantics — stop only when all rules trigger simultaneously).
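The two combination semantics reduce to an any/all fold over the rules' trigger states. A minimal sketch, assuming each rule can report whether it has fired this iteration (the enum and function names here are illustrative, not the cobre-sddp API):

```rust
// Illustrative sketch of training.stopping_mode semantics.
enum StoppingMode {
    Any, // OR: stop when any rule triggers (default)
    All, // AND: stop only when every rule triggers simultaneously
}

fn should_stop(mode: &StoppingMode, triggered: &[bool]) -> bool {
    match mode {
        StoppingMode::Any => triggered.iter().any(|&t| t),
        StoppingMode::All => triggered.iter().all(|&t| t),
    }
}

fn main() {
    // iteration_limit has fired, time_limit has not:
    let triggered = [true, false];
    assert!(should_stop(&StoppingMode::Any, &triggered));  // "any" stops now
    assert!(!should_stop(&StoppingMode::All, &triggered)); // "all" keeps training
}
```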

Policy modes

The policy.mode field controls warm-start behavior:

Mode           Behavior
"fresh"        (default) Start from scratch; no policy files are read
"warm_start"   Load existing cuts and states from policy.path as a starting approximation
"resume"       Resume an interrupted run from the last checkpoint

When mode is "warm_start" or "resume", load_case also validates policy compatibility: the stored policy’s entity counts, stage count, and cut dimensions must match the current case. Mismatches return LoadError::PolicyIncompatible.

Error handling

All errors returned by load_case and its internal parsers are variants of LoadError:

IoError

I/O error reading {path}: {source}

Occurs when a required file exists in the file manifest but cannot be read from disk (file not found, permission denied, or other OS-level I/O failure). Fields: path: PathBuf (the file that failed) and source: std::io::Error (the underlying error).

When it occurs: Layer 1 or Layer 2, when std::fs::read_to_string or a Parquet reader returns an error for a required file.

ParseError

parse error in {path}: {message}

Occurs when a file is readable but its content is malformed — invalid JSON syntax, unexpected end of input, or an unreadable Parquet column header. Fields: path: PathBuf and message: String (description of the parse failure).

When it occurs: Layer 2, during initial deserialization of JSON or Parquet files before any field-level validation runs.

SchemaError

schema error in {path}, field {field}: {message}

Occurs when a file parses successfully but a field violates a schema constraint: a required field is missing, a value is outside its valid range, or an enum discriminator names an unknown variant. Fields: path: PathBuf, field: String (dot-separated path to the offending field, e.g., "hydros[3].bus_id"), and message: String.

When it occurs: Layer 2, during post-deserialization validation. Also returned by parse_config when training.forward_passes or training.stopping_rules is absent.

CrossReferenceError

cross-reference error: {source_entity} in {source_file} references
non-existent {target_entity} in {target_collection}

Occurs when an entity ID field references an entity that does not exist in the expected registry. Fields: source_file: PathBuf, source_entity: String (e.g., "Hydro 'H1'"), target_collection: String (e.g., "bus registry"), and target_entity: String (e.g., "BUS_99").

When it occurs: Layer 3 (referential integrity). All broken references across all entity types are collected before returning.

ConstraintError

constraint violation: {description}

A catch-all for collected validation errors from any of the five layers, and for SystemBuilder::build() rejections. The description field contains all error messages joined by newlines, each prefixed with its [ErrorKind], source file, optional entity identifier, and message text.

When it occurs: After any validation layer collects one or more error-severity diagnostics, or when SystemBuilder::build() finds duplicate IDs or a cascade cycle in the final construction step.

PolicyIncompatible

policy incompatible: {check} mismatch — policy has {policy_value},
system has {system_value}

Occurs when a warm-start or resume policy file is structurally incompatible with the current case. The four compatibility checks are: hydro count, stage count, cut dimension, and entity identity hash. Fields: check: String (name of the failing check), policy_value: String, and system_value: String.

When it occurs: After all five validation layers pass, when policy.mode is "warm_start" or "resume" and the stored policy fails a compatibility check.

Design notes

Collect-all validation. Unlike parsers that short-circuit on the first error, all five validation layers collect diagnostics into a shared ValidationContext before failing. When load_case returns a ConstraintError, the description field contains every problem found in a single report. This avoids the frustrating fix-one-error-re-run-repeat cycle on large cases.
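The collect-all pattern can be sketched in a few lines. This is a hypothetical simplification, not the real ValidationContext API: diagnostics accumulate across layers, and the caller fails once with the full joined report.

```rust
// Illustrative collect-all validation context (names are assumptions).
#[derive(Default)]
struct ValidationContext {
    errors: Vec<String>,
}

impl ValidationContext {
    fn error(&mut self, msg: impl Into<String>) {
        self.errors.push(msg.into());
    }

    // Succeed only if no layer recorded an error; otherwise return one
    // report containing every problem found.
    fn finish(self) -> Result<(), String> {
        if self.errors.is_empty() {
            Ok(())
        } else {
            Err(self.errors.join("\n"))
        }
    }
}

fn main() {
    let mut ctx = ValidationContext::default();
    ctx.error("[CrossReference] hydros.json: Hydro 'H1' references non-existent bus 'BUS_99'");
    ctx.error("[Semantic] stages.json: stage count mismatch");
    let report = ctx.finish().unwrap_err();
    assert_eq!(report.lines().count(), 2); // both problems in a single report
}
```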

File-format split. Entity identity data (IDs, names, topology, static parameters) lives in JSON. Time-varying and per-stage data (bounds, penalty overrides, statistical parameters, scenarios) lives in Parquet. JSON is easy to read and edit by hand; Parquet handles large numeric tables efficiently. The two formats complement each other without overlap.

Resolution separates concerns. The three-tier cascade is resolved once at load time into dense arrays, not at every solver call. Downstream solver crates call system.penalties().hydro(entity_idx, stage_idx) and get an f64 with no branching, no hash lookups, and no tier logic. The complexity of the cascade is entirely contained in cobre-io.

Declaration-order invariance. All entity collections are sorted by ID before SystemBuilder::build() is called. Any System built from the same entities, regardless of the order they appear in the input files, produces a structurally identical result with identical pre-resolved tables.

cobre-stochastic

experimental

cobre-stochastic provides the stochastic process models for the Cobre power systems ecosystem. It builds probabilistic representations of hydro inflow time series — using Periodic Autoregressive (PAR(p)) models — and generates correlated noise scenarios for use by iterative scenario-based optimization algorithms. The crate is solver-agnostic: it supplies fully-initialized stochastic infrastructure components that any scenario-based iterative optimization algorithm can consume read-only, with no dependency on any particular solver vertical.

The crate has no dependency on cobre-solver or cobre-comm. It depends only on cobre-core for entity types and on a small set of RNG and hashing crates for deterministic noise generation.

Module overview

Module        Purpose
par           PAR(p) coefficient preprocessing: validation, original-unit conversion, and the PrecomputedParLp cache
noise         Deterministic noise generation: SipHash-1-3 seed derivation (seed) and Pcg64 RNG construction (rng)
correlation   Cholesky-based spatial correlation: decomposition (cholesky) and profile resolution (resolve)
tree          Opening scenario tree: flat storage structure (opening_tree) and tree generation (generate)
sampling      InSample scenario selection: sample_forward for picking an opening for a given iteration/scenario/stage
context       StochasticContext integration type and build_stochastic_context pipeline entry point
error         StochasticError with five variants covering all failure domains of the stochastic layer

Architecture

PAR(p) preprocessing and flat array layout

PAR(p) (Periodic Autoregressive) models describe the seasonal autocorrelation structure of hydro inflow time series. Each hydro plant at each stage has an InflowModel with a mean (mean_m3s), a standard deviation (std_m3s), and a vector of AR coefficients in standardized form (ar_coefficients).

PrecomputedParLp is built once at initialization from raw InflowModel parameters. It converts AR coefficients from standardized form (ψ*, direct Yule-Walker output) to original-unit form at build time:

ψ_{m,ℓ} = ψ*_{m,ℓ} · s_m / s_{m-ℓ}

where s_m is std_m3s for the current stage’s season and s_{m-ℓ} is std_m3s for the season ℓ stages prior. The converted coefficients and their derived intercepts (base) are stored in stage-major flat arrays:

array[stage * n_hydros + hydro]          (2-D: means, stds, base terms)
psi[stage * n_hydros * max_order + hydro * max_order + lag]  (3-D: AR coefficients)

This layout ensures that all per-stage data for every hydro plant is contiguous in memory, maximizing cache utilization during sequential stage iteration within a scenario trajectory.
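The two index formulas above can be wrapped in accessors like the following sketch. The struct and field names are illustrative (the real accessors are PrecomputedParLp::mean, std, base, and psi), but the index arithmetic matches the stage-major layout exactly.

```rust
// Sketch of stage-major flat indexing (illustrative names).
struct ParArrays {
    n_hydros: usize,
    max_order: usize,
    means: Box<[f64]>, // len = n_stages * n_hydros
    psis: Box<[f64]>,  // len = n_stages * n_hydros * max_order
}

impl ParArrays {
    fn mean(&self, stage: usize, hydro: usize) -> f64 {
        // 2-D layout: array[stage * n_hydros + hydro]
        self.means[stage * self.n_hydros + hydro]
    }

    fn psi(&self, stage: usize, hydro: usize, lag: usize) -> f64 {
        // 3-D layout: psi[stage * n_hydros * max_order + hydro * max_order + lag]
        self.psis[(stage * self.n_hydros + hydro) * self.max_order + lag]
    }
}

fn main() {
    let p = ParArrays {
        n_hydros: 2,
        max_order: 2,
        means: vec![1.0, 2.0, 3.0, 4.0].into_boxed_slice(), // 2 stages x 2 hydros
        psis: (0..8).map(|i| i as f64).collect::<Vec<_>>().into_boxed_slice(),
    };
    assert_eq!(p.mean(1, 0), 3.0);       // stage 1, hydro 0
    assert_eq!(p.psi(1, 1, 0), 6.0);     // (1*2 + 1) * 2 + 0 = index 6
}
```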

All hot-path arrays use Box<[f64]> (via Vec::into_boxed_slice()) rather than Vec<f64>. The boxed-slice type communicates the no-resize invariant and eliminates the capacity word from each allocation.

Deterministic noise via SipHash-1-3 seed derivation (DEC-017)

Each scenario realization in an iterative optimization run requires a draw from the noise distribution. Rather than broadcasting seeds across compute nodes — which would require communication — each node independently derives its own seed from a small tuple using SipHash-1-3 (DEC-017).

Two derivation functions are provided:

  • derive_forward_seed(base_seed, iteration, scenario, stage) -> u64: hashes a 20-byte little-endian wire format base_seed (8B) ++ iteration (4B) ++ scenario (4B) ++ stage (4B).
  • derive_opening_seed(base_seed, opening_index, stage) -> u64: hashes a 16-byte wire format base_seed (8B) ++ opening_index (4B) ++ stage (4B).

The different wire lengths provide domain separation without explicit prefixes, preventing hash collisions between forward-pass seeds and opening-tree seeds. stage in both functions is always stage.id (the domain identifier), never stage.index (the array position), because array positions shift under stage filtering while IDs are stable.
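The two wire formats can be sketched by building the byte buffers directly. The real implementation hashes these buffers with SipHash-1-3 via the siphasher crate; this sketch only constructs the little-endian layouts to show the length-based domain separation (function names are illustrative).

```rust
// 20-byte forward wire: base_seed (8B) ++ iteration (4B) ++ scenario (4B) ++ stage (4B)
fn forward_wire(base_seed: u64, iteration: u32, scenario: u32, stage_id: u32) -> Vec<u8> {
    let mut buf = Vec::with_capacity(20);
    buf.extend_from_slice(&base_seed.to_le_bytes());
    buf.extend_from_slice(&iteration.to_le_bytes());
    buf.extend_from_slice(&scenario.to_le_bytes());
    buf.extend_from_slice(&stage_id.to_le_bytes()); // always stage.id, never stage.index
    buf
}

// 16-byte opening wire: base_seed (8B) ++ opening_index (4B) ++ stage (4B)
fn opening_wire(base_seed: u64, opening_index: u32, stage_id: u32) -> Vec<u8> {
    let mut buf = Vec::with_capacity(16);
    buf.extend_from_slice(&base_seed.to_le_bytes());
    buf.extend_from_slice(&opening_index.to_le_bytes());
    buf.extend_from_slice(&stage_id.to_le_bytes());
    buf
}

fn main() {
    // Different buffer lengths: a forward tuple and an opening tuple can
    // never present identical input to the hash function.
    assert_eq!(forward_wire(42, 0, 0, 1).len(), 20);
    assert_eq!(opening_wire(42, 0, 1).len(), 16);
}
```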

From the derived seed, a Pcg64 RNG is constructed via rng_from_seed. The PCG family provides good statistical quality with fast generation, suitable for producing large numbers of standard-normal samples via the StandardNormal distribution.

Cholesky-based spatial correlation

Hydro inflow series at neighboring plants are spatially correlated. cobre-stochastic applies a Cholesky transformation to convert independent standard-normal samples into correlated samples.

The Cholesky decomposition is hand-rolled using the Cholesky-Banachiewicz algorithm (~150 lines). No external linear algebra crate is added to the dependency tree. The lower-triangular factor L (such that Sigma = L * L^T) is stored in packed lower-triangular format: element (i, j) with j <= i is at index i*(i+1)/2 + j. This eliminates the zero upper-triangle entries and halves memory usage.
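Packed storage and the L·z transform can be sketched as follows. This is an illustrative fragment (the real type is CholeskyFactor with decompose and transform); the packed index formula is the one stated above.

```rust
// Packed lower-triangular factor: element (i, j), j <= i, at i*(i+1)/2 + j.
struct Packed {
    n: usize,
    l: Vec<f64>, // len = n*(n+1)/2
}

impl Packed {
    fn get(&self, i: usize, j: usize) -> f64 {
        self.l[i * (i + 1) / 2 + j]
    }

    // Correlate independent samples: output[i] = sum over j <= i of L[i][j] * input[j].
    fn transform(&self, input: &[f64], output: &mut [f64]) {
        for i in 0..self.n {
            output[i] = (0..=i).map(|j| self.get(i, j) * input[j]).sum();
        }
    }
}

fn main() {
    // 2x2 identity in packed form: [L00, L10, L11] = [1, 0, 1]
    let l = Packed { n: 2, l: vec![1.0, 0.0, 1.0] };
    let mut out = [0.0; 2];
    l.transform(&[3.0, 4.0], &mut out);
    assert_eq!(out, [3.0, 4.0]); // identity correlation leaves samples unchanged
}
```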

Correlation profiles can be defined per-season. DecomposedCorrelation holds all profiles in a BTreeMap<String, Vec<GroupFactor>> — the BTreeMap guarantees deterministic iteration order, which is required for declaration-order invariance.

Before entering the hot optimization loop, callers must invoke DecomposedCorrelation::resolve_positions(&mut self, entity_order: &[EntityId]) once. This pre-computes the positions of each group’s entities within the canonical entity order and stores them on each GroupFactor as Option<Box<[usize]>>. With positions pre-computed, apply_correlation avoids a per-call O(n) linear scan and heap allocation on the hot path.

If a correlation group’s entity IDs are only partially present in entity_order, the Cholesky transform is skipped for that group entirely. Entities not in any group retain their independent noise values unchanged.

Opening tree structure

The opening scenario tree pre-generates all noise realizations used during the backward pass of the optimization algorithm, before the iterative loop begins. This avoids per-iteration recomputation and ensures the backward pass always operates on a fixed, reproducible set of scenarios.

OpeningTree stores all noise values in a single flat contiguous array with stage-major ordering:

data[start .. start + dim]    where    start = stage_offsets[stage] + opening_idx * dim

The stage_offsets array has length n_stages + 1. The sentinel entry stage_offsets[n_stages] equals data.len(), making bounds checks exact without special-casing the last stage. This sentinel pattern is used consistently in PrecomputedParLp, OpeningTree, and throughout StochasticContext.
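The sentinel-offset lookup can be sketched like this (illustrative struct; the real accessors are OpeningTree::opening and view):

```rust
// Flat stage-major opening storage with a sentinel offset array of
// length n_stages + 1, where stage_offsets[n_stages] == data.len().
struct Tree {
    dim: usize,
    stage_offsets: Vec<usize>,
    data: Vec<f64>,
}

impl Tree {
    fn n_openings(&self, stage: usize) -> usize {
        // The sentinel makes this exact even for the last stage.
        (self.stage_offsets[stage + 1] - self.stage_offsets[stage]) / self.dim
    }

    fn opening(&self, stage: usize, opening_idx: usize) -> &[f64] {
        let start = self.stage_offsets[stage] + opening_idx * self.dim;
        &self.data[start..start + self.dim]
    }
}

fn main() {
    // 2 stages, dim = 2; stage 0 has 2 openings, stage 1 has 1.
    let t = Tree {
        dim: 2,
        stage_offsets: vec![0, 4, 6], // sentinel 6 == data.len()
        data: vec![0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    };
    assert_eq!(t.n_openings(0), 2);
    assert_eq!(t.opening(1, 0), &[0.5, 0.6]);
}
```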

Pre-study stages (those with negative stage.id) are excluded from the opening tree but remain in inflow_models for PAR lag initialization.

StochasticContext as the integration entry point

StochasticContext bundles the three independently-built components into a single ready-to-use value:

  1. PrecomputedParLp — PAR coefficient cache for LP RHS patching.
  2. DecomposedCorrelation — pre-decomposed Cholesky factors for all profiles.
  3. OpeningTree — pre-generated noise realizations for the backward pass.

build_stochastic_context(&system, base_seed) runs the full preprocessing pipeline in a fixed order: validate PAR parameters, build the coefficient cache, decompose correlation matrices, generate the opening tree. After construction, all fields are immutable. StochasticContext is Send + Sync, verified by a compile-time assertion and a unit test.

sample_forward for InSample scenario selection

sample_forward implements the InSample scenario selection strategy: for each (iteration, scenario, stage) triple, it deterministically selects one opening from the tree by deriving a seed via derive_forward_seed and sampling a Pcg64 RNG. The selected opening index and its noise slice are returned together, so the caller can both log which opening was chosen and immediately use the noise values.

Public types

StochasticContext

Owns all three preprocessing pipeline outputs: PrecomputedParLp, DecomposedCorrelation, and OpeningTree. Constructed by build_stochastic_context and then consumed read-only. Accessors: par_lp(), correlation(), opening_tree(), tree_view(), base_seed(), dim(), n_stages(). Both Send and Sync.

PrecomputedParLp

Cache-friendly PAR(p) model data for LP RHS patching. Stores means, standard deviations, original-unit AR coefficients (ψ), and intercept terms (base) in stage-major flat arrays (Box<[f64]>). Built via PrecomputedParLp::build. Accessors: n_hydros(), n_stages(), max_order(), mean(), std(), base(), psi().

DecomposedCorrelation

Holds Cholesky-decomposed correlation factors for all profiles, keyed by profile name in a BTreeMap. Built via DecomposedCorrelation::build, which validates and decomposes all profiles eagerly — errors surface at initialization, not at per-stage lookup time. Call resolve_positions once with the canonical entity order before entering the optimization loop.

OpeningTree

Fixed opening scenario tree holding pre-generated noise realizations. All noise values are in a flat Box<[f64]> with stage-major ordering and a sentinel offset array of length n_stages + 1. Provides opening(stage_idx, opening_idx) -> &[f64] for element access and view() -> OpeningTreeView<'_> for a zero-copy borrowed view.

OpeningTreeView<'a>

A zero-copy borrowed view over an OpeningTree, with the same accessor API: opening(stage_idx, opening_idx), n_stages(), n_openings(stage_idx), dim(). Passed to sample_forward to avoid cloning the tree data.

StochasticError

Returned by all fallible APIs. Five variants:

Variant                       When it occurs
InvalidParParameters          AR order > 0 with zero standard deviation, or ill-conditioned coefficients
CholeskyDecompositionFailed   Correlation matrix is not positive-definite
InvalidCorrelation            Missing default profile, ambiguous profile set, or out-of-range correlation entry
InsufficientData              Fewer historical records than the PAR order requires
SeedDerivationError           Hash computation produces an invalid result during seed derivation

Implements std::error::Error, Send, and Sync.

ParValidationReport

Return type of validate_par_parameters. Contains a list of ParWarning values for non-fatal issues (e.g., high AR coefficients that may indicate numerical instability) that the caller can inspect or log before proceeding to PrecomputedParLp::build.

ParWarning

A non-fatal PAR parameter warning. Carries the hydro ID, stage ID, and a human-readable description of the potential issue.

GroupFactor

A single correlation group’s Cholesky factor with its associated entity ID mapping. Fields: factor: CholeskyFactor, entity_ids: Vec<EntityId>, and pre-computed positions: Option<Box<[usize]>> (filled by resolve_positions).

CholeskyFactor

The lower-triangular Cholesky factor L of a correlation matrix, stored in packed row-major form. Element (i, j) with j <= i is at index i*(i+1)/2 + j. Constructed via CholeskyFactor::decompose(&matrix) and applied via transform(&input, &mut output).

Usage example

The following shows how to construct a stochastic context from a loaded system and use it to sample a forward-pass scenario.

fn main() -> Result<(), cobre_stochastic::StochasticError> {
use cobre_stochastic::{
    build_stochastic_context,
    sampling::insample::sample_forward,
};

// `system` is a `cobre_core::System` produced by `cobre_io::load_case`.
// `base_seed` comes from the study configuration (application layer handles
// the Option<i64> -> u64 conversion and OS-entropy fallback).
let ctx = build_stochastic_context(&system, base_seed)?;

println!(
    "stochastic context: {} hydros, {} study stages",
    ctx.dim(),
    ctx.n_stages(),
);

// Obtain a borrowed view over the opening tree (zero-copy).
let tree_view = ctx.tree_view();

// In the iterative optimization loop, select a forward scenario for each
// (iteration, scenario, stage) triple.
let iteration: u32 = 0;
let scenario: u32 = 0;

for (stage_idx, stage) in study_stages.iter().enumerate() {
    // stage.id is the domain identifier; stage_idx is the array position.
    let (opening_idx, noise_slice) = sample_forward(
        &tree_view,
        ctx.base_seed(),
        iteration,
        scenario,
        stage.id as u32,
        stage_idx,
    );

    // `noise_slice` has length `ctx.dim()` (one value per hydro plant).
    // Pass to LP RHS patching together with `ctx.par_lp()`.
    let _ = (opening_idx, noise_slice);
}
Ok::<(), cobre_stochastic::StochasticError>(())
}

Performance notes

cobre-stochastic is designed so that all performance-critical preprocessing happens once at initialization. The iterative optimization loop consumes already-materialized data through slice indexing, with no re-allocation on the hot path.

Pre-computed entity positions (resolve_positions)

DecomposedCorrelation::resolve_positions must be called once before entering the optimization loop. It pre-computes the mapping from each correlation group’s entity IDs to their positions in the canonical entity_order slice and stores the result as Option<Box<[usize]>> on each GroupFactor. Without this pre-computation, apply_correlation would perform an O(n) linear scan and a Vec allocation for every noise draw.

Stack-allocated buffers for small groups (MAX_STACK_DIM = 64)

Inside apply_correlation, intermediate working buffers for correlation groups with at most 64 entities are stack-allocated (using arrayvec or a fixed-size array on the stack). Groups larger than this threshold fall back to heap-allocated Vec. The fast path covers the overwhelming majority of practical correlation groups, eliminating heap allocation from the inner loop for typical study configurations.
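The stack-versus-heap split can be sketched as follows. This is an illustrative pattern, not the apply_correlation source: the function and buffer names are assumptions, but the threshold logic mirrors the MAX_STACK_DIM = 64 fast path described above.

```rust
const MAX_STACK_DIM: usize = 64;

// Hypothetical sketch: scale every value by 2 into a working buffer, then
// sum. Small inputs use a stack array; large inputs fall back to a Vec.
fn sum_scaled(values: &[f64]) -> f64 {
    let mut stack_buf = [0.0f64; MAX_STACK_DIM];
    let mut heap_buf; // only allocated on the slow path
    let buf: &mut [f64] = if values.len() <= MAX_STACK_DIM {
        &mut stack_buf[..values.len()] // fast path: no heap allocation
    } else {
        heap_buf = vec![0.0; values.len()];
        &mut heap_buf
    };
    for (b, v) in buf.iter_mut().zip(values) {
        *b = v * 2.0;
    }
    buf.iter().sum()
}

fn main() {
    assert_eq!(sum_scaled(&[1.0, 2.0, 3.0]), 12.0); // stack path (3 <= 64)
    assert_eq!(sum_scaled(&vec![1.0; 100]), 200.0); // heap path (100 > 64)
}
```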

Incremental row_base in Cholesky transform

The packed lower-triangular storage index for element (i, j) is i*(i+1)/2 + j. Rather than recomputing the triangular index from scratch for each row, the transform method maintains an incremental row_base variable that is incremented by i+1 at the end of each row. This eliminates a multiplication per row iteration on the hot path of the Cholesky forward substitution.
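The equivalence between the incremental update and the closed form can be shown in a few lines (an illustrative demonstration, not the transform source):

```rust
// Row i of the packed triangle starts at i*(i+1)/2. Maintaining row_base
// incrementally replaces a multiplication per row with one addition.
fn row_starts(n: usize) -> Vec<usize> {
    let mut starts = Vec::with_capacity(n);
    let mut row_base = 0; // start index of row i in packed storage
    for i in 0..n {
        starts.push(row_base);
        row_base += i + 1; // row i holds i+1 elements (columns 0..=i)
    }
    starts
}

fn main() {
    let starts = row_starts(5);
    // Matches the closed form i*(i+1)/2 for every row.
    for (i, &s) in starts.iter().enumerate() {
        assert_eq!(s, i * (i + 1) / 2);
    }
    assert_eq!(starts, vec![0, 1, 3, 6, 10]);
}
```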

Box<[f64]> for the no-resize invariant

All fixed-size hot-path arrays in PrecomputedParLp, OpeningTree, and CholeskyFactor use Box<[f64]> rather than Vec<f64>. The boxed-slice type communicates that these arrays are immutable after construction, eliminates the capacity word from each allocation, and allows the optimizer to treat the length as a compile-time-stable bound.

Feature flags

cobre-stochastic has no optional feature flags. All dependencies are always compiled, and no external system libraries (HiGHS, MPI, etc.) are required.

# Cargo.toml
cobre-stochastic = { version = "0.1" }

Testing

Running the test suite

cargo test -p cobre-stochastic

No external dependencies or system libraries are required. All dependencies (siphasher, rand, rand_pcg, rand_distr, thiserror) are Cargo-managed. The --all-features flag is not needed — there are no feature flags.

Test suite overview

The crate has 125 tests total: 105 unit tests, 5 conformance integration tests, 4 reproducibility integration tests, and 11 doc-tests.

Conformance suite (tests/conformance.rs)

The conformance test suite verifies the PAR(p) preprocessing pipeline against hand-computed fixtures with known exact outputs.

Two fixtures are used:

  • AR(0) fixture: a zero-order AR model (pure noise, no lagged terms). The precomputed psi array must be all-zeros and the base values must equal the raw means. Tolerance: 1e-10.
  • AR(1) fixture: a first-order AR model with a pre-study stage (negative stage.id) that supplies the lag mean and standard deviation for coefficient unit conversion. The conversion formula ψ = ψ* · s_m / s_lag is tested against a hand-computed value. Tolerance: 1e-10.

Reproducibility suite (tests/reproducibility.rs)

Four tests verify the determinism and invariance properties that are required for correct behavior in a distributed, multi-run setting:

  • Seed determinism: calling derive_forward_seed and derive_opening_seed with the same inputs always returns bitwise-identical seeds. Golden-value regression pins the exact hash output for a known (base_seed, ...) tuple.
  • Opening tree seed sensitivity: different base_seed values produce different opening trees (verified by checking that at least one noise value differs across the full tree). Uses any() over all tree entries rather than assert_ne! on the whole tree, to handle the astronomically unlikely case where two seeds produce one identical value.
  • Declaration-order invariance: inserting hydros in reversed order into a SystemBuilder (which sorts by EntityId internally) produces a StochasticContext with bitwise-identical PAR arrays, opening tree, and Cholesky transform output. This verifies the canonical-order invariant across the full preprocessing pipeline.
  • Infrastructure genericity gate: a grep audit confirms that no algorithm-specific references appear anywhere in the crate source tree. The gate is encoded as a #[test] using std::process::Command so it runs automatically in CI.

Design notes

Communication-free noise generation (DEC-017)

The original design considered broadcasting a seed from the root rank to all workers before each iteration. DEC-017 rejected this approach because it adds an MPI collective on the hot path and creates a serialization point as the number of ranks grows.

The alternative — deriving each rank’s seeds independently from a common base_seed plus a context tuple — requires no communication and produces identical results regardless of the number of ranks. SipHash-1-3 was chosen because it is non-cryptographic (fast), produces high-quality 64-bit hashes suitable for seeding a CSPRNG, and is available in the siphasher crate with no system dependencies.

The two wire formats (20 bytes for forward seeds, 16 bytes for opening seeds) use length-based domain separation rather than an explicit prefix byte, which is slightly more efficient and equally correct given that the two sets of input tuples have different shapes and lengths.

cobre-solver

experimental

cobre-solver is the LP solver abstraction layer for the Cobre ecosystem. It defines a backend-agnostic interface for constructing, solving, and querying linear programs, with a production-grade HiGHS backend as the default implementation.

The crate has no dependency on any other Cobre crate. It is infrastructure that optimization algorithm crates consume through a generic type parameter, not a shared registry or runtime-selected component. Every solver method call compiles directly to the concrete backend implementation — there is no virtual dispatch overhead on the hot path where iterative LP solving occurs.

Module overview

Module      Purpose
ffi         Raw unsafe FFI bindings to the cobre_highs_* C wrapper functions
types       Canonical data types: StageTemplate, RowBatch, Basis, LpSolution, SolutionView, SolverError, SolverStatistics
trait_def   SolverInterface trait definition with all 10 method contracts
highs       HighsSolver — the HiGHS backend implementing SolverInterface
(root)      Re-exports: SolverInterface, HighsSolver, and all public types

The ffi and highs modules are compiled only when the highs feature is enabled (the default). The trait_def and types modules are always compiled, making it possible to write algorithm code against SolverInterface without depending on any particular backend.

Architecture

Compile-time monomorphization (DEC-002)

SolverInterface is resolved as a generic type parameter at compile time, not as Box<dyn SolverInterface> or any other form of dynamic dispatch. An optimization algorithm crate parameterizes its entry point as:

fn run<S: SolverInterface>(solver_factory: impl Fn() -> S, ...) { ... }

The compiler generates one concrete implementation per backend. The HiGHS backend is the only active backend in a standard build; the binary contains no solver-selection branch. This is specified in DEC-002 and implemented in ADR-003.

Custom FFI — not highs-sys

cobre-solver does not use any third-party highs-sys crate. Instead it ships a thin C wrapper (csrc/highs_wrapper.c) that exposes the 20-odd HiGHS C API functions needed by the backend as cobre_highs_* symbols. This approach:

  • Controls exactly which HiGHS API surface is exposed.
  • Allows the wrapper to enforce Cobre-specific invariants before delegating to the underlying Highs_* calls.
  • Avoids a build-time dependency on any external Rust crate for FFI bindings.

The ffi module declares extern "C" signatures for each cobre_highs_* function. All FFI calls are unsafe; safe wrappers live in highs.rs.

Vendored HiGHS build

HiGHS is compiled from source at build time via the cmake crate. The source lives in crates/cobre-solver/vendor/HiGHS/ as a git submodule. The build script (crates/cobre-solver/build.rs) invokes cmake with a fixed Release configuration and links the resulting static library. HiGHS is always built in Release mode regardless of the Cargo profile, because a debug HiGHS build is roughly 10x slower and would produce misleading performance results.

Per-crate unsafe override

The workspace lint configuration forbids unsafe code at the workspace level. cobre-solver overrides this lint to allow in its own Cargo.toml because the HiGHS FFI layer genuinely requires unsafe blocks. All other workspace lints (missing_docs, unwrap_used, clippy pedantic) remain active. Every unsafe block carries a // SAFETY: comment explaining the invariants that justify it.

SolverInterface trait

pub trait SolverInterface: Send { ... }

The trait defines 10 methods that together constitute the full LP lifecycle for one solver instance. Implementations must satisfy the pre- and post-condition contracts documented in each method’s rustdoc. See the trait_def rustdoc for the complete contracts.

Method summary

| Method | &self / &mut self | Returns | Description |
|---|---|---|---|
| load_model | &mut self | () | Bulk-loads a structural LP from a StageTemplate; replaces any prior model |
| add_rows | &mut self | () | Appends a RowBatch of constraint rows to the dynamic region |
| set_row_bounds | &mut self | () | Updates row lower/upper bounds at indexed positions |
| set_col_bounds | &mut self | () | Updates column lower/upper bounds at indexed positions |
| solve | &mut self | Result<SolutionView<'_>, SolverError> | Solves the current LP; encapsulates internal retry logic |
| solve_with_basis | &mut self | Result<SolutionView<'_>, SolverError> | Sets a cached basis, then solves (warm-start path) |
| reset | &mut self | () | Clears solver state for error recovery or model switch |
| get_basis | &mut self | () | Writes basis status codes into a caller-owned &mut Basis |
| statistics | &self | SolverStatistics | Returns accumulated monotonic solve counters |
| name | &self | &'static str | Returns a static string identifying the backend |

Mutability convention

Methods that mutate solver state — loading a model, adding constraints, patching bounds, solving, resetting, and extracting a basis — take &mut self. get_basis requires &mut self because it writes to internal scratch buffers during extraction. Methods that only read accumulated state (statistics, name) take &self. This convention makes data-race hazards visible at the type level: the borrow checker prevents concurrent mutation without locks.

Error recovery contract

When solve or solve_with_basis returns Err, the solver’s internal state is unspecified. The caller is responsible for calling reset() before reusing the instance. Failing to reset after a terminal error may produce incorrect results or panics on the next load_model call.
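
The contract can be illustrated with a hypothetical MockSolver: any Err leaves the instance in an unusable state until reset() is called. The types and names below are illustrative, not the crate's actual definitions:

```rust
// Hypothetical stand-in for a solver whose state becomes
// unspecified after a terminal error.
struct MockSolver {
    poisoned: bool,
}

impl MockSolver {
    fn solve(&mut self, inject_failure: bool) -> Result<f64, &'static str> {
        if inject_failure {
            self.poisoned = true; // internal state is now unspecified
            return Err("numerical difficulty");
        }
        if self.poisoned {
            return Err("solver reused without reset()");
        }
        Ok(100.0)
    }

    fn reset(&mut self) {
        self.poisoned = false;
    }
}

/// The caller's side of the contract: after any Err from solve,
/// call reset() before reusing the instance.
fn solve_with_recovery(solver: &mut MockSolver, inject_failure: bool) -> Result<f64, &'static str> {
    match solver.solve(inject_failure) {
        Ok(obj) => Ok(obj),
        Err(_) => {
            solver.reset();
            solver.solve(false)
        }
    }
}
```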

Thread safety

SolverInterface requires Send but not Sync. Send allows a solver instance to be transferred to a worker thread at startup. The absence of Sync prevents concurrent access from multiple threads, which matches the reality of C-library solver handles: they maintain mutable factorization workspaces that are not thread-safe. Each worker thread owns exactly one solver instance.

Public types

StageTemplate

Pre-assembled structural LP for one stage, in CSC (column-major) form. Built once at initialization from resolved internal structures and shared read-only across all threads. Passed to load_model to bulk-load the LP. Fields include the CSC matrix arrays (col_starts, row_indices, values), bounds, objective coefficients, and layout metadata (n_state, n_transfer, n_dual_relevant, n_hydro, max_par_order) used by the calling algorithm for state transfer and cut extraction. See the StageTemplate rustdoc.

RowBatch

Batch of constraint rows for addition to a loaded LP, in CSR (row-major) form. Assembled from an active constraint pool before each LP rebuild and passed to add_rows in a single call. Appended rows occupy the dynamic constraint region of the LP matrix. See the RowBatch rustdoc.

Basis

Raw simplex basis stored as solver-native i32 status codes — one per column and one per row. The codes are opaque to the calling algorithm; they are extracted from one solve via get_basis and passed back to the next via solve_with_basis for warm-starting. Stored in the original (unpresolved) problem space for portability across solver versions and presolve strategies. When the LP gains new dynamic constraint rows after a basis was saved, solve_with_basis handles the dimension mismatch by filling new row slots with the solver-native “Basic” code. See the Basis rustdoc.
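
The pad-or-truncate rule reduces to a single resize over the row status array. The helper name below is hypothetical; the constant 1 is the HiGHS "Basic" code mentioned above:

```rust
/// Solver-native "Basic" status code in HiGHS.
const BASIC: i32 = 1;

/// Fit a saved row-status vector to the current LP's row count:
/// new dynamic rows are filled with BASIC, extras are truncated.
fn fit_row_status(row_status: &mut Vec<i32>, n_rows: usize) {
    row_status.resize(n_rows, BASIC);
}
```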

SolutionView<'a>

Zero-copy borrowed view over solver-internal buffers, returned by solve and solve_with_basis. Provides objective(), primal(), dual(), reduced_costs(), iterations(), and solve_time_seconds() as slice references into the solver’s internal arrays. The view borrows the solver and is valid until the next &mut self call. Call to_owned() to copy the data into an LpSolution when the solution must outlive the borrow. See the SolutionView rustdoc.

LpSolution

Owned solution produced by SolutionView::to_owned(): objective (f64, minimization sense), primal (Vec of column values), dual (Vec of row dual multipliers, normalized to the canonical sign convention), reduced_costs, iterations, and solve_time_seconds. Dual values are normalized before the struct is returned — HiGHS row duals are already in the canonical convention and require no negation. See the LpSolution rustdoc.

SolverError

Terminal LP solve error returned after all retry attempts are exhausted. Six variants correspond to six failure categories:

| Variant | Hard stop? | Diagnostic |
|---|---|---|
| Infeasible | Yes | No |
| Unbounded | Yes | No |
| NumericalDifficulty | No | Yes |
| TimeLimitExceeded | No | Yes |
| IterationLimit | No | Yes |
| InternalError | Yes | No |

Infeasible and Unbounded are unit variants (no fields). NumericalDifficulty carries a message, TimeLimitExceeded carries elapsed_seconds, and IterationLimit carries iterations. InternalError carries message and an optional error_code. See the SolverError rustdoc.
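
A sketch of the variant shapes and the hard-stop classification described above; the enum here is illustrative, not the crate's actual definition:

```rust
// Illustrative reconstruction of the six variants and their fields.
#[derive(Debug)]
enum SolverError {
    Infeasible,
    Unbounded,
    NumericalDifficulty { message: String },
    TimeLimitExceeded { elapsed_seconds: f64 },
    IterationLimit { iterations: u64 },
    InternalError { message: String, error_code: Option<i32> },
}

/// Hard stops abort the run; the remaining variants carry
/// diagnostics a caller may log before adjusting settings.
fn is_hard_stop(err: &SolverError) -> bool {
    matches!(
        err,
        SolverError::Infeasible | SolverError::Unbounded | SolverError::InternalError { .. }
    )
}
```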

SolverStatistics

Accumulated solve metrics for one solver instance: solve_count, success_count, failure_count, total_iterations, retry_count, total_solve_time_seconds, and basis_rejections. All counters grow monotonically from zero. reset() does not zero them — statistics persist for the lifetime of the solver instance and are aggregated across threads after iterative solving completes. See the SolverStatistics rustdoc.

HiGHS backend (HighsSolver)

Construction

pub fn new() -> Result<Self, SolverError>

HighsSolver::new() allocates a HiGHS handle via cobre_highs_create() and applies seven performance-tuned default options before returning:

| Option | Value | Rationale |
|---|---|---|
| solver | "simplex" | Simplex is faster than IPM for warm-started LPs |
| simplex_strategy | 4 | Dual simplex; performs well on LP sequences |
| presolve | "off" | Avoid presolve overhead on repeated small LPs |
| parallel | "off" | Each thread owns one solver; no internal threads |
| output_flag | false | Suppress HiGHS console output |
| primal_feasibility_tolerance | 1e-7 | Tighter than HiGHS default for numerical stability |
| dual_feasibility_tolerance | 1e-7 | Same |

If HiGHS handle creation or any option call fails, the handle is destroyed before returning Err(SolverError::InternalError { .. }).

5-level retry escalation

When HiGHS returns SOLVE_ERROR or UNKNOWN (not a definitive terminal status), HighsSolver::solve escalates through five retry levels before giving up:

| Level | Action |
|---|---|
| 0 | Clear the cached basis and factorization (clear_solver) |
| 1 | Enable presolve (presolve = "on") |
| 2 | Switch to primal simplex (simplex_strategy = 1) |
| 3 | Relax feasibility tolerances (primal and dual to 1e-6) |
| 4 | Switch to interior point method (solver = "ipm") |

The first level that returns OPTIMAL exits the loop. If a definitive terminal status (INFEASIBLE, UNBOUNDED, TIME_LIMIT, ITERATION_LIMIT) is reached during a retry level, the loop exits immediately with the corresponding SolverError variant. If all five levels are exhausted without a result, the method returns SolverError::NumericalDifficulty. Default settings are restored unconditionally after the retry loop, regardless of outcome, so subsequent calls see the standard configuration.

The retry sequence is entirely internal — the caller of solve never sees intermediate failures, only the final Ok(SolutionView) or Err(SolverError).
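
A toy model of the escalation ladder, with the settings mocked as strings and the solve outcome injected; the real implementation applies HiGHS options and re-solves at each level:

```rust
// succeeds_at_level injects the level at which the mocked solve
// would return OPTIMAL (None = all levels exhausted).
#[derive(Clone, Copy, PartialEq, Debug)]
enum Status {
    Optimal,
    Unknown,
}

fn solve_with_escalation(succeeds_at_level: Option<usize>) -> Result<usize, &'static str> {
    const LEVELS: [&str; 5] = [
        "clear_solver",
        "presolve=on",
        "simplex_strategy=1",
        "tolerances=1e-6",
        "solver=ipm",
    ];
    let mut result = Err("NumericalDifficulty");
    for (level, _setting) in LEVELS.iter().enumerate() {
        // Apply the level's setting, then re-solve (mocked here).
        let status = if succeeds_at_level == Some(level) {
            Status::Optimal
        } else {
            Status::Unknown
        };
        if status == Status::Optimal {
            result = Ok(level);
            break;
        }
    }
    // Default settings are restored unconditionally in the real code,
    // regardless of which branch produced the result.
    result
}
```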

Dual normalization

HiGHS row duals are already in the canonical Cobre sign convention: a positive dual on a <= constraint means increasing the RHS increases the objective. HighsSolver::extract_solution copies row_dual directly into LpSolution.dual without negation. The col_dual from HiGHS is the reduced cost vector and is placed in LpSolution.reduced_costs.

Warm-start basis management

solve_with_basis loads the Basis status codes directly into HiGHS via Highs_setBasis. When the saved basis has fewer rows than the current LP (because new dynamic constraint rows were added since the basis was extracted), the extra rows are filled with the HiGHS “Basic” status code (1). When the saved basis has more rows than the current LP, the extra entries are truncated. If HiGHS rejects the basis (returns HIGHS_STATUS_ERROR from Highs_setBasis), the method falls back to a cold-start solve and increments SolverStatistics.basis_rejections. After setting the basis, solve_with_basis delegates to solve(), which handles the retry escalation sequence.

SoA bound patching (DEC-019)

The set_row_bounds and set_col_bounds methods take three separate slices:

fn set_row_bounds(&mut self, indices: &[usize], lower: &[f64], upper: &[f64]);
fn set_col_bounds(&mut self, indices: &[usize], lower: &[f64], upper: &[f64]);

This is a Structure of Arrays (SoA) signature. The alternative — a single slice of (usize, f64, f64) tuples (Array of Structures, AoS) — would require the caller to convert from its natural SoA representation before the call, and the HiGHS C API (Highs_changeRowsBoundsBySet) would then expect SoA again, producing a double conversion on the hottest solver path.

DEC-019 documents the rationale: the calling algorithm naturally holds separate index, lower-bound, and upper-bound arrays; the C API expects separate arrays; so the trait signature matches both, eliminating any intermediate conversion. The performance impact is meaningful because bound patching happens at every scenario realization, which occurs on the innermost loop of iterative LP solving.
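
The double conversion the SoA signature avoids looks like this; aos_to_soa is a hypothetical helper showing what every call site would otherwise pay before the C API converted back again:

```rust
// Split an AoS update list into the three separate arrays that
// both the caller and the HiGHS C API naturally work with.
fn aos_to_soa(updates: &[(usize, f64, f64)]) -> (Vec<usize>, Vec<f64>, Vec<f64>) {
    let mut indices = Vec::with_capacity(updates.len());
    let mut lower = Vec::with_capacity(updates.len());
    let mut upper = Vec::with_capacity(updates.len());
    for &(i, lo, up) in updates {
        indices.push(i);
        lower.push(lo);
        upper.push(up);
    }
    (indices, lower, upper)
}
```

With the SoA trait signature, none of this allocation or copying happens: the caller's existing arrays are passed through untouched.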

Usage example

The following shows the complete LP rebuild sequence for one stage: load the structural model, append active constraint rows, patch scenario-specific row bounds, solve, and extract the basis for the next iteration.

use cobre_solver::{
    Basis, HighsSolver, LpSolution, RowBatch, SolverError,
    SolverInterface, StageTemplate,
};

fn solve_stage(
    solver: &mut HighsSolver,
    template: &StageTemplate,
    cuts: &RowBatch,
    row_indices: &[usize],
    lower: &[f64],
    upper: &[f64],
    cached_basis: Option<&Basis>,
    basis_buf: &mut Basis,
) -> Result<LpSolution, SolverError> {
    // Step 1: load structural LP (replaces any prior model).
    solver.load_model(template);

    // Step 2: append active constraint rows.
    solver.add_rows(cuts);

    // Step 3: patch row bounds for this scenario realization.
    solver.set_row_bounds(row_indices, lower, upper);

    // Step 4: solve, optionally warm-starting from a cached basis.
    let view = match cached_basis {
        Some(basis) => solver.solve_with_basis(basis)?,
        None => solver.solve()?,
    };

    // Step 5: copy the zero-copy view into an owned solution.
    let solution = view.to_owned();

    // Step 6: extract basis into the caller-owned buffer for warm-starting.
    solver.get_basis(basis_buf);

    Ok(solution)
}

fn main() -> Result<(), SolverError> {
    let mut solver = HighsSolver::new()?;
    assert_eq!(solver.name(), "HiGHS");

    // Print cumulative statistics after a run.
    let stats = solver.statistics();
    println!(
        "solves={} successes={} retries={}",
        stats.solve_count, stats.success_count, stats.retry_count
    );

    Ok(())
}

Build requirements

Git submodule

HiGHS is vendored as a git submodule at crates/cobre-solver/vendor/HiGHS/. Before building cobre-solver for the first time (or after a fresh clone), initialize the submodule:

git submodule update --init --recursive

The build script checks for crates/cobre-solver/vendor/HiGHS/CMakeLists.txt and panics with a clear error message if the submodule is not initialized.

System dependencies

| Dependency | Minimum version | Notes |
|---|---|---|
| cmake | 3.15 | Required by the HiGHS build system |
| C compiler | C11 | gcc or clang; HiGHS and the C wrapper are C/C++ |
| C++ compiler | C++17 | Required by HiGHS internals |
| zlib | any | Not needed — disabled via CMAKE_DISABLE_FIND_PACKAGE_ZLIB |

Feature flags

| Feature | Default | Description |
|---|---|---|
| highs | yes | Enables the HiGHS backend and the build script |

Without the highs feature, only SolverInterface, the type definitions, and the ffi module stubs are compiled. The HighsSolver struct is not available. Additional solver backends (CLP, commercial solvers) are planned behind their own feature flags but are not yet implemented.

Testing

Running the test suite

cargo test -p cobre-solver --features highs

This requires cmake, a C/C++ compiler, and an initialized crates/cobre-solver/vendor/HiGHS/ submodule (see Build requirements).

Conformance suite (tests/conformance.rs)

The integration test file tests/conformance.rs implements the backend-agnostic conformance contract from the Solver Interface Testing spec. It verifies the SolverInterface contract using only the public API against the HighsSolver concrete type. The fixture LP is a 3-variable, 2-constraint minimization problem (the SS1.1 fixture) with known optimal solution (x0=6, x1=0, x2=2, obj=100.0).

The conformance suite covers:

  • load_model loads a structural LP and produces the expected objective and primal values on solve.
  • load_model fully replaces a previous model when called a second time.
  • add_rows appends constraint rows without altering structural rows.
  • set_row_bounds patches bounds and the re-solve reflects the new bounds.
  • solve_with_basis warm-starts successfully and returns the correct optimal solution.
  • get_basis returns a basis with the correct column and row count after a successful solve.
  • statistics counters increment correctly across solve calls.
  • reset clears model state, allowing load_model to be called again cleanly.

Unit tests

src/highs.rs and src/types.rs carry #[cfg(test)] unit tests covering individual methods in isolation. In addition, a NoopSolver in src/trait_def.rs verifies that SolverInterface compiles as a generic bound and satisfies the Send requirement.

cobre-comm

experimental

cobre-comm is the pluggable communication backend abstraction for the Cobre ecosystem. It defines the Communicator and SharedMemoryProvider traits that decouple distributed computations from specific communication technologies, allowing solver crates to run unchanged in single-process, MPI-distributed, and future TCP or shared-memory configurations.

The crate currently provides two concrete backends:

  • local — single-process backend, always available, zero overhead, zero external dependencies.
  • mpi — MPI backend via ferrompi, feature-gated behind features = ["mpi"].

Two additional backend slots are reserved for future implementation:

  • tcp — TCP/IP coordinator pattern (no MPI required).
  • shm — POSIX shared memory for single-node multi-process execution.

The factory function create_communicator selects the backend at startup based on Cargo feature flags and an optional environment variable override. Downstream solver crates depend on the Communicator trait through a generic type parameter — never on a concrete backend type.

Module overview

| Module | Purpose |
|---|---|
| traits | Core trait definitions: Communicator, SharedMemoryProvider, SharedRegion, CommData, LocalCommunicator |
| types | Shared types: ReduceOp, CommError, BackendError |
| local | LocalBackend (single-process) and HeapRegion (heap-backed shared region) |
| ferrompi | FerrompiBackend — MPI backend (only compiled with features = ["mpi"]) |
| factory | create_communicator, BackendKind, CommBackend, available_backends |

Communicator trait

pub trait Communicator: Send + Sync { ... }

The trait provides the six operations used during distributed computations: four collective operations and two infallible accessor methods. The trait is intentionally not object-safe — it carries generic methods (allgatherv<T>, allreduce<T>, broadcast<T>) that require static dispatch. This is the same monomorphization pattern used by SolverInterface in cobre-solver: callers parameterize a generic function once and the compiler generates one concrete instantiation per backend.

Since a Cobre binary uses exactly one communicator backend (MPI for distributed execution, LocalBackend for single-process mode), the binary contains only one instantiation per generic call site. The performance benefit is meaningful: LocalBackend’s no-op implementations compile to zero instructions after inlining.

Method summary

| Method | Signature | Description |
|---|---|---|
| allgatherv | (&self, send, recv, counts, displs) -> Result<(), CommError> | Gather variable-length data from all ranks into all ranks |
| allreduce | (&self, send, recv, op: ReduceOp) -> Result<(), CommError> | Element-wise reduction (sum, min, or max) across all ranks |
| broadcast | (&self, buf, root: usize) -> Result<(), CommError> | Copy data from the root rank to all other ranks |
| barrier | (&self) -> Result<(), CommError> | Block until all ranks have entered; pure synchronization |
| rank | (&self) -> usize | Return this rank's index (0..size); infallible |
| size | (&self) -> usize | Return total number of ranks; infallible |

Design: compile-time static dispatch (DEC-001)

Writing Box<dyn Communicator> does not compile — the trait is intentionally not object-safe. All callers use a generic type parameter:

use cobre_comm::{Communicator, CommError};

fn print_topology<C: Communicator>(comm: &C) {
    println!("rank {} of {}", comm.rank(), comm.size());
}

This is the mandated pattern for closed variant sets in Cobre (DEC-001). The dispatch overhead for CommBackend is a single branch-predictor-friendly integer comparison, negligible compared to the cost of the MPI collective operation or LP solve it wraps.

Thread safety

Communicator requires Send + Sync. All collective methods take &self (shared reference). Callers are responsible for serializing concurrent calls — the training loop ensures that multiple threads never invoke the same collective simultaneously on the same communicator instance. rank() and size() are safe to call concurrently: their values are cached at construction time and never change.

SharedMemoryProvider trait

pub trait SharedMemoryProvider: Send + Sync { ... }

SharedMemoryProvider is a companion trait to Communicator for managing intra-node shared memory regions. It is a separate trait rather than a supertrait of Communicator, which preserves flexibility: not all backends support true shared memory. Functions that only need collective communication use C: Communicator; functions that additionally need shared memory use C: Communicator + SharedMemoryProvider.

HeapRegion — the minimal viable region type

For the minimal viable implementation, all backends use HeapRegion<T> as their SharedMemoryProvider::Region<T> type. HeapRegion<T> is a thin wrapper around Vec<T>: each rank holds its own private heap allocation with no actual memory sharing between processes. The three-phase lifecycle (allocation, population, read-only) degenerates to simple Vec operations, with fence() a no-op.

True shared memory via MPI windows or POSIX shared memory segments is planned for a future optimization phase.
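
A minimal sketch of the heap-backed lifecycle described above, assuming a simplified API (allocate, as_mut_slice, fence, as_slice); the crate's actual trait surface may differ:

```rust
// Illustrative heap-backed region: each rank privately owns a Vec,
// so the three-phase lifecycle degenerates to plain Vec operations.
struct HeapRegion<T> {
    data: Vec<T>,
}

impl<T: Default + Clone> HeapRegion<T> {
    /// Phase 1: allocation.
    fn allocate(len: usize) -> Self {
        HeapRegion { data: vec![T::default(); len] }
    }

    /// Phase 2: population.
    fn as_mut_slice(&mut self) -> &mut [T] {
        &mut self.data
    }

    /// No-op for a private heap allocation; a true shared-memory
    /// region would publish writes to other processes here.
    fn fence(&self) {}

    /// Phase 3: read-only access.
    fn as_slice(&self) -> &[T] {
        &self.data
    }
}
```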

LocalCommunicator — object-safe intra-node coordination

LocalCommunicator is a purpose-built object-safe sub-trait that exposes only the three non-generic methods needed for intra-node initialization coordination:

use cobre_comm::LocalCommunicator;

fn determine_leader(local_comm: &dyn LocalCommunicator) -> bool {
    local_comm.rank() == 0
}

SharedMemoryProvider::split_local returns Box<dyn LocalCommunicator> — an intra-node communicator used only during initialization (leader/follower role assignment). Because this is an initialization-only operation far off the hot path, dynamic dispatch is the correct trade-off, and LocalCommunicator is the bridge that makes it possible without compromising the zero-cost static dispatch of the hot-path Communicator trait.

LocalBackend

pub struct LocalBackend;

LocalBackend is a zero-sized type (ZST) with no runtime state and no external dependencies. All collective operations use identity-copy or no-op semantics:

  • rank() always returns 0.
  • size() always returns 1.
  • allgatherv copies send into recv at the specified displacement (identity copy — with one rank, gather is trivial).
  • allreduce copies send to recv unchanged (reduction of a single operand is the identity).
  • broadcast is a no-op (data is already at the only rank).
  • barrier is a no-op (nothing to synchronize).

Because LocalBackend is a ZST, it occupies zero bytes at runtime and has no construction cost. Its collective method implementations compile to zero instructions after inlining in single-feature builds.

Example

use cobre_comm::{LocalBackend, Communicator, ReduceOp};

let comm = LocalBackend;
assert_eq!(comm.rank(), 0);
assert_eq!(comm.size(), 1);

// allreduce with one rank: identity copy regardless of op.
let send = vec![1.0_f64, 2.0, 3.0];
let mut recv = vec![0.0_f64; 3];
comm.allreduce(&send, &mut recv, ReduceOp::Sum).unwrap();
assert_eq!(recv, send);

LocalBackend also implements SharedMemoryProvider with HeapRegion<T> as the region type, and LocalCommunicator for use in intra-node initialization code.

FerrompiBackend

FerrompiBackend is the MPI backend, powered by the ferrompi crate. It is only compiled when features = ["mpi"] is specified:

# Cargo.toml
cobre-comm = { version = "0.1", features = ["mpi"] }

FerrompiBackend wraps a ferrompi::Mpi environment handle and an MPI_COMM_WORLD communicator. Construction calls MPI_Init_thread with ThreadLevel::Funneled, matching the Cobre execution model where only the main thread issues MPI calls. When FerrompiBackend is dropped, the RAII guard calls MPI_Finalize automatically.

FerrompiBackend requires an MPI runtime to be installed on the system. If no MPI runtime is found, FerrompiBackend::new() returns Err(BackendError::InitializationFailed).

The unsafe impl Send + Sync on FerrompiBackend reflects the fact that ferrompi::Mpi is !Send + !Sync by default (using a PhantomData<*const ()> marker), but the Cobre RAII pattern guarantees that construction and finalization happen on the same thread, making the impl sound.

Factory function: create_communicator

pub fn create_communicator() -> Result<impl Communicator, BackendError>

create_communicator is the single entry point for constructing a communicator at startup. It selects the backend according to:

  1. The COBRE_COMM_BACKEND environment variable (runtime override).
  2. The Cargo features compiled into the binary (auto-detection).
  3. A fallback to LocalBackend when no distributed backend is available or detected.

BackendKind enum

BackendKind is provided for library-mode callers (such as cobre-python or cobre-mcp) that need to select a backend programmatically rather than through environment variables:

| Variant | Behavior |
|---|---|
| BackendKind::Auto | Let the factory choose the best available backend (default) |
| BackendKind::Mpi | Request the MPI backend; fails if mpi feature is not compiled in |
| BackendKind::Local | Always use LocalBackend, even when MPI is available |

COBRE_COMM_BACKEND environment variable

| Value | Behavior |
|---|---|
| (unset) | Auto-detect: MPI if MPI launcher env vars are present, otherwise LocalBackend |
| "auto" | Same as unset |
| "mpi" | Use FerrompiBackend; fails if mpi feature is not compiled in |
| "local" | Always use LocalBackend |
| "tcp" | Reserved; returns BackendNotAvailable (no implementation yet) |
| "shm" | Reserved; returns BackendNotAvailable (no implementation yet) |

Auto-detection checks for the presence of MPI launcher environment variables (PMI_RANK, PMI_SIZE, OMPI_COMM_WORLD_RANK, OMPI_COMM_WORLD_SIZE, MPI_LOCALRANKID, SLURM_PROCID). If any of these is set, the factory attempts to initialize the MPI backend.
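
The detection rule can be sketched with the environment lookup injected as a closure, which makes the policy itself testable; in production one would pass |v| std::env::var_os(v).is_some():

```rust
/// Returns true if any of the known MPI/SLURM launcher variables
/// is reported as set by the supplied lookup.
fn mpi_launch_detected(is_set: impl Fn(&str) -> bool) -> bool {
    const LAUNCHER_VARS: [&str; 6] = [
        "PMI_RANK",
        "PMI_SIZE",
        "OMPI_COMM_WORLD_RANK",
        "OMPI_COMM_WORLD_SIZE",
        "MPI_LOCALRANKID",
        "SLURM_PROCID",
    ];
    LAUNCHER_VARS.iter().any(|v| is_set(v))
}
```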

Example

use cobre_comm::{create_communicator, Communicator};

// With COBRE_COMM_BACKEND unset (auto-detect):
// - returns FerrompiBackend if launched via mpirun/mpiexec
// - returns LocalBackend otherwise
let comm = create_communicator().expect("backend selection failed");
println!("rank {} of {}", comm.rank(), comm.size());

When distributed features are compiled in, create_communicator returns a CommBackend enum that delegates each method call to the active concrete backend via a match. When no distributed features are compiled in, it returns LocalBackend directly.

CommBackend enum

CommBackend is the enum-dispatched communicator wrapper present in builds where at least one distributed backend feature (mpi, tcp, or shm) is compiled in. It implements both Communicator and SharedMemoryProvider by delegating each method to the active inner backend:

use cobre_comm::{create_communicator, Communicator};

// With COBRE_COMM_BACKEND=local, the factory returns CommBackend::Local.
let comm = create_communicator().expect("backend selection failed");
let send = [42.0_f64];
let mut recv = [0.0_f64];
comm.allgatherv(&send, &mut recv, &[1], &[0]).unwrap();
assert_eq!(recv[0], 42.0);

Error types

CommError

Returned by all fallible methods on Communicator and SharedMemoryProvider.

| Variant | When it occurs |
|---|---|
| CollectiveFailed | An MPI collective operation failed at the library level (carries MPI error code and description) |
| InvalidBufferSize | Buffer sizes provided to a collective are inconsistent (e.g., recv.len() < sum(counts) in allgatherv, or send.len() != recv.len() in allreduce) |
| InvalidRoot | The root rank argument is out of range (root >= size()) |
| InvalidCommunicator | The communicator is in an invalid state (e.g., MPI has been finalized) |
| AllocationFailed | A shared memory allocation request was rejected by the OS (size too large, insufficient permissions, or system limits exceeded) |
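
As an illustration, the allgatherv size precondition reduces to a check like the following (hypothetical helper with a simplified error type; the real code returns CommError::InvalidBufferSize):

```rust
/// The receive buffer must hold at least sum(counts) elements.
fn validate_allgatherv(recv_len: usize, counts: &[usize]) -> Result<(), String> {
    let needed: usize = counts.iter().sum();
    if recv_len < needed {
        return Err(format!(
            "InvalidBufferSize: recv has {recv_len} elements, needs {needed}"
        ));
    }
    Ok(())
}
```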

BackendError

Returned by create_communicator when the backend cannot be selected or initialized.

| Variant | When it occurs |
|---|---|
| BackendNotAvailable | The requested backend is not compiled into this binary (e.g., COBRE_COMM_BACKEND=mpi without the mpi feature) |
| InvalidBackend | The COBRE_COMM_BACKEND value does not match any known backend name |
| InitializationFailed | The backend was correctly selected but failed to initialize (e.g., MPI runtime not installed) |
| MissingConfiguration | Required environment variables for the selected backend are not set (relevant for future tcp/shm backends) |

Deferred features

The following features are planned but not yet implemented:

  • TCP backend ("tcp" feature): a TCP/IP coordinator pattern for distributed execution without requiring an MPI installation. Will follow the same Communicator trait interface.
  • Shared memory backend ("shm" feature): POSIX shared memory for single-node multi-process execution with zero inter-process copy overhead. Will implement SharedMemoryProvider using POSIX shared memory segments or MPI shared windows rather than the current HeapRegion semantics.

Feature flags

| Feature | Default | Description |
|---|---|---|
| mpi | no | Enables FerrompiBackend and the ferrompi dependency |
| tcp | no | Reserved for the future TCP backend (no implementation yet) |
| shm | no | Reserved for the future shared memory backend |

Without any feature flags, only LocalBackend, the trait definitions, and the type definitions are compiled. create_communicator returns LocalBackend directly (not wrapped in CommBackend).

Testing

Running the test suite

cargo test -p cobre-comm

This runs all unit, integration, and doc-tests for the default (no-feature) configuration. No MPI installation is required.

To run the full test suite including the MPI backend:

cargo test -p cobre-comm --features mpi

This requires an MPI runtime (libmpich-dev on Debian/Ubuntu, mpich on Fedora or macOS Homebrew). CI runs tests without the mpi feature by default; the MPI feature tests require a manual setup with an MPI installation.

Conformance suite (tests/conformance.rs)

The integration test file tests/conformance.rs implements the backend-agnostic conformance contract. It verifies the Communicator contract using only the public API against the LocalBackend concrete type. The conformance suite covers:

  • rank() returns 0 and size() returns 1 for single-process mode.
  • allgatherv copies send into recv at the correct displacement.
  • allreduce copies send to recv unchanged (identity for a single rank), for all three ReduceOp variants.
  • broadcast is a no-op for root == 0.
  • barrier returns Ok(()).
  • Buffer precondition violations return the correct CommError variants.
  • HeapRegion lifecycle: allocation, write via as_mut_slice, fence, and read via as_slice.
  • CommBackend::Local delegates all Communicator and SharedMemoryProvider methods correctly.

Design notes

Enum dispatch (DEC-001)

CommBackend uses enum dispatch rather than Box<dyn Communicator>. The Communicator trait carries generic methods that make it intentionally not object-safe. Enum dispatch is the mandated pattern for closed variant sets in Cobre (DEC-001): a single match arm delegates each method to the inner concrete type. The overhead is a single branch-predictor-friendly integer comparison per call, which is negligible compared to the cost of the underlying MPI collective or LP solve.

CommData conditional supertrait

The CommData marker trait — required for all types transmitted through collective operations — has a conditional supertrait:

  • With mpi feature: CommData additionally requires ferrompi::MpiDatatype, narrowing the set of valid types to the seven primitives that MPI can transmit directly (f32, f64, i32, i64, u8, u32, u64).
  • Without mpi feature: CommData accepts all Copy + Send + Sync + Default + 'static types, including bool and tuples used in tests.

This design avoids an extra bound on every method signature: FerrompiBackend can delegate directly to ferrompi’s generic FFI methods because the MpiDatatype constraint is already satisfied by CommData.
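
The conditional-supertrait pattern can be sketched as follows; the feature name "mpi" matches cobre-comm, MpiDatatype stands in for ferrompi's trait, and the blanket impl shows the no-feature case:

```rust
// With the mpi feature, transmissible types must also be MPI
// datatypes; without it, any plain-old-data type qualifies.
#[cfg(feature = "mpi")]
trait CommData: Copy + Send + Sync + Default + 'static + MpiDatatype {}

#[cfg(not(feature = "mpi"))]
trait CommData: Copy + Send + Sync + Default + 'static {}

// Blanket impl for the no-feature case: anything meeting the
// bounds is transmissible, including bool for tests.
#[cfg(not(feature = "mpi"))]
impl<T: Copy + Send + Sync + Default + 'static> CommData for T {}

// A collective-style function needs only the CommData bound;
// no extra MpiDatatype bound appears in the signature.
fn element_count<T: CommData>(buf: &[T]) -> usize {
    buf.len()
}
```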

cfg-gate strategy

Backend modules and types are compiled only when their feature is enabled. The CommBackend enum is only present when at least one distributed feature (mpi, tcp, or shm) is compiled in — builds without distributed features use LocalBackend directly. This ensures that single-process builds have no code-size cost from unused backends.

cobre-sddp

experimental

cobre-sddp implements the Stochastic Dual Dynamic Programming (SDDP) algorithm (Pereira & Pinto, 1991) for long-term hydrothermal dispatch and energy planning. It is the first algorithm vertical in the Cobre ecosystem: a training loop that iteratively improves a piecewise-linear approximation of the value function for multi-stage stochastic linear programs.

For the mathematical foundations — including the Benders decomposition, cut coefficient derivation, and risk measure theory — see the methodology reference.

This crate depends on cobre-core for system data types, cobre-stochastic for inflow scenario generation, cobre-solver for LP subproblem solving, and cobre-comm for distributed communication.

Iteration lifecycle

Each training iteration follows a fixed eight-step sequence. The ordering reflects the correction introduced in the lower bound plan fix (F-019): the lower bound is evaluated after the backward pass and cut synchronization, not during forward synchronization.

┌─────────────────────────────────────────────────────────────────────────┐
│  Step 1  Forward pass                                                   │
│          Each rank simulates config.forward_passes scenarios through     │
│          all stages, solving the LP at each (scenario, stage) pair with  │
│          the current FCF approximation.                                  │
├─────────────────────────────────────────────────────────────────────────┤
│  Step 2  Forward sync                                                   │
│          allreduce (sum + broadcast) aggregates local UB statistics into │
│          a global mean, standard deviation, and 95% CI half-width.      │
├─────────────────────────────────────────────────────────────────────────┤
│  Step 3  State exchange                                                 │
│          allgatherv gathers all ranks' trial point state vectors so     │
│          every rank can solve the backward pass at ALL trial points.    │
├─────────────────────────────────────────────────────────────────────────┤
│  Step 4  Backward pass                                                  │
│          Sweeps stages T-2 down to 0, solving the successor LP under    │
│          every opening from the fixed tree, extracting LP duals to form  │
│          Benders cut coefficients, and inserting one cut per trial point  │
│          per stage into the Future Cost Function (FCF).                  │
├─────────────────────────────────────────────────────────────────────────┤
│  Step 5  Cut sync                                                       │
│          allgatherv shares each rank's newly generated cuts so that all  │
│          ranks maintain an identical FCF at the end of each iteration.  │
│                                                                         │
│  Step 5a Cut selection (optional)                                       │
│          When a CutSelectionStrategy is configured, inactive cuts are   │
│          pruned from the pool at multiples of check_frequency.          │
│                                                                         │
│  Step 5b LB evaluation                                                  │
│          Rank 0 solves the stage-0 LP for every opening in the tree    │
│          and aggregates the objectives via the stage-0 risk measure.    │
│          The scalar lower bound is broadcast to all ranks.              │
├─────────────────────────────────────────────────────────────────────────┤
│  Step 6  Convergence check                                              │
│          The ConvergenceMonitor updates bound statistics and evaluates   │
│          the configured stopping rules to determine whether to stop.    │
├─────────────────────────────────────────────────────────────────────────┤
│  Step 7  Checkpoint (deferred)                                          │
│          Periodic FCF checkpointing is planned for Phase 7. The MVP     │
│          does not write intermediate checkpoints.                       │
├─────────────────────────────────────────────────────────────────────────┤
│  Step 8  Event emission                                                 │
│          TrainingEvent values are sent to the optional event channel    │
│          for real-time monitoring by the CLI or TUI layer.              │
└─────────────────────────────────────────────────────────────────────────┘

The convergence gap is computed as:

gap = (UB - LB) / max(1.0, |UB|)

The max(1.0, |UB|) guard keeps the denominator bounded away from zero, so the relative gap remains well defined even when the upper bound is at or near zero.
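The formula transcribes directly:

```rust
// Convergence gap as defined above: relative when |UB| >= 1, absolute otherwise.
fn convergence_gap(ub: f64, lb: f64) -> f64 {
    (ub - lb) / f64::max(1.0, ub.abs())
}

fn main() {
    // Typical case: denominator is |UB|.
    assert!((convergence_gap(110.0, 100.0) - 10.0 / 110.0).abs() < 1e-12);
    // Near-zero UB: denominator clamps to 1.0 instead of blowing up.
    assert!((convergence_gap(0.001, -0.002) - 0.003).abs() < 1e-12);
}
```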

Module overview

| Module | Responsibility |
|---|---|
| training | train: the top-level loop orchestrator; wires all steps together |
| forward | run_forward_pass, sync_forward: step 1 and step 2 |
| state_exchange | ExchangeBuffers: step 3 allgatherv of trial point state vectors |
| backward | run_backward_pass: step 4 Benders cut generation across all trial points |
| cut_sync | CutSyncBuffers: step 5 allgatherv of new cut wire records |
| cut_selection | CutSelectionStrategy, CutMetadata, DeactivationSet: step 5a pool pruning |
| lower_bound | evaluate_lower_bound: step 5b risk-adjusted LB computation |
| convergence | ConvergenceMonitor: step 6 bound tracking and stopping rule evaluation |
| cut | CutPool, FutureCostFunction, CutWireHeader: cut data structures and wire format |
| config | TrainingConfig: algorithm parameters |
| stopping_rule | StoppingRule, StoppingRuleSet, MonitorState: termination criteria |
| risk_measure | RiskMeasure, BackwardOutcome: risk-neutral and CVaR aggregation |
| horizon_mode | HorizonMode: finite vs. cyclic stage traversal (only Finite in MVP) |
| indexer | StageIndexer: LP column and row offset arithmetic for stage subproblems |
| lp_builder | PatchBuffer, ar_dynamics_row_offset: row-bound patch arrays for LP solves |
| trajectory | TrajectoryRecord: forward pass LP solution record (primal, dual, state, cost) |
| error | SddpError: unified error type aggregating solver, comm, stochastic, and I/O errors |

Configuration

TrainingConfig

TrainingConfig controls the training loop parameters. All fields are public and must be set explicitly — there is no Default implementation, preventing silent misconfigurations.

| Field | Type | Description |
|---|---|---|
| forward_passes | u32 | Scenarios per rank per iteration (must be >= 1) |
| max_iterations | u64 | Safety bound on total iterations; also sizes the cut pool |
| checkpoint_interval | Option&lt;u64&gt; | Write checkpoint every N iterations; None = disabled |
| warm_start_cuts | u32 | Pre-loaded cuts from a policy file |
| event_sender | Option&lt;Sender&lt;TrainingEvent&gt;&gt; | Channel for real-time monitoring events; None = silent |
use cobre_sddp::TrainingConfig;

let config = TrainingConfig {
    forward_passes: 10,
    max_iterations: 500,
    checkpoint_interval: Some(50),
    warm_start_cuts: 0,
    event_sender: None,
};

StoppingRuleSet

The stopping rule set composes one or more termination criteria. Every set must include an IterationLimit rule as a safety bound against infinite loops.

| Rule variant | Trigger condition |
|---|---|
| IterationLimit | iteration >= limit |
| TimeLimit | wall_time_seconds >= seconds |
| BoundStalling | Relative LB improvement over a sliding window falls below tolerance |
| SimulationBased | Periodic Monte Carlo simulation costs stabilize |
| GracefulShutdown | External SIGTERM / SIGINT received (always evaluated first) |

The mode field controls how multiple rules combine:

  • StoppingMode::Any (OR): stop when any rule triggers.
  • StoppingMode::All (AND): stop when all rules trigger simultaneously.
use cobre_sddp::stopping_rule::{StoppingMode, StoppingRule, StoppingRuleSet};

let stopping_rules = StoppingRuleSet {
    rules: vec![
        StoppingRule::IterationLimit { limit: 500 },
        StoppingRule::BoundStalling {
            tolerance: 0.001,
            iterations: 20,
        },
        StoppingRule::GracefulShutdown,
    ],
    mode: StoppingMode::Any,
};
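The Any/All semantics reduce to a one-line combinator; the sketch below uses stand-in types (the real evaluation also passes MonitorState to each rule):

```rust
// Minimal sketch of the StoppingMode combination semantics.
enum StoppingMode { Any, All }

fn should_stop(triggered: &[bool], mode: &StoppingMode) -> bool {
    match mode {
        StoppingMode::Any => triggered.iter().any(|&t| t), // OR: any rule suffices
        StoppingMode::All => triggered.iter().all(|&t| t), // AND: all must agree
    }
}

fn main() {
    // One of two rules triggered:
    assert!(should_stop(&[false, true], &StoppingMode::Any));
    assert!(!should_stop(&[false, true], &StoppingMode::All));
}
```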

RiskMeasure

RiskMeasure controls how per-opening backward pass outcomes are aggregated into Benders cuts and how the lower bound is computed.

| Variant | Description |
|---|---|
| Expectation | Risk-neutral expected value. Weights equal opening probabilities. |
| CVaR | Convex combination (1 - λ)·E[Z] + λ·CVaR_α[Z]. alpha ∈ (0, 1], lambda ∈ [0, 1]. |

With CVaR, alpha = 1 makes the CVaR term equal to the expectation, and lambda = 0 removes it entirely, so either setting reduces to Expectation. One RiskMeasure value is assigned per stage from the stages.json configuration field risk_measure.
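For equally probable outcomes, one common discrete CVaR estimator averages the worst ceil(alpha·n) outcomes; under that convention the equivalences above fall out directly. This is a sketch of the aggregation formula, not the crate's implementation:

```rust
// (1 - lambda) * E[Z] + lambda * CVaR_alpha[Z] for equally probable cost
// outcomes, with CVaR estimated as the mean of the worst ceil(alpha*n)
// outcomes. Assumes a non-empty sample.
fn cvar(costs: &[f64], alpha: f64) -> f64 {
    let mut sorted = costs.to_vec();
    sorted.sort_by(|a, b| b.partial_cmp(a).unwrap()); // worst (largest) first
    let k = ((alpha * costs.len() as f64).ceil() as usize).clamp(1, costs.len());
    sorted[..k].iter().sum::<f64>() / k as f64
}

fn risk_adjusted(costs: &[f64], alpha: f64, lambda: f64) -> f64 {
    let mean = costs.iter().sum::<f64>() / costs.len() as f64;
    (1.0 - lambda) * mean + lambda * cvar(costs, alpha)
}

fn main() {
    let z = [1.0, 2.0, 3.0, 4.0];
    // alpha = 1: CVaR equals the mean, so any lambda yields Expectation.
    assert!((risk_adjusted(&z, 1.0, 0.7) - 2.5).abs() < 1e-12);
    // lambda = 0: the CVaR term drops out entirely.
    assert!((risk_adjusted(&z, 0.5, 0.0) - 2.5).abs() < 1e-12);
}
```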

CutSelectionStrategy

Cut selection is optional. When configured, it periodically prunes the cut pool to control memory growth during long training runs.

| Variant | Deactivation condition |
|---|---|
| Level1 | active_count <= threshold (never active; least aggressive) |
| Lml1 | iteration - last_active_iter > memory_window (outside time window) |
| Dominated | Dominated at all visited forward pass states (stub in MVP) |

All variants respect a check_frequency parameter: selection only runs at iterations that are multiples of check_frequency and never at iteration 0.

Key data structures

FutureCostFunction

The Future Cost Function (FCF) holds one CutPool per stage. Each CutPool is a pre-allocated flat array of cut slots. Cuts are inserted deterministically by (iteration, forward_pass_index) to guarantee bit-for-bit identical FCF state across all MPI ranks.

The FCF is built once before training begins. Total slot capacity is warm_start_cuts + max_iterations * forward_passes * num_ranks per stage.
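The sizing rule is a single product plus the warm-start offset:

```rust
// Per-stage cut slot capacity, per the sizing rule above.
fn fcf_slots_per_stage(
    warm_start_cuts: u64,
    max_iterations: u64,
    forward_passes: u64,
    num_ranks: u64,
) -> u64 {
    warm_start_cuts + max_iterations * forward_passes * num_ranks
}

fn main() {
    // 500 iterations, 10 passes per rank, 4 ranks, no warm start:
    assert_eq!(fcf_slots_per_stage(0, 500, 10, 4), 20_000);
}
```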

PatchBuffer

A PatchBuffer holds the three parallel arrays consumed by the LP solver’s set_row_bounds call. It is sized for N * (2 + L) patches, where N is the number of hydro plants and L is the maximum PAR order:

  • Category 1 [0, N) — storage-fixing: equality constraint at incoming storage.
  • Category 2 [N, N*(1+L)) — lag-fixing: equality constraint at AR lagged inflows.
  • Category 3 [N*(1+L), N*(2+L)) — noise-fixing: equality constraint at scenario noise.

The backward pass uses only categories 1 and 2 (fill_state_patches). The forward pass uses all three (fill_forward_patches).
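The category layout can be written as plain index arithmetic (a sketch; the crate's actual helper names may differ):

```rust
use std::ops::Range;

// Index ranges for the three patch categories, for n hydro plants and
// maximum PAR order l, matching the layout described above.
fn storage_patches(n: usize) -> Range<usize>         { 0..n }
fn lag_patches(n: usize, l: usize) -> Range<usize>   { n..n * (1 + l) }
fn noise_patches(n: usize, l: usize) -> Range<usize> { n * (1 + l)..n * (2 + l) }

fn main() {
    // 3 plants, PAR order 2: 3 storage, 6 lag, 3 noise patches = 12 total.
    assert_eq!(storage_patches(3), 0..3);
    assert_eq!(lag_patches(3, 2), 3..9);
    assert_eq!(noise_patches(3, 2), 9..12);
}
```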

ExchangeBuffers and CutSyncBuffers

Both types pre-allocate all communication buffers once at construction time and reuse them across all stages and iterations. This keeps the per-stage exchange allocation-free on the hot path.

ExchangeBuffers handles the state vector allgatherv (step 3):

  • Send buffer: local_count * n_state floats.
  • Receive buffer: local_count * num_ranks * n_state floats (rank-major order).

CutSyncBuffers handles the cut wire allgatherv (step 5):

  • Send buffer: max_cuts_per_rank * cut_wire_size(n_state) bytes.
  • Receive buffer: max_cuts_per_rank * num_ranks * cut_wire_size(n_state) bytes.
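The state exchange sizing reduces to two products (illustrative helpers, not the crate's API):

```rust
// allgatherv buffer lengths for the state exchange (step 3),
// per the sizing rules above.
fn state_send_len(local_count: usize, n_state: usize) -> usize {
    local_count * n_state
}

fn state_recv_len(local_count: usize, num_ranks: usize, n_state: usize) -> usize {
    local_count * num_ranks * n_state // rank-major order
}

fn main() {
    // 10 trial points per rank, 4 ranks, 5-dimensional state:
    assert_eq!(state_send_len(10, 5), 50);
    assert_eq!(state_recv_len(10, 4, 5), 200);
}
```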

Convergence monitoring

ConvergenceMonitor tracks bound statistics and evaluates stopping rules. It is constructed once before the loop begins and updated at the end of each iteration via update(lb, &sync_result).

use cobre_sddp::convergence::ConvergenceMonitor;
use cobre_sddp::forward::SyncResult;
use cobre_sddp::stopping_rule::{StoppingMode, StoppingRule, StoppingRuleSet};

let rule_set = StoppingRuleSet {
    rules: vec![StoppingRule::IterationLimit { limit: 100 }],
    mode: StoppingMode::Any,
};

let mut monitor = ConvergenceMonitor::new(rule_set);

let sync = SyncResult {
    global_ub_mean: 110.0,
    global_ub_std: 5.0,
    ci_95_half_width: 2.0,
    sync_time_ms: 10,
};

let (stop, results) = monitor.update(100.0, &sync);
assert!(!stop);
assert_eq!(monitor.iteration_count(), 1);
// gap = (110 - 100) / max(1.0, 110.0) = 10/110
assert!((monitor.gap() - 10.0 / 110.0).abs() < 1e-10);

Accessor methods on ConvergenceMonitor:

| Method | Returns |
|---|---|
| lower_bound() | Latest LB value |
| upper_bound() | Latest UB mean |
| upper_bound_std() | Latest UB standard deviation |
| ci_95_half_width() | Latest 95% CI half-width |
| gap() | Convergence gap: (UB - LB) / max(1.0, abs(UB)) |
| iteration_count() | Number of completed update calls |
| set_shutdown() | Signal a graceful shutdown before next update |

Event system

The training loop emits TrainingEvent values (from cobre-core) at each lifecycle step boundary when config.event_sender is Some. Events carry structured data for real-time display in the TUI or CLI layers.

Key events emitted during training:

| Event variant | When emitted |
|---|---|
| ForwardPassComplete | After step 1 completes for all local scenarios |
| ForwardSyncComplete | After step 2 global UB statistics are merged |
| BackwardPassComplete | After step 4 cut generation for all trial points |
| CutSyncComplete | After step 5 cut allgatherv |
| CutSelectionComplete | After step 5a pool pruning (when strategy is set) |
| LowerBoundEvaluated | After step 5b LB broadcast |
| IterationSummary | At the end of each iteration (LB, UB, gap, timing) |
| TrainingFinished | When a stopping rule triggers |

Quick start (pseudocode)

The following shows the shape of a train call. All arguments must be built from the upstream pipeline (cobre-io for system data, cobre-stochastic for the opening tree, cobre-solver for the LP solver instance).

use cobre_sddp::{
    FutureCostFunction, HorizonMode, RiskMeasure, StageIndexer,
    TrainingConfig, TrainingResult,
    stopping_rule::{StoppingMode, StoppingRule, StoppingRuleSet},
    train,
};

// Build the FCF for num_stages stages, n_state state dimensions,
// forward_passes scenarios per rank, max_iterations iterations.
let mut fcf = FutureCostFunction::new(num_stages, n_state, forward_passes, max_iterations, 0);

let config = TrainingConfig {
    forward_passes: 10,
    max_iterations: 500,
    checkpoint_interval: None,
    warm_start_cuts: 0,
    event_sender: None,
};

let stopping_rules = StoppingRuleSet {
    rules: vec![
        StoppingRule::IterationLimit { limit: 500 },
        StoppingRule::GracefulShutdown,
    ],
    mode: StoppingMode::Any,
};

let horizon = HorizonMode::Finite { num_stages };

let result: TrainingResult = train(
    &mut solver,        // SolverInterface impl (e.g., HiGHS)
    config,
    &mut fcf,
    &templates,         // one StageTemplate per stage
    &base_rows,         // AR dynamics base row index per stage
    &indexer,           // StageIndexer from StageIndexer::new(n_hydro, max_par_order)
    &initial_state,     // known initial storage volumes
    &opening_tree,      // from cobre_stochastic::build_stochastic_context
    &stochastic,        // StochasticContext
    &horizon,
    &risk_measures,     // one RiskMeasure per stage
    stopping_rules,
    None,               // no cut selection in this example
    None,               // no external shutdown flag
    &comm,              // Communicator (LocalBackend or FerrompiBackend)
)?;

println!(
    "Converged in {} iterations: LB={:.2}, UB={:.2}, gap={:.4}",
    result.iterations, result.final_lb, result.final_ub, result.final_gap
);

Error handling

All fallible operations return Result<T, SddpError>. The error type is Send + Sync + 'static and can be propagated across thread boundaries or wrapped by anyhow.

| SddpError variant | Trigger |
|---|---|
| Infeasible | LP has no feasible solution (stage, iteration, scenario) |
| Solver | LP solve failed for numerical or timeout reasons |
| Communication | MPI collective operation failed |
| Stochastic | Scenario generation or PAR model validation failed |
| Io | Case directory loading or validation failed |
| Validation | Algorithm configuration is semantically invalid |

Performance notes

Pre-allocation discipline

The training loop makes no heap allocations on the hot path inside the iteration loop. All workspace buffers are allocated once before the loop:

  • TrajectoryRecord flat vec: forward_passes * num_stages records.
  • PatchBuffer: N * (2 + L) entries.
  • ExchangeBuffers: local_count * num_ranks * n_state floats.
  • CutSyncBuffers: max_cuts_per_rank * num_ranks * cut_wire_size(n_state) bytes.

Cut wire format

The cut wire format used by CutSyncBuffers is a fixed-size record: 24 bytes of header (slot index, iteration, forward pass index, intercept) followed by n_state * 8 bytes of coefficients. The record size is cut_wire_size(n_state) = 24 + n_state * 8 bytes.
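The record-size formula transcribes directly:

```rust
// Record size for the fixed-size cut wire format described above.
fn cut_wire_size(n_state: usize) -> usize {
    24 + n_state * 8 // 24-byte header + one f64 coefficient per state dimension
}

fn main() {
    assert_eq!(cut_wire_size(0), 24);   // header only
    assert_eq!(cut_wire_size(10), 104); // 24 + 80 coefficient bytes
}
```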

Communication-free parallelism

Forward pass noise is generated without inter-rank communication. Each rank independently derives its noise seed from (base_seed, iteration, scenario, stage_id) using SipHash-1-3 (DEC-017 from cobre-stochastic). The opening tree is pre-generated once before training and shared read-only across all iterations.
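The key property is that the seed is a pure function of the tuple, so every rank computes it locally and identically. The sketch below illustrates the idea with std's DefaultHasher; the crate mandates SipHash-1-3 (DEC-017), whereas DefaultHasher's algorithm is unspecified, so this is for illustration only and should not be relied on for cross-version reproducibility:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustrative seed derivation: hash the identifying tuple into a u64 seed.
fn derive_noise_seed(base_seed: u64, iteration: u64, scenario: u32, stage_id: u32) -> u64 {
    let mut h = DefaultHasher::new();
    (base_seed, iteration, scenario, stage_id).hash(&mut h);
    h.finish()
}

fn main() {
    // Deterministic: the same inputs always yield the same seed, so ranks
    // need no communication to agree on forward-pass noise.
    let a = derive_noise_seed(42, 3, 7, 1);
    let b = derive_noise_seed(42, 3, 7, 1);
    assert_eq!(a, b);
}
```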

Testing

cargo test -p cobre-sddp --all-features

The crate requires no external system libraries beyond what is needed by the workspace (HiGHS is always available; MPI is optional via the mpi feature of cobre-comm).

Test suite overview

The crate has tests across 15 source modules covering:

  • Unit tests for each module’s core logic.
  • Integration tests using LocalBackend (single-rank) for the communication-involving modules (forward, backward, cut_sync, state_exchange, lower_bound, training).
  • Doc-tests for all public types and functions with constructible examples.

Feature flags

cobre-sddp has no optional feature flags of its own. Feature flag propagation from cobre-comm (the mpi feature) controls whether MPI-based distributed training is available at link time.

# Cargo.toml
cobre-sddp = { version = "0.0.1" }

cobre-cli

cobre-cli provides the cobre binary: the command-line interface for running SDDP studies, validating input data, and inspecting results. It ties together cobre-io, cobre-stochastic, cobre-solver, cobre-comm, and cobre-sddp into a single executable with a consistent user interface.

Subcommands

| Subcommand | Description |
|---|---|
| cobre run &lt;CASE_DIR&gt; | Load a case, train an SDDP policy, optionally simulate, and write all results |
| cobre validate &lt;CASE_DIR&gt; | Run the 5-layer validation pipeline and print a structured diagnostic report |
| cobre report &lt;RESULTS_DIR&gt; | Read result manifests and print a machine-readable JSON summary to stdout |
| cobre version | Print version, solver backend, communication backend, and build information |
| cobre init &lt;DIRECTORY&gt; | Scaffold a new case directory from an embedded template |

Exit Code Contract

All subcommands map failures to a typed exit code through the CliError type. The mapping is stable across releases:

| Exit Code | Category | Cause |
|---|---|---|
| 0 | Success | Command completed without errors |
| 1 | Validation | Case directory failed validation |
| 2 | I/O | Filesystem error during loading or output |
| 3 | Solver | LP infeasible or numerical solver failure |
| 4 | Internal | Communication failure or unexpected state |

This contract enables cobre run to be driven from shell scripts and batch schedulers by inspecting the process exit code.
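A batch script can dispatch on the contract like this (a sketch; `"$@"` stands in for the actual invocation, e.g. `cobre run my_case`):

```shell
#!/bin/sh
# Map the stable exit-code contract to human-readable outcomes.
run_cobre() {
    "$@"
    case $? in
        0) echo "success" ;;
        1) echo "validation failure" ;;
        2) echo "I/O failure" ;;
        3) echo "solver failure" ;;
        4) echo "internal failure" ;;
        *) echo "unknown exit code" ;;
    esac
}
```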

Output and Terminal Behavior

cobre run writes a progress bar to stderr and a run summary after completion (both suppressed in --quiet mode). Error messages are always written to stderr.

cobre report prints pretty-printed JSON to stdout, suitable for piping to jq.

cobre init

Scaffolds a new case directory from a built-in template. This is the recommended way to start a new study: the template provides a complete, valid case that passes cobre validate out of the box and can be run immediately with cobre run.

Arguments

| Argument | Required | Description |
|---|---|---|
| &lt;DIRECTORY&gt; | Yes (unless --list) | Path where the case directory will be created |

Options

| Option | Description |
|---|---|
| --template &lt;NAME&gt; | Template name to scaffold. Required unless --list is given. |
| --list | List all available templates and exit. Mutually exclusive with --template. |
| --force | Overwrite existing files in the target directory if it is non-empty. |

Available Templates

| Template | Description |
|---|---|
| 1dtoy | Single-bus hydrothermal system: 4 stages, 1 hydro plant, 2 thermals |

Usage Examples

# List all available templates
cobre init --list

# Scaffold the 1dtoy template into a new directory
cobre init --template 1dtoy my_study

# Overwrite an existing directory
cobre init --template 1dtoy my_study --force

After scaffolding, validate and run the case:

cobre validate my_study
cobre run my_study --output my_study/results

Error Behavior

  • Unknown template name: exits with code 1 and lists available templates.
  • Target directory is non-empty and --force is not set: exits with code 2.
  • Write failure: exits with code 2 with the failing path in the error message.

ferrompi

experimental

Safe MPI 4.x bindings for Rust, used by cobre-comm as the MPI communication backend. This is a separate repository at github.com/cobre-rs/ferrompi.

ferrompi provides type-safe wrappers around MPI collective operations (allgatherv, allreduce, broadcast, barrier) with RAII-managed MPI_Init_thread / MPI_Finalize lifecycle. It supports ThreadLevel::Funneled initialization, which matches the Cobre execution model where only the main thread issues MPI calls.

See the ferrompi README and the backend specification for details.

Case Format Reference

A Cobre case directory is a self-contained folder that holds all input data for a single power system study. load_case reads this directory and produces a fully-validated System ready for the solver.

For a description of how these files are parsed and validated, see cobre-io.

JSON Schema files for all JSON input types are available on the Schemas page. Download them for use with your editor’s JSON Schema validation feature.

Directory layout

my_case/
├── config.json                              # Solver configuration (required)
├── penalties.json                           # Global penalty defaults (required)
├── stages.json                              # Stage sequence and policy graph (required)
├── initial_conditions.json                  # Reservoir storage at study start (required)
├── system/
│   ├── buses.json                           # Electrical buses (required)
│   ├── lines.json                           # Transmission lines (required)
│   ├── hydros.json                          # Hydro plants (required)
│   ├── thermals.json                        # Thermal plants (required)
│   ├── non_controllable_sources.json        # Intermittent sources (optional)
│   ├── pumping_stations.json                # Pumping stations (optional)
│   ├── energy_contracts.json                # Bilateral contracts (optional)
│   ├── hydro_geometry.parquet               # Reservoir geometry tables (optional)
│   ├── hydro_production_models.json         # FPHA production function configs (optional)
│   └── fpha_hyperplanes.parquet             # FPHA hyperplane coefficients (optional)
├── scenarios/
│   ├── inflow_history.parquet               # Historical inflow series (optional)
│   ├── inflow_seasonal_stats.parquet        # PAR model seasonal statistics (optional)
│   ├── inflow_ar_coefficients.parquet       # PAR autoregressive coefficients (optional)
│   ├── external_scenarios.parquet           # Pre-generated external scenarios (optional)
│   ├── load_seasonal_stats.parquet          # Load model seasonal statistics (optional)
│   ├── load_factors.json                    # Load scaling factors (optional)
│   └── correlation.json                     # Cross-series correlation model (optional)
└── constraints/
    ├── thermal_bounds.parquet               # Stage-varying thermal bounds (optional)
    ├── hydro_bounds.parquet                 # Stage-varying hydro bounds (optional)
    ├── line_bounds.parquet                  # Stage-varying line bounds (optional)
    ├── pumping_bounds.parquet               # Stage-varying pumping bounds (optional)
    ├── contract_bounds.parquet              # Stage-varying contract bounds (optional)
    ├── exchange_factors.json                # Block exchange factors (optional)
    ├── generic_constraints.json             # User-defined LP constraints (optional)
    ├── generic_constraint_bounds.parquet    # Bounds for generic constraints (optional)
    ├── penalty_overrides_bus.parquet        # Stage-varying bus penalty overrides (optional)
    ├── penalty_overrides_line.parquet       # Stage-varying line penalty overrides (optional)
    ├── penalty_overrides_hydro.parquet      # Stage-varying hydro penalty overrides (optional)
    └── penalty_overrides_ncs.parquet        # Stage-varying NCS penalty overrides (optional)

File summary

| File | Format | Required | Description |
|---|---|---|---|
| config.json | JSON | Yes | Solver configuration |
| penalties.json | JSON | Yes | Global penalty defaults |
| stages.json | JSON | Yes | Stage sequence and policy graph |
| initial_conditions.json | JSON | Yes | Initial reservoir storage |
| system/buses.json | JSON | Yes | Electrical bus registry |
| system/lines.json | JSON | Yes | Transmission line registry |
| system/hydros.json | JSON | Yes | Hydro plant registry |
| system/thermals.json | JSON | Yes | Thermal plant registry |
| system/non_controllable_sources.json | JSON | No | Intermittent source registry |
| system/pumping_stations.json | JSON | No | Pumping station registry |
| system/energy_contracts.json | JSON | No | Bilateral energy contract registry |
| system/hydro_geometry.parquet | Parquet | No | Reservoir geometry elevation tables |
| system/hydro_production_models.json | JSON | No | FPHA production function configs |
| system/fpha_hyperplanes.parquet | Parquet | No | FPHA hyperplane coefficients |
| scenarios/inflow_history.parquet | Parquet | No | Historical inflow time series |
| scenarios/inflow_seasonal_stats.parquet | Parquet | No | PAR model seasonal statistics |
| scenarios/inflow_ar_coefficients.parquet | Parquet | No | PAR autoregressive coefficients |
| scenarios/external_scenarios.parquet | Parquet | No | Pre-generated scenario inflows |
| scenarios/load_seasonal_stats.parquet | Parquet | No | Load model seasonal statistics |
| scenarios/load_factors.json | JSON | No | Load scaling factors per bus/stage |
| scenarios/correlation.json | JSON | No | Cross-series correlation model |
| constraints/thermal_bounds.parquet | Parquet | No | Stage-varying thermal generation bounds |
| constraints/hydro_bounds.parquet | Parquet | No | Stage-varying hydro operational bounds |
| constraints/line_bounds.parquet | Parquet | No | Stage-varying line flow capacity |
| constraints/pumping_bounds.parquet | Parquet | No | Stage-varying pumping flow bounds |
| constraints/contract_bounds.parquet | Parquet | No | Stage-varying contract power bounds |
| constraints/exchange_factors.json | JSON | No | Block exchange factors |
| constraints/generic_constraints.json | JSON | No | User-defined LP constraints |
| constraints/generic_constraint_bounds.parquet | Parquet | No | Generic constraint RHS bounds |
| constraints/penalty_overrides_bus.parquet | Parquet | No | Stage-varying bus excess cost |
| constraints/penalty_overrides_line.parquet | Parquet | No | Stage-varying line exchange cost |
| constraints/penalty_overrides_hydro.parquet | Parquet | No | Stage-varying hydro penalty costs |
| constraints/penalty_overrides_ncs.parquet | Parquet | No | Stage-varying NCS curtailment cost |

Root-level files

config.json

Controls all solver parameters. The training section is required; all other sections are optional and fall back to documented defaults when absent.

Top-level sections:

| Section | Type | Default | Purpose |
|---|---|---|---|
| $schema | string | null | JSON Schema URI for editor validation (ignored during processing) |
| modeling | object | {} | Inflow non-negativity treatment |
| training | object | required | Iteration count, stopping rules, cut selection |
| upper_bound_evaluation | object | {} | Inner approximation upper-bound settings |
| policy | object | fresh mode | Policy directory path and warm-start mode |
| simulation | object | disabled | Post-training simulation settings |
| exports | object | all enabled | Output file selection flags |

modeling section:

| Field | Type | Default | Description |
|---|---|---|---|
| modeling.inflow_non_negativity.method | string | "penalty" | How to handle negative modelled inflows. One of "none", "penalty", "truncation", "truncation_with_penalty" |
| modeling.inflow_non_negativity.penalty_cost | number | 1000.0 | Penalty coefficient when method is "penalty" or "truncation_with_penalty" |

training section (mandatory fields):

| Field | Type | Default | Description |
|---|---|---|---|
| training.forward_passes | integer | required | Number of scenario trajectories per iteration (>= 1) |
| training.stopping_rules | array | required | At least one stopping rule entry; must include an iteration_limit rule |
| training.stopping_mode | string | "any" | How multiple rules combine: "any" (stop when any triggers) or "all" (stop when all trigger) |
| training.enabled | boolean | true | When false, skip training and proceed directly to simulation |
| training.seed | integer or null | null | Random seed for reproducible scenario generation |
| training.cut_formulation | string or null | null | Cut type: "single" or "multi" |

training.stopping_rules entries:

Each entry has a "type" discriminator. Valid types:

| Type | Required fields | Stops when |
|---|---|---|
| iteration_limit | limit: integer | Iteration count reaches limit |
| time_limit | seconds: number | Wall-clock time exceeds seconds |
| bound_stalling | iterations: integer, tolerance: number | Lower bound improvement falls below tolerance over iterations window |
| simulation | replications, period, bound_window, distance_tol, bound_tol | Both policy cost and bound have stabilized |
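Putting several rule types together, a training section might look like this (illustrative values):

```json
{
  "training": {
    "forward_passes": 48,
    "stopping_mode": "any",
    "stopping_rules": [
      { "type": "iteration_limit", "limit": 300 },
      { "type": "time_limit", "seconds": 7200 },
      { "type": "bound_stalling", "iterations": 20, "tolerance": 0.001 }
    ]
  }
}
```

With "stopping_mode": "any", training halts as soon as the first of the three rules triggers; the mandatory iteration_limit entry bounds the run regardless.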

training.cut_selection sub-section:

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | null | Enable cut pruning |
| method | string | null | Pruning method: "level1", "lml1", or "domination" |
| threshold | integer | null | Minimum iterations before first pruning pass |
| check_frequency | integer | null | Iterations between pruning checks |
| cut_activity_tolerance | number | null | Minimum dual multiplier for a cut to count as binding |

upper_bound_evaluation section:

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | null | Enable vertex-based inner approximation |
| initial_iteration | integer | null | First iteration to compute the upper bound |
| interval_iterations | integer | null | Iterations between upper-bound evaluations |
| lipschitz.mode | string | null | Lipschitz constant computation mode: "auto" |
| lipschitz.fallback_value | number | null | Fallback when automatic computation fails |
| lipschitz.scale_factor | number | null | Multiplicative safety margin |

policy section:

| Field | Type | Default | Description |
|---|---|---|---|
| path | string | "./policy" | Directory for policy data (cuts, states, vertices, basis) |
| mode | string | "fresh" | Initialization mode: "fresh", "warm_start", or "resume" |
| validate_compatibility | boolean | true | Verify entity and dimension compatibility when loading a stored policy |
| checkpointing.enabled | boolean | null | Enable periodic checkpointing |
| checkpointing.initial_iteration | integer | null | First iteration to write a checkpoint |
| checkpointing.interval_iterations | integer | null | Iterations between checkpoints |
| checkpointing.store_basis | boolean | null | Include LP basis in checkpoints |
| checkpointing.compress | boolean | null | Compress checkpoint files |

simulation section:

| Field | Type | Default | Description |
|---|---|---|---|
| enabled | boolean | false | Enable post-training simulation |
| num_scenarios | integer | 2000 | Number of simulation scenarios |
| policy_type | string | "outer" | Policy representation: "outer" (cuts) or "inner" (vertices) |
| output_path | string or null | null | Directory for simulation output files |
| output_mode | string or null | null | Output mode: "streaming" or "batched" |
| io_channel_capacity | integer | 64 | Channel capacity between simulation and I/O writer threads |
| sampling_scheme.type | string | "in_sample" | Scenario scheme: "in_sample", "out_of_sample", or "external" |

exports section:

| Field | Type | Default | Description |
|---|---|---|---|
| training | boolean | true | Export training summary metrics |
| cuts | boolean | true | Export cut pool (outer approximation) |
| states | boolean | true | Export visited states |
| vertices | boolean | true | Export inner approximation vertices |
| simulation | boolean | true | Export simulation results |
| forward_detail | boolean | false | Export per-scenario forward-pass detail |
| backward_detail | boolean | false | Export per-scenario backward-pass detail |
| compression | string or null | null | Output compression: "zstd", "lz4", or "none" |

Minimal valid example:

{
  "$schema": "https://cobre-rs.github.io/cobre/schemas/config.schema.json",
  "training": {
    "forward_passes": 192,
    "stopping_rules": [{ "type": "iteration_limit", "limit": 200 }]
  }
}

penalties.json

Global penalty cost defaults used when no entity-level override is present. All four sections are required. Every scalar cost must be strictly positive (> 0.0). Deficit segment costs must be monotonically increasing and the last segment must have depth_mw: null (unbounded).

| Section | Field | Type | Description |
|---|---|---|---|
| bus | deficit_segments | array | Piecewise-linear deficit cost tiers |
| bus | deficit_segments[].depth_mw | number or null | Segment depth (MW); null for the final unbounded segment |
| bus | deficit_segments[].cost | number | Cost per MWh of deficit in this tier (USD/MWh) |
| bus | excess_cost | number | Cost per MWh of excess injection (USD/MWh) |
| line | exchange_cost | number | Cost per MWh of inter-bus exchange flow (USD/MWh) |
| hydro | spillage_cost | number | Spillage penalty |
| hydro | fpha_turbined_cost | number | FPHA turbined flow violation penalty |
| hydro | diversion_cost | number | Diversion flow penalty |
| hydro | storage_violation_below_cost | number | Storage below-minimum violation penalty |
| hydro | filling_target_violation_cost | number | Filling target violation penalty |
| hydro | turbined_violation_below_cost | number | Turbined flow below-minimum violation penalty |
| hydro | outflow_violation_below_cost | number | Total outflow below-minimum violation penalty |
| hydro | outflow_violation_above_cost | number | Total outflow above-maximum violation penalty |
| hydro | generation_violation_below_cost | number | Generation below-minimum violation penalty |
| hydro | evaporation_violation_cost | number | Evaporation violation penalty |
| hydro | water_withdrawal_violation_cost | number | Water withdrawal violation penalty |
| non_controllable_source | curtailment_cost | number | Curtailment penalty (USD/MWh) |

Example:

```json
{
  "$schema": "https://cobre-rs.github.io/cobre/schemas/penalties.schema.json",
  "bus": {
    "deficit_segments": [
      { "depth_mw": 500.0, "cost": 1000.0 },
      { "depth_mw": null, "cost": 5000.0 }
    ],
    "excess_cost": 100.0
  },
  "line": { "exchange_cost": 2.0 },
  "hydro": {
    "spillage_cost": 0.01,
    "fpha_turbined_cost": 0.05,
    "diversion_cost": 0.1,
    "storage_violation_below_cost": 10000.0,
    "filling_target_violation_cost": 50000.0,
    "turbined_violation_below_cost": 500.0,
    "outflow_violation_below_cost": 500.0,
    "outflow_violation_above_cost": 500.0,
    "generation_violation_below_cost": 1000.0,
    "evaporation_violation_cost": 5000.0,
    "water_withdrawal_violation_cost": 1000.0
  },
  "non_controllable_source": { "curtailment_cost": 0.005 }
}
```
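The deficit-segment rules (strictly positive costs, monotonically increasing tiers, exactly one unbounded final segment) can be sketched as a standalone validator. This is a hedged illustration; the function name and error messages are hypothetical, not part of the cobre API:

```python
def validate_deficit_segments(segments):
    """Check the deficit_segments rules described above (illustrative sketch)."""
    if not segments:
        raise ValueError("at least one deficit segment is required")
    for i, seg in enumerate(segments):
        if seg["cost"] <= 0.0:
            raise ValueError(f"segment {i}: cost must be strictly positive")
        is_last = i == len(segments) - 1
        if is_last and seg["depth_mw"] is not None:
            raise ValueError("last segment must have depth_mw = null (unbounded)")
        if not is_last and seg["depth_mw"] is None:
            raise ValueError(f"segment {i}: only the last segment may be unbounded")
        if i > 0 and seg["cost"] <= segments[i - 1]["cost"]:
            raise ValueError(f"segment {i}: costs must be increasing")

# The example above passes:
validate_deficit_segments([
    {"depth_mw": 500.0, "cost": 1000.0},
    {"depth_mw": None, "cost": 5000.0},
])
```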

stages.json

Defines the temporal structure of the study: stage sequence, block decomposition, policy graph horizon type, and scenario source configuration.

Top-level fields:

| Field | Required | Description |
|---|---|---|
| policy_graph | Yes | Horizon type ("finite_horizon"), annual discount rate, and stage transitions |
| stages | Yes | Array of study stage definitions |
| scenario_source | No | Top-level sampling scheme and seed |
| season_definitions | No | Season labeling for seasonal model alignment |
| pre_study_stages | No | Pre-study stages for AR model warm-up (negative IDs) |

stages[] entry fields:

| Field | Required | Description |
|---|---|---|
| id | Yes | Stage identifier (non-negative integer, unique) |
| start_date | Yes | ISO 8601 date (e.g., "2024-01-01") |
| end_date | Yes | ISO 8601 date; must be after start_date |
| blocks | Yes | Array of load blocks (id, name, hours) |
| num_scenarios | Yes | Number of forward-pass scenarios for this stage (>= 1) |
| season_id | No | Reference to a season in season_definitions |
| block_mode | No | Block execution mode: "parallel" (default) or "sequential" |
| state_variables | No | Which state variables are active: storage, inflow_lags |
| risk_measure | No | Per-stage risk measure: "expectation" or CVaR config |
| sampling_method | No | Noise method: "saa" or other variants |

initial_conditions.json

Initial reservoir storage values at the start of the study.

| Field | Required | Description |
|---|---|---|
| storage | Yes | Array of { "hydro_id": integer, "value_hm3": number } entries for operating hydros |
| filling_storage | Yes | Array of { "hydro_id": integer, "value_hm3": number } entries for filling hydros |

Each hydro_id must be unique within its array and must not appear in both arrays. All value_hm3 values must be non-negative.
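These uniqueness, disjointness, and sign rules can be sketched as a standalone check. A hedged illustration only; the function name and error messages are hypothetical:

```python
def check_initial_conditions(doc):
    """Validate the initial_conditions.json rules described above (sketch)."""
    storage_ids = [e["hydro_id"] for e in doc["storage"]]
    filling_ids = [e["hydro_id"] for e in doc["filling_storage"]]
    # hydro_id unique within each array
    if len(set(storage_ids)) != len(storage_ids) or len(set(filling_ids)) != len(filling_ids):
        raise ValueError("duplicate hydro_id within an array")
    # no hydro_id in both arrays
    if set(storage_ids) & set(filling_ids):
        raise ValueError("hydro_id appears in both storage and filling_storage")
    # all values non-negative
    if any(e["value_hm3"] < 0.0 for e in doc["storage"] + doc["filling_storage"]):
        raise ValueError("value_hm3 must be non-negative")

check_initial_conditions({
    "storage": [{"hydro_id": 1, "value_hm3": 1200.0}],
    "filling_storage": [{"hydro_id": 2, "value_hm3": 0.0}],
})  # passes
```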


system/ files

system/buses.json

Electrical bus registry. Buses are the nodes of the transmission network.

| Field | Required | Description |
|---|---|---|
| buses[].id | Yes | Bus identifier (integer, unique) |
| buses[].name | Yes | Human-readable bus name (string) |
| buses[].deficit_segments | No | Entity-level deficit cost tiers; when absent, global defaults from penalties.json apply |
| buses[].deficit_segments[].depth_mw | No | Segment MW depth; null for the final unbounded segment |
| buses[].deficit_segments[].cost | No | Cost per MWh of deficit in this tier (USD/MWh) |

system/lines.json

Transmission line registry. Lines connect buses and carry power flows.

| Field | Required | Description |
|---|---|---|
| lines[].id | Yes | Line identifier (integer, unique) |
| lines[].name | Yes | Human-readable line name (string) |
| lines[].source_bus_id | Yes | Sending-end bus ID |
| lines[].target_bus_id | Yes | Receiving-end bus ID |
| lines[].direct_mw | Yes | Maximum power flow in the direct direction (MW) |
| lines[].reverse_mw | Yes | Maximum power flow in the reverse direction (MW) |

system/hydros.json

Hydro plant registry. Each entry defines a complete hydro plant with reservoir, turbine, and optional cascade linkage.

Key fields:

| Field | Required | Description |
|---|---|---|
| hydros[].id | Yes | Plant identifier (integer, unique) |
| hydros[].name | Yes | Human-readable plant name |
| hydros[].bus_id | Yes | Bus where generation is injected |
| hydros[].downstream_id | No | Downstream plant ID in the cascade; null = tailwater |
| hydros[].reservoir | Yes | min_storage_hm3 and max_storage_hm3 (both >= 0) |
| hydros[].outflow | Yes | min_outflow_m3s and max_outflow_m3s total outflow bounds |
| hydros[].generation | Yes | Generation model: model, turbine flow bounds, generation MW bounds |
| hydros[].generation.model | Yes | Currently: "constant_productivity" |
| hydros[].generation.productivity_mw_per_m3s | Yes (for constant) | Turbine productivity factor |
| hydros[].penalties | No | Entity-level hydro penalty overrides |

system/thermals.json

Thermal plant registry. Each entry defines a dispatchable generation unit.

| Field | Required | Description |
|---|---|---|
| thermals[].id | Yes | Plant identifier (integer, unique) |
| thermals[].name | Yes | Human-readable plant name |
| thermals[].bus_id | Yes | Bus where generation is injected |
| thermals[].min_generation_mw | Yes | Minimum dispatch level (MW) |
| thermals[].max_generation_mw | Yes | Maximum dispatch level (MW) |
| thermals[].cost_per_mwh | Yes | Linear generation cost (USD/MWh) |

scenarios/ files (Parquet)

scenarios/inflow_seasonal_stats.parquet

PAR(p) model seasonal statistics for each (hydro plant, stage) pair.

| Column | Type | Required | Description |
|---|---|---|---|
| hydro_id | INT32 | Yes | Hydro plant ID |
| stage_id | INT32 | Yes | Stage ID |
| mean_m3s | DOUBLE | Yes | Seasonal mean inflow (m³/s); must be finite |
| std_m3s | DOUBLE | Yes | Seasonal standard deviation (m³/s); must be >= 0 and finite |
| ar_order | INT32 | Yes | AR model order (number of lags); must be >= 0 |

scenarios/inflow_ar_coefficients.parquet

Autoregressive coefficients for the PAR(p) inflow model.

| Column | Type | Required | Description |
|---|---|---|---|
| hydro_id | INT32 | Yes | Hydro plant ID |
| stage_id | INT32 | Yes | Stage ID |
| lag | INT32 | Yes | Lag index (1-based) |
| coefficient | DOUBLE | Yes | AR coefficient for this (hydro, stage, lag) |

constraints/ files (Parquet)

All bounds Parquet files use sparse storage: only (entity_id, stage_id) pairs that differ from the base entity-level value need rows. Absent rows use the entity-level value unchanged.
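The sparse-override lookup amounts to a keyed fallback: an override row for an (entity_id, stage_id) pair replaces the entity-level base value, and absent pairs keep the base. A minimal sketch, with illustrative names and data:

```python
def resolve_bound(base_value, overrides, entity_id, stage_id):
    """Return the stage override when a row exists, else the entity-level base."""
    # overrides: dict mapping (entity_id, stage_id) -> override value
    return overrides.get((entity_id, stage_id), base_value)

# Hypothetical example: thermal 3 has max_generation_mw = 500.0 at entity
# level, overridden to 450.0 at stage 12 only.
overrides = {(3, 12): 450.0}
assert resolve_bound(500.0, overrides, 3, 12) == 450.0  # override row present
assert resolve_bound(500.0, overrides, 3, 13) == 500.0  # absent row -> base value
```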

constraints/thermal_bounds.parquet

Stage-varying generation bound overrides for thermal plants.

| Column | Type | Required | Description |
|---|---|---|---|
| thermal_id | INT32 | Yes | Thermal plant ID |
| stage_id | INT32 | Yes | Stage ID |
| min_generation_mw | DOUBLE | No | Minimum generation override (MW) |
| max_generation_mw | DOUBLE | No | Maximum generation override (MW) |

constraints/hydro_bounds.parquet

Stage-varying operational bound overrides for hydro plants.

| Column | Type | Required | Description |
|---|---|---|---|
| hydro_id | INT32 | Yes | Hydro plant ID |
| stage_id | INT32 | Yes | Stage ID |
| min_turbined_m3s | DOUBLE | No | Minimum turbined flow (m³/s) |
| max_turbined_m3s | DOUBLE | No | Maximum turbined flow (m³/s) |
| min_storage_hm3 | DOUBLE | No | Minimum reservoir storage (hm³) |
| max_storage_hm3 | DOUBLE | No | Maximum reservoir storage (hm³) |
| min_outflow_m3s | DOUBLE | No | Minimum total outflow (m³/s) |
| max_outflow_m3s | DOUBLE | No | Maximum total outflow (m³/s) |
| min_generation_mw | DOUBLE | No | Minimum generation (MW) |
| max_generation_mw | DOUBLE | No | Maximum generation (MW) |
| max_diversion_m3s | DOUBLE | No | Maximum diversion flow (m³/s) |
| filling_inflow_m3s | DOUBLE | No | Filling inflow override (m³/s) |
| water_withdrawal_m3s | DOUBLE | No | Water withdrawal (m³/s) |

constraints/line_bounds.parquet

Stage-varying flow capacity overrides for transmission lines.

| Column | Type | Required | Description |
|---|---|---|---|
| line_id | INT32 | Yes | Transmission line ID |
| stage_id | INT32 | Yes | Stage ID |
| direct_mw | DOUBLE | No | Direct-flow capacity override (MW) |
| reverse_mw | DOUBLE | No | Reverse-flow capacity override (MW) |

constraints/pumping_bounds.parquet

Stage-varying flow bounds for pumping stations.

| Column | Type | Required | Description |
|---|---|---|---|
| station_id | INT32 | Yes | Pumping station ID |
| stage_id | INT32 | Yes | Stage ID |
| min_m3s | DOUBLE | No | Minimum pumping flow (m³/s) |
| max_m3s | DOUBLE | No | Maximum pumping flow (m³/s) |

constraints/contract_bounds.parquet

Stage-varying power and price overrides for energy contracts.

| Column | Type | Required | Description |
|---|---|---|---|
| contract_id | INT32 | Yes | Energy contract ID |
| stage_id | INT32 | Yes | Stage ID |
| min_mw | DOUBLE | No | Minimum power (MW) |
| max_mw | DOUBLE | No | Maximum power (MW) |
| price_per_mwh | DOUBLE | No | Price override (USD/MWh) |

Penalty override files

All penalty override files use sparse storage. Only rows for (entity_id, stage_id) pairs where the penalty differs from the entity-level or global default are required. All penalty values must be strictly positive (> 0.0) and finite.

constraints/penalty_overrides_bus.parquet

| Column | Type | Required | Description |
|---|---|---|---|
| bus_id | INT32 | Yes | Bus ID |
| stage_id | INT32 | Yes | Stage ID |
| excess_cost | DOUBLE | No | Excess injection cost override (USD/MWh) |

Note: Bus deficit segments are not stage-varying. Only excess_cost can be overridden per stage for buses.


constraints/penalty_overrides_line.parquet

| Column | Type | Required | Description |
|---|---|---|---|
| line_id | INT32 | Yes | Transmission line ID |
| stage_id | INT32 | Yes | Stage ID |
| exchange_cost | DOUBLE | No | Exchange flow cost override (USD/MWh) |

constraints/penalty_overrides_hydro.parquet

| Column | Type | Required | Description |
|---|---|---|---|
| hydro_id | INT32 | Yes | Hydro plant ID |
| stage_id | INT32 | Yes | Stage ID |
| spillage_cost | DOUBLE | No | Spillage penalty override |
| fpha_turbined_cost | DOUBLE | No | FPHA turbined flow violation override |
| diversion_cost | DOUBLE | No | Diversion penalty override |
| storage_violation_below_cost | DOUBLE | No | Storage below-minimum violation override |
| filling_target_violation_cost | DOUBLE | No | Filling target violation override |
| turbined_violation_below_cost | DOUBLE | No | Turbined below-minimum violation override |
| outflow_violation_below_cost | DOUBLE | No | Outflow below-minimum violation override |
| outflow_violation_above_cost | DOUBLE | No | Outflow above-maximum violation override |
| generation_violation_below_cost | DOUBLE | No | Generation below-minimum violation override |
| evaporation_violation_cost | DOUBLE | No | Evaporation violation override |
| water_withdrawal_violation_cost | DOUBLE | No | Water withdrawal violation override |

constraints/penalty_overrides_ncs.parquet

| Column | Type | Required | Description |
|---|---|---|---|
| source_id | INT32 | Yes | Non-controllable source ID |
| stage_id | INT32 | Yes | Stage ID |
| curtailment_cost | DOUBLE | No | Curtailment penalty override (USD/MWh) |

Output Format Reference

This page is the exhaustive schema reference for every file produced by cobre run. It documents column names, Arrow data types, nullability, JSON field structures, and binary format layouts for all 10 Parquet schemas, the two manifest types, the training metadata file, the five dictionary files, and the policy checkpoint format.

If you are new to Cobre output, start with Understanding Results first. That page explains what each file means conceptually and shows how to read results programmatically. This page is for readers who need the precise schema definition — for writing parsers, building dashboards, or implementing compatibility checks.


Output Directory Tree

A complete cobre run produces the following directory structure. Not every entity directory appears in every run: cobre run only writes directories for entity types present in the case. For example, a case with no pumping stations will not produce simulation/pumping_stations/.

```text
<output_dir>/
  training/
    _manifest.json
    metadata.json
    convergence.parquet
    dictionaries/
      codes.json
      entities.csv
      variables.csv
      bounds.parquet
      state_dictionary.json
    timing/
      iterations.parquet
      mpi_ranks.parquet
  policy/
    cuts/
      stage_000.bin
      stage_001.bin
      ...
      stage_NNN.bin
    basis/
      stage_000.bin
      stage_001.bin
      ...
      stage_NNN.bin
    metadata.json
  simulation/
    _manifest.json
    costs/
      scenario_id=0000/
        data.parquet
      scenario_id=0001/
        data.parquet
      ...
    hydros/
      scenario_id=0000/data.parquet
      ...
    thermals/
      scenario_id=0000/data.parquet
      ...
    exchanges/
      scenario_id=0000/data.parquet
      ...
    buses/
      scenario_id=0000/data.parquet
      ...
    pumping_stations/
      scenario_id=0000/data.parquet
      ...
    contracts/
      scenario_id=0000/data.parquet
      ...
    non_controllables/
      scenario_id=0000/data.parquet
      ...
    inflow_lags/
      scenario_id=0000/data.parquet
      ...
    violations/
      generic/
        scenario_id=0000/data.parquet
        ...
```

Training Output

training/_manifest.json

The training manifest is written atomically at the end of the training run (and updated on each checkpoint if checkpointing is enabled). Consumers should read status before interpreting any other field.

JSON structure:

```json
{
  "version": "2.0.0",
  "status": "complete",
  "started_at": "2026-01-17T08:00:00Z",
  "completed_at": "2026-01-17T12:30:00Z",
  "iterations": {
    "max_iterations": 200,
    "completed": 128,
    "converged_at": null
  },
  "convergence": {
    "achieved": false,
    "final_gap_percent": 0.45,
    "termination_reason": "iteration_limit"
  },
  "cuts": {
    "total_generated": 1250000,
    "total_active": 980000,
    "peak_active": 1100000
  },
  "checksum": null,
  "mpi_info": {
    "world_size": 1,
    "ranks_participated": 1
  }
}
```

Field reference:

| Field | Type | Nullable | Description |
|---|---|---|---|
| version | string | No | Manifest schema version. Current value: "2.0.0". |
| status | string | No | Run status: "running", "complete", "failed", or "converged". |
| started_at | string | Yes | ISO 8601 timestamp when training started. null in minimal viable version. |
| completed_at | string | Yes | ISO 8601 timestamp when training finished. null while running. |
| iterations.max_iterations | integer | Yes | Maximum iterations allowed by the iteration-limit stopping rule. null if no limit was configured. |
| iterations.completed | integer | No | Number of training iterations that finished. |
| iterations.converged_at | integer | Yes | Iteration number at which a convergence stopping rule triggered termination. null if training was terminated by a safety limit (e.g. iteration limit). |
| convergence.achieved | boolean | No | true if a convergence-oriented stopping rule terminated the run. |
| convergence.final_gap_percent | number | Yes | Optimality gap between lower and upper bounds at termination, expressed as a percentage. null when upper bound evaluation is disabled. |
| convergence.termination_reason | string | No | Machine-readable termination label. Common values: "iteration_limit", "bound_stalling". |
| cuts.total_generated | integer | No | Total Benders cuts generated across all stages and iterations. |
| cuts.total_active | integer | No | Cuts still active in the pool at termination. |
| cuts.peak_active | integer | No | Maximum number of simultaneously active cuts at any point during training. |
| checksum | object | Yes | Integrity checksum over policy and convergence files. null in current release (deferred). |
| mpi_info.world_size | integer | No | Total number of MPI ranks. 1 for single-process runs. |
| mpi_info.ranks_participated | integer | No | Number of MPI ranks that wrote data. |
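A minimal consumer that follows the read-status-first rule might look like this. The helper name and error handling are illustrative, not part of cobre:

```python
import json

def load_manifest(path):
    """Load a training manifest and refuse to proceed on unfinished runs (sketch)."""
    with open(path) as f:
        manifest = json.load(f)
    # Check status before interpreting any other field, as the format requires.
    if manifest["status"] not in ("complete", "converged"):
        raise RuntimeError(f"run not finished: status = {manifest['status']!r}")
    return manifest
```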

training/metadata.json

The metadata file captures the configuration snapshot, problem dimensions, performance summary, data integrity hashes, and runtime environment for reproducibility and audit purposes. Fields marked “deferred” are null in the current release and will be populated in a future minor version.

Top-level structure:

```json
{
  "version": "2.0.0",
  "run_info": { ... },
  "configuration_snapshot": { ... },
  "problem_dimensions": { ... },
  "performance_summary": null,
  "data_integrity": null,
  "environment": { ... }
}
```

run_info fields:

| Field | Type | Nullable | Description |
|---|---|---|---|
| run_id | string | No | Unique run identifier. Placeholder value in current release. |
| started_at | string | Yes | ISO 8601 start timestamp. |
| completed_at | string | Yes | ISO 8601 completion timestamp. |
| duration_seconds | number | Yes | Total run duration in seconds. |
| cobre_version | string | No | Version of the cobre binary that produced this output (from CARGO_PKG_VERSION). |
| solver | string | Yes | LP solver backend identifier (e.g. "highs"). |
| solver_version | string | Yes | LP solver library version string. |
| hostname | string | Yes | Primary compute node hostname. null in current release. |
| user | string | Yes | Username that initiated the run. null in current release. |

configuration_snapshot fields:

| Field | Type | Nullable | Description |
|---|---|---|---|
| seed | integer | Yes | Random seed used for scenario generation. |
| forward_passes | integer | Yes | Number of forward-pass scenario trajectories per iteration. |
| stopping_mode | string | No | How multiple stopping rules combine: "any" or "all". |
| policy_mode | string | No | Policy warm-start mode: "fresh" or "resume". |

problem_dimensions fields:

| Field | Type | Nullable | Description |
|---|---|---|---|
| num_stages | integer | No | Number of stages in the planning horizon. |
| num_hydros | integer | No | Total number of hydro plants. |
| num_thermals | integer | No | Total number of thermal plants. |
| num_buses | integer | No | Total number of buses. |
| num_lines | integer | No | Total number of transmission lines. |

performance_summary: Deferred. Always null in the current release. Will contain total_lp_solves, avg_lp_time_us, median_lp_time_us, p99_lp_time_us, and peak_memory_mb when implemented.

data_integrity: Deferred. Always null in the current release. Will contain SHA-256 hashes of input files, config, policy, and convergence data when implemented.

environment fields:

| Field | Type | Nullable | Description |
|---|---|---|---|
| mpi_implementation | string | Yes | MPI implementation name (e.g. "OpenMPI"). null in current release. |
| mpi_version | string | Yes | MPI library version. null in current release. |
| num_ranks | integer | Yes | Number of MPI ranks. null in current release. |
| cpus_per_rank | integer | Yes | CPU cores per rank. null in current release. |
| memory_per_rank_gb | number | Yes | Memory per rank in gigabytes. null in current release. |

training/convergence.parquet

Per-iteration convergence log. One row per training iteration. 13 columns.

| Column | Type | Nullable | Description |
|---|---|---|---|
| iteration | Int32 | No | Training iteration number (1-based). |
| lower_bound | Float64 | No | Best proven lower bound on the minimum expected cost after this iteration. |
| upper_bound_mean | Float64 | No | Mean upper bound estimate from the forward-pass scenarios in this iteration. |
| upper_bound_std | Float64 | No | Standard deviation of the upper bound estimate across forward-pass scenarios. |
| gap_percent | Float64 | Yes | Relative gap between lower and upper bounds as a percentage. null when the lower bound is zero or negative. |
| cuts_added | Int32 | No | Number of new cuts added to the pool during this iteration's backward pass. |
| cuts_removed | Int32 | No | Number of cuts deactivated by the cut selection strategy in this iteration. |
| cuts_active | Int64 | No | Total number of active cuts across all stages at the end of this iteration. |
| time_forward_ms | Int64 | No | Wall-clock time spent in the forward pass, in milliseconds. |
| time_backward_ms | Int64 | No | Wall-clock time spent in the backward pass, in milliseconds. |
| time_total_ms | Int64 | No | Total wall-clock time for this iteration, in milliseconds. |
| forward_passes | Int32 | No | Number of forward-pass scenario trajectories evaluated in this iteration. |
| lp_solves | Int64 | No | Total number of LP solves across all stages and forward passes in this iteration. |
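One common definition of the relative gap, consistent with the nullability rule above (null when the lower bound is zero or negative), is sketched below. The exact formula is an assumption, not taken from the cobre source:

```python
def gap_percent(lower_bound, upper_bound_mean):
    """Relative gap as a percentage of the lower bound (assumed formula)."""
    # Undefined when the lower bound is zero or negative, matching the
    # column's nullability rule.
    if lower_bound <= 0.0:
        return None
    return 100.0 * (upper_bound_mean - lower_bound) / lower_bound
```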

training/timing/iterations.parquet

Per-iteration wall-clock timing breakdown by phase. One row per training iteration. 10 columns. All columns are non-nullable.

| Column | Type | Nullable | Description |
|---|---|---|---|
| iteration | Int32 | No | Training iteration number (1-based). |
| forward_solve_ms | Int64 | No | Time spent solving LPs during the forward pass. |
| forward_sample_ms | Int64 | No | Time spent sampling scenarios and computing inflows during the forward pass. |
| backward_solve_ms | Int64 | No | Time spent solving LPs during the backward pass. |
| backward_cut_ms | Int64 | No | Time spent constructing and adding Benders cuts during the backward pass. |
| cut_selection_ms | Int64 | No | Time spent running the cut selection strategy. |
| mpi_allreduce_ms | Int64 | No | Time spent in MPI allreduce operations (cut coefficient aggregation). |
| mpi_broadcast_ms | Int64 | No | Time spent in MPI broadcast operations (cut distribution). |
| io_write_ms | Int64 | No | Time spent writing Parquet and JSON files. |
| overhead_ms | Int64 | No | Remaining wall-clock time not attributed to the above phases. |

training/timing/mpi_ranks.parquet

Per-iteration, per-rank timing statistics for distributed runs. One row per (iteration, rank) pair. 8 columns. All columns are non-nullable.

| Column | Type | Nullable | Description |
|---|---|---|---|
| iteration | Int32 | No | Training iteration number (1-based). |
| rank | Int32 | No | MPI rank index (0-based). |
| forward_time_ms | Int64 | No | Wall-clock time this rank spent in the forward pass. |
| backward_time_ms | Int64 | No | Wall-clock time this rank spent in the backward pass. |
| communication_time_ms | Int64 | No | Wall-clock time this rank spent in MPI communication. |
| idle_time_ms | Int64 | No | Wall-clock time this rank was idle (waiting for other ranks). |
| lp_solves | Int64 | No | Number of LP solves performed by this rank in this iteration. |
| scenarios_processed | Int32 | No | Number of scenario trajectories processed by this rank. |

training/dictionaries/

Five self-documenting files that allow output Parquet files to be interpreted without reference to the original input case. All files are written atomically.

codes.json

Static mapping from integer codes to human-readable labels for all categorical fields used in Parquet output. The same mapping applies for the lifetime of a release (the version field tracks breaking changes).

```json
{
  "version": "1.0",
  "generated_at": "2026-01-17T08:00:00Z",
  "operative_state": {
    "0": "deactivated",
    "1": "maintenance",
    "2": "operating",
    "3": "saturated"
  },
  "storage_binding": {
    "0": "none",
    "1": "below_minimum",
    "2": "above_maximum",
    "3": "both"
  },
  "contract_type": {
    "0": "import",
    "1": "export"
  },
  "entity_type": {
    "0": "hydro",
    "1": "thermal",
    "2": "bus",
    "3": "line",
    "4": "pumping_station",
    "5": "contract",
    "7": "non_controllable"
  },
  "bound_type": {
    "0": "storage_min",
    "1": "storage_max",
    "2": "turbined_min",
    "3": "turbined_max",
    "4": "outflow_min",
    "5": "outflow_max",
    "6": "generation_min",
    "7": "generation_max",
    "8": "flow_min",
    "9": "flow_max"
  }
}
```
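Decoding categorical columns then reduces to a dictionary lookup, remembering that JSON object keys are strings. A sketch with the mapping inlined for self-containment; in practice you would `json.load` the codes.json file:

```python
# Subset of codes.json inlined for illustration.
CODES = {
    "operative_state": {"0": "deactivated", "1": "maintenance",
                        "2": "operating", "3": "saturated"},
    "storage_binding": {"0": "none", "1": "below_minimum",
                        "2": "above_maximum", "3": "both"},
}

def decode(mapping_name, code):
    """Map an integer code from a Parquet column to its label."""
    # JSON keys are strings, so stringify the integer code before lookup.
    return CODES[mapping_name][str(code)]

assert decode("operative_state", 2) == "operating"
assert decode("storage_binding", 0) == "none"
```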

entities.csv

One row per entity across all entity types. Columns:

| Column | Description |
|---|---|
| entity_type_code | Integer entity type code (see codes.json entity_type mapping). |
| entity_id | Integer entity ID matching the *_id column in the corresponding simulation Parquet file. |
| name | Human-readable entity name from the case input files. |
| bus_id | Integer bus ID to which this entity is connected. For buses, equals entity_id. |
| system_id | System partition index. Always 0 in the current release (single-system cases). |

Rows are ordered by entity_type_code ascending, then by entity_id ascending within each type.

variables.csv

One row per output column across all Parquet schemas. Documents every column name, its parent schema, and its unit of measure. Useful for building generic result readers that do not hard-code column names.

| Column | Description |
|---|---|
| schema | Name of the Parquet schema this column belongs to (e.g. "hydros", "costs"). |
| column_name | Exact column name as it appears in the Parquet file. |
| arrow_type | Arrow data type string (e.g. "Int32", "Float64", "Boolean"). |
| nullable | "true" or "false". |
| unit | Physical unit, or "code" for categorical fields, "boolean" for flag fields, "id" for identifiers, "dimensionless" for pure ratios. |
| description | Short description of the column's meaning. |

bounds.parquet

Per-entity, per-stage resolved LP variable bounds. Documents the actual numerical bounds used in each LP solve, after applying the three-tier override resolution (global defaults, then entity-level values, then stage overrides).

| Column | Type | Nullable | Description |
|---|---|---|---|
| entity_type_code | Int8 | No | Entity type code (see codes.json). |
| entity_id | Int32 | No | Entity ID. |
| stage_id | Int32 | No | Stage index (0-based). |
| bound_type_code | Int8 | No | Bound type code (see codes.json bound_type mapping). |
| lower_bound | Float64 | No | Resolved lower bound value in the bound's natural unit. |
| upper_bound | Float64 | No | Resolved upper bound value in the bound's natural unit. |

state_dictionary.json

Describes the state space structure used by the algorithm: which entities have state variables, how many state dimensions they contribute, and what units apply. Useful for interpreting cut coefficient vectors in the policy checkpoint.

```json
{
  "version": "1.0",
  "state_dimension": 164,
  "storage_states": [
    { "hydro_id": 0, "dimension_index": 0, "unit": "hm3" },
    { "hydro_id": 1, "dimension_index": 1, "unit": "hm3" }
  ],
  "inflow_lag_states": [
    { "hydro_id": 0, "lag_index": 1, "dimension_index": 2, "unit": "m3s" }
  ]
}
```

| Field | Description |
|---|---|
| state_dimension | Total number of state variables. Equals the length of each cut's coefficient vector in the policy checkpoint. |
| storage_states | One entry per hydro plant that contributes a reservoir storage state variable. |
| storage_states[].hydro_id | Hydro plant ID. |
| storage_states[].dimension_index | 0-based index of this state variable in the coefficient vector. |
| storage_states[].unit | Physical unit: always "hm3" (cubic hectometres). |
| inflow_lag_states | One entry per (hydro, lag) pair that contributes an inflow lag state variable. |
| inflow_lag_states[].hydro_id | Hydro plant ID. |
| inflow_lag_states[].lag_index | Autoregressive lag order (1-based). |
| inflow_lag_states[].dimension_index | 0-based index in the coefficient vector. |
| inflow_lag_states[].unit | Physical unit: always "m3s" (cubic metres per second). |

Policy Checkpoint

policy/cuts/stage_NNN.bin

FlatBuffers binary file encoding all cuts for a single stage. One file per stage; file names are zero-padded to three digits (e.g. stage_000.bin, stage_012.bin).

The binary is not human-readable. The logical record structure for each cut contained in the file is:

| Field | Type | Description |
|---|---|---|
| cut_id | uint64 | Unique identifier for this cut across all iterations. Assigned monotonically by the training loop. |
| slot_index | uint32 | LP row position. Required for checkpoint reproducibility and basis warm-starting. |
| iteration | uint32 | Training iteration that generated this cut. |
| forward_pass_index | uint32 | Forward pass index within the generating iteration. |
| intercept | float64 | Pre-computed cut intercept: alpha - beta' * x_hat, where x_hat is the state at the generating forward-pass node. |
| coefficients | float64[] | Gradient coefficient vector. Length equals state_dimension from state_dictionary.json. |
| is_active | bool | Whether this cut is currently active in the LP. Inactive cuts are retained for potential reactivation by the cut selection strategy. |
| domination_count | uint32 | Cut selection bookkeeping counter. Number of times this cut has been dominated without being selected. |

The encoding uses the FlatBuffers runtime builder API (little-endian, no reflection, no generated code). Field order in the binary matches the declaration order above.
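Given the pre-computed intercept and the coefficient vector, evaluating a stored cut at a state vector x is a single affine expression: intercept plus the dot product of coefficients and x. A minimal sketch (the function name is illustrative):

```python
def evaluate_cut(intercept, coefficients, x):
    """Value of one Benders cut at state x: intercept + coefficients . x."""
    # len(coefficients) must equal state_dimension from state_dictionary.json,
    # and x must be ordered by dimension_index.
    assert len(coefficients) == len(x)
    return intercept + sum(c * xi for c, xi in zip(coefficients, x))

# Toy 2-dimensional state: 10 + 2*3 + (-1)*4 = 12
assert evaluate_cut(10.0, [2.0, -1.0], [3.0, 4.0]) == 12.0
```

The future cost approximation at x is the maximum of this expression over all active cuts for the stage.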

policy/basis/stage_NNN.bin

FlatBuffers binary file encoding the LP simplex basis checkpoint for a single stage. One file per stage. Used to warm-start LP solves when resuming a study.

The logical record structure is:

| Field | Type | Description |
|---|---|---|
| stage_id | uint32 | Stage index (0-based). |
| iteration | uint32 | Training iteration that produced this basis. |
| column_status | uint8[] | One status code per LP column (variable). Encoding is HiGHS-specific. |
| row_status | uint8[] | One status code per LP row (constraint). Encoding is HiGHS-specific. |
| num_cut_rows | uint32 | Number of trailing rows in row_status that correspond to cut rows (as opposed to structural constraints). |

policy/metadata.json

Small JSON file describing the checkpoint at a high level. Human-readable and intended for compatibility checking on study resume.

| Field | Type | Nullable | Description |
|---|---|---|---|
| version | string | No | Checkpoint schema version. |
| cobre_version | string | No | Version of the cobre binary that wrote this checkpoint. |
| created_at | string | No | ISO 8601 timestamp when the checkpoint was written. |
| completed_iterations | integer | No | Number of training iterations completed at checkpoint time. |
| final_lower_bound | number | No | Lower bound value after the final completed iteration. |
| best_upper_bound | number | Yes | Best upper bound observed during training. null when upper bound evaluation was disabled. |
| state_dimension | integer | No | Length of each cut's coefficient vector. Must match state_dictionary.json. |
| num_stages | integer | No | Number of stages. Must match the case configuration on resume. |
| config_hash | string | No | Hash of the algorithm configuration. Checked against the current config on resume. |
| system_hash | string | No | Hash of the system data. Checked against the current system on resume. |
| max_iterations | integer | No | Maximum iterations configured for the run. |
| forward_passes | integer | No | Number of forward passes per iteration configured for the run. |
| warm_start_cuts | integer | No | Number of cuts loaded from a previous policy at run start. 0 for fresh runs. |
| rng_seed | integer | No | RNG seed used by the scenario sampler. Required for reproducibility. |

Simulation Output

All simulation results use Hive partitioning: one data.parquet file per scenario stored in a scenario_id=NNNN/ subdirectory. See Hive Partitioning below for how to read these files.
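Enumerating the partitions of one result directory (e.g. simulation/hydros/) and recovering the scenario ID from the directory name can be done with the standard library alone; reading the Parquet payload itself requires a Parquet reader such as pyarrow. A hedged sketch with illustrative names:

```python
from pathlib import Path

def scenario_files(result_dir):
    """Yield (scenario_id, path-to-data.parquet) for each Hive partition."""
    for part in sorted(Path(result_dir).glob("scenario_id=*")):
        # Directory name is "scenario_id=NNNN"; the value after '=' is the ID.
        scenario_id = int(part.name.split("=", 1)[1])
        yield scenario_id, part / "data.parquet"
```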

simulation/costs/

Stage and block-level cost breakdown. One row per (stage, block) pair. 20 columns.

| Column | Type | Nullable | Description |
|---|---|---|---|
| stage_id | Int32 | No | Stage index (0-based). |
| block_id | Int32 | Yes | Load block index within the stage. null for stage-level (non-block) records. |
| total_cost | Float64 | No | Total discounted cost for this stage/block (monetary units). |
| immediate_cost | Float64 | No | Immediate (undiscounted) cost for this stage/block. |
| future_cost | Float64 | No | Future cost estimate (Benders cut value) at the end of this stage. |
| discount_factor | Float64 | No | Discount factor applied to this stage's costs. |
| thermal_cost | Float64 | No | Thermal generation cost component. |
| contract_cost | Float64 | No | Energy contract cost component (positive for imports, negative for exports). |
| deficit_cost | Float64 | No | Cost of unserved load (deficit penalty). |
| excess_cost | Float64 | No | Cost of excess generation (excess penalty). |
| storage_violation_cost | Float64 | No | Cost of reservoir storage bound violations. |
| filling_target_cost | Float64 | No | Cost of missing reservoir filling targets. |
| hydro_violation_cost | Float64 | No | Cost of hydro operational bound violations. |
| inflow_penalty_cost | Float64 | No | Cost of inflow non-negativity slack (numerical penalty). |
| generic_violation_cost | Float64 | No | Cost of generic constraint violations. |
| spillage_cost | Float64 | No | Cost of reservoir spillage. |
| fpha_turbined_cost | Float64 | No | Turbined flow penalty from the future-production hydro approximation. |
| curtailment_cost | Float64 | No | Cost of non-controllable source curtailment. |
| exchange_cost | Float64 | No | Transmission exchange cost component. |
| pumping_cost | Float64 | No | Pumping station energy cost component. |

simulation/hydros/

Hydro plant dispatch results. One row per (stage, block, hydro) triplet. 28 columns.

| Column | Type | Nullable | Description |
|---|---|---|---|
| stage_id | Int32 | No | Stage index (0-based). |
| block_id | Int32 | Yes | Load block index. null for stage-level records. |
| hydro_id | Int32 | No | Hydro plant ID. |
| turbined_m3s | Float64 | No | Turbined flow in cubic metres per second (m³/s). |
| spillage_m3s | Float64 | No | Spilled flow in m³/s. |
| outflow_m3s | Float64 | No | Total outflow (turbined + spilled) in m³/s. |
| evaporation_m3s | Float64 | Yes | Evaporation loss in m³/s. null if evaporation is not modelled for this plant. |
| diverted_inflow_m3s | Float64 | Yes | Diverted inflow to this reservoir in m³/s. null if no diversion is configured. |
| diverted_outflow_m3s | Float64 | Yes | Diverted outflow from this reservoir in m³/s. null if no diversion is configured. |
| incremental_inflow_m3s | Float64 | No | Natural incremental inflow to this reservoir in m³/s (excluding upstream contributions). |
| inflow_m3s | Float64 | No | Total inflow to this reservoir in m³/s (including upstream contributions). |
| storage_initial_hm3 | Float64 | No | Reservoir storage at the start of the stage in cubic hectometres (hm³). |
| storage_final_hm3 | Float64 | No | Reservoir storage at the end of the stage in hm³. |
| generation_mw | Float64 | No | Average power generation over the block in megawatts (MW). |
| generation_mwh | Float64 | No | Total energy generated over the block in megawatt-hours (MWh). |
| productivity_mw_per_m3s | Float64 | Yes | Effective productivity factor in MW/(m³/s). null for fixed-productivity plants when productivity is not stage-varying. |
| spillage_cost | Float64 | No | Monetary cost attributed to spillage. |
| water_value_per_hm3 | Float64 | No | Shadow price of the reservoir water balance constraint (monetary units per hm³). |
| storage_binding_code | Int8 | No | Whether the storage bounds were binding (see codes.json storage_binding mapping). |
| operative_state_code | Int8 | No | Operative state code (see codes.json operative_state mapping). |
| turbined_slack_m3s | Float64 | No | Turbined flow slack variable (non-negativity enforcement). Zero under normal operation. |
| outflow_slack_below_m3s | Float64 | No | Outflow lower-bound slack in m³/s. |
| outflow_slack_above_m3s | Float64 | No | Outflow upper-bound slack in m³/s. |
| generation_slack_mw | Float64 | No | Generation bound slack in MW. |
| storage_violation_below_hm3 | Float64 | No | Reservoir storage below-minimum violation in hm³. Zero under feasible operation. |
| filling_target_violation_hm3 | Float64 | No | Filling target miss in hm³. Zero when the target is met. |
| evaporation_violation_m3s | Float64 | No | Evaporation non-negativity violation in m³/s. Zero under normal operation. |
| inflow_nonnegativity_slack_m3s | Float64 | No | Inflow non-negativity slack in m³/s. Zero under normal operation. |

simulation/thermals/

Thermal unit dispatch results. One row per (stage, block, thermal) triplet. 10 columns.

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| stage_id | Int32 | No | Stage index (0-based). |
| block_id | Int32 | Yes | Load block index. null for stage-level records. |
| thermal_id | Int32 | No | Thermal unit ID. |
| generation_mw | Float64 | No | Average power generation over the block in MW. |
| generation_mwh | Float64 | No | Total energy generated over the block in MWh. |
| generation_cost | Float64 | No | Monetary generation cost for this block. |
| is_gnl | Boolean | No | true if this unit operates under GNL (liquefied natural gas) pricing rules. |
| gnl_committed_mw | Float64 | Yes | Committed capacity under GNL mode in MW. null for non-GNL units. |
| gnl_decision_mw | Float64 | Yes | Dispatch decision under GNL mode in MW. null for non-GNL units. |
| operative_state_code | Int8 | No | Operative state code (see codes.json operative_state mapping). |

simulation/exchanges/

Transmission line flow results. One row per (stage, block, line) triplet. 11 columns.

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| stage_id | Int32 | No | Stage index (0-based). |
| block_id | Int32 | Yes | Load block index. null for stage-level records. |
| line_id | Int32 | No | Transmission line ID. |
| direct_flow_mw | Float64 | No | Flow in the forward (direct) direction in MW. |
| reverse_flow_mw | Float64 | No | Flow in the reverse direction in MW. |
| net_flow_mw | Float64 | No | Net flow (direct minus reverse) in MW. |
| net_flow_mwh | Float64 | No | Net energy flow over the block in MWh. |
| losses_mw | Float64 | No | Transmission losses in MW. |
| losses_mwh | Float64 | No | Transmission losses in MWh over the block. |
| exchange_cost | Float64 | No | Monetary cost attributed to this line’s exchange. |
| operative_state_code | Int8 | No | Operative state code (see codes.json operative_state mapping). |

simulation/buses/

Bus load balance results. One row per (stage, block, bus) triplet. 10 columns.

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| stage_id | Int32 | No | Stage index (0-based). |
| block_id | Int32 | Yes | Load block index. null for stage-level records. |
| bus_id | Int32 | No | Bus ID. |
| load_mw | Float64 | No | Total load demand at this bus in MW. |
| load_mwh | Float64 | No | Total load energy demand over the block in MWh. |
| deficit_mw | Float64 | No | Unserved load (deficit) at this bus in MW. Zero under feasible dispatch. |
| deficit_mwh | Float64 | No | Unserved load energy over the block in MWh. |
| excess_mw | Float64 | No | Excess generation at this bus in MW. Zero under feasible dispatch. |
| excess_mwh | Float64 | No | Excess generation energy over the block in MWh. |
| spot_price | Float64 | No | Locational marginal price (shadow price of the power balance constraint) in monetary units per MWh. |

simulation/pumping_stations/

Pumping station results. One row per (stage, block, pumping station) triplet. 9 columns.

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| stage_id | Int32 | No | Stage index (0-based). |
| block_id | Int32 | Yes | Load block index. null for stage-level records. |
| pumping_station_id | Int32 | No | Pumping station ID. |
| pumped_flow_m3s | Float64 | No | Pumped flow rate in m³/s. |
| pumped_volume_hm3 | Float64 | No | Total pumped volume over the stage in hm³. |
| power_consumption_mw | Float64 | No | Power consumed by the pumping station in MW. |
| energy_consumption_mwh | Float64 | No | Energy consumed over the block in MWh. |
| pumping_cost | Float64 | No | Monetary cost of pumping energy. |
| operative_state_code | Int8 | No | Operative state code (see codes.json operative_state mapping). |

simulation/contracts/

Energy contract results. One row per (stage, block, contract) triplet. 8 columns.

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| stage_id | Int32 | No | Stage index (0-based). |
| block_id | Int32 | Yes | Load block index. null for stage-level records. |
| contract_id | Int32 | No | Contract ID. |
| power_mw | Float64 | No | Contracted power in MW. Positive for imports, negative for exports. |
| energy_mwh | Float64 | No | Contracted energy over the block in MWh. |
| price_per_mwh | Float64 | No | Contract price in monetary units per MWh. |
| total_cost | Float64 | No | Total contract cost for this block. Positive for imports. |
| operative_state_code | Int8 | No | Operative state code (see codes.json operative_state mapping). |

simulation/non_controllables/

Non-controllable source results (wind, solar, run-of-river hydro without storage, etc.). One row per (stage, block, non-controllable) triplet. 10 columns.

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| stage_id | Int32 | No | Stage index (0-based). |
| block_id | Int32 | Yes | Load block index. null for stage-level records. |
| non_controllable_id | Int32 | No | Non-controllable source ID. |
| generation_mw | Float64 | No | Actual generation dispatched in MW. |
| generation_mwh | Float64 | No | Actual energy generated over the block in MWh. |
| available_mw | Float64 | No | Maximum available generation in MW (before curtailment). |
| curtailment_mw | Float64 | No | Generation curtailed in MW. Zero when all available generation is dispatched. |
| curtailment_mwh | Float64 | No | Curtailed energy over the block in MWh. |
| curtailment_cost | Float64 | No | Monetary cost attributed to curtailment. |
| operative_state_code | Int8 | No | Operative state code (see codes.json operative_state mapping). |

simulation/inflow_lags/

Autoregressive inflow lag state variables. One row per (stage, hydro, lag) triplet. No block dimension — inflow lags are stage-level state variables. 4 columns. All columns are non-nullable.

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| stage_id | Int32 | No | Stage index (0-based). |
| hydro_id | Int32 | No | Hydro plant ID. |
| lag_index | Int32 | No | Autoregressive lag order (1-based). Lag 1 is the previous stage’s inflow. |
| inflow_m3s | Float64 | No | Inflow value for this lag in m³/s. |

simulation/violations/generic/

Generic user-defined constraint violations. One row per (stage, block, constraint) triplet where a violation occurred. 5 columns.

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| stage_id | Int32 | No | Stage index (0-based). |
| block_id | Int32 | Yes | Load block index. null for stage-level constraints. |
| constraint_id | Int32 | No | Constraint ID as defined in the case input files. |
| slack_value | Float64 | No | Violation magnitude in the constraint’s natural unit. Zero means no violation. |
| slack_cost | Float64 | No | Monetary cost attributed to this violation. |

Hive Partitioning

All simulation Parquet output uses Hive partitioning: results for each scenario are stored in a directory named scenario_id=NNNN/ containing a single data.parquet file. The scenario_id column is encoded in the directory name, not as a column inside the Parquet file.

All major columnar data tools understand this layout and can read an entire simulation/<entity>/ directory as a single table with an automatically inferred scenario_id column:

```python
# Polars — reads all scenarios at once, infers scenario_id from directory names
import polars as pl
df = pl.read_parquet("results/simulation/costs/")
print(df.head())

# Pandas with the PyArrow backend
import pandas as pd
df = pd.read_parquet("results/simulation/costs/")
```

```sql
-- DuckDB — filter to a specific scenario at the storage layer
SELECT * FROM read_parquet('results/simulation/costs/**/*.parquet')
WHERE scenario_id = 0;
```

```r
# R with the arrow package
library(arrow)
ds <- open_dataset("results/simulation/costs/")
dplyr::collect(dplyr::filter(ds, scenario_id == 0))
```

Scenario IDs are zero-based integers. The total number of scenarios is documented in simulation/_manifest.json under scenarios.total.
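
The partition layout can also be walked without a columnar library. The sketch below uses only the Python standard library to recover scenario IDs from the scenario_id=NNNN directory names; the list_scenario_ids helper is illustrative, not part of Cobre:

```python
import os
import tempfile

def list_scenario_ids(entity_dir: str) -> list[int]:
    """Return the zero-based scenario IDs found under a simulation entity
    directory, parsed from Hive-style `scenario_id=NNNN` directory names."""
    ids = []
    prefix = "scenario_id="
    for name in os.listdir(entity_dir):
        if name.startswith(prefix) and os.path.isdir(os.path.join(entity_dir, name)):
            ids.append(int(name[len(prefix):]))
    return sorted(ids)

# Demonstration against a mock layout:
#   <root>/scenario_id=0000/, <root>/scenario_id=0001/, <root>/scenario_id=0002/
root = tempfile.mkdtemp()
for sid in (0, 1, 2):
    os.makedirs(os.path.join(root, f"scenario_id={sid:04d}"))
print(list_scenario_ids(root))  # prints: [0, 1, 2]
```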


Manifest Files

Both training/_manifest.json and simulation/_manifest.json follow the same write protocol:

  1. Serialize JSON to a temporary .json.tmp sibling file.
  2. Atomically rename the .tmp file to the target path.

This ensures consumers never observe a partial manifest. If a manifest file exists, it contains a complete JSON document. If a run is interrupted before the final manifest write, the .tmp file may remain but the manifest itself will reflect the last successful checkpoint, not a partial write.
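
The two-step protocol can be sketched with standard-library calls alone; write_manifest is an illustrative helper, not part of the Cobre API:

```python
import json
import os
import tempfile

def write_manifest(path: str, manifest: dict) -> None:
    """Write `manifest` atomically: serialize to a .tmp sibling, then rename.

    os.replace is atomic when source and target live on the same filesystem,
    so a reader observes either the old manifest or the new one, never a
    partial write.
    """
    tmp_path = path + ".tmp"          # e.g. _manifest.json.tmp
    with open(tmp_path, "w") as f:
        json.dump(manifest, f, indent=2)
        f.flush()
        os.fsync(f.fileno())          # ensure bytes reach disk before the rename
    os.replace(tmp_path, path)        # step 2: atomic rename onto the target

# Demonstration in a scratch directory:
scratch = tempfile.mkdtemp()
target = os.path.join(scratch, "_manifest.json")
write_manifest(target, {"status": "complete"})
with open(target) as f:
    print(json.load(f)["status"])  # prints: complete
```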

The status field is always the first indicator to check:

| Status | Meaning |
|--------|---------|
| "running" | The run is in progress or was interrupted without writing a final status. |
| "complete" | The run finished normally. All output files are present. |
| "converged" | Training terminated because a convergence stopping rule was satisfied. (Training manifest only.) |
| "failed" | The run encountered a terminal error. Output files up to the failure point are present. |
| "partial" | Not all scenarios completed. (Simulation manifest only.) |

cobre report reads both manifests and training/metadata.json and prints a combined JSON summary to stdout. Use it in CI pipelines or shell scripts to inspect outcomes without reading the manifest files by hand:

```sh
# Extract the termination reason
cobre report results/ | jq '.training.convergence.termination_reason'

# Fail a CI job if the run did not complete
status=$(cobre report results/ | jq -r '.status')
[ "$status" = "complete" ] || exit 1
```
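
The same gate can be expressed by reading a manifest file directly, which avoids needing the CLI on the PATH. This sketch relies only on the documented status field; the helper name is ours, and treating "converged" as success applies to training manifests per the table above:

```python
import json

def run_succeeded(manifest_path: str) -> bool:
    """True when the manifest reports a terminal success status.

    "complete" covers normal completion; "converged" is the training
    manifest's success status when a stopping rule fired.
    """
    with open(manifest_path) as f:
        manifest = json.load(f)
    return manifest.get("status") in ("complete", "converged")

# Usage (hypothetical results path):
#   if not run_succeeded("results/simulation/_manifest.json"):
#       raise SystemExit(1)
```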

Error Codes Reference

cobre-io reports two kinds of errors: LoadError variants (the top-level Result<System, LoadError> returned by load_case) and ErrorKind values (diagnostic categories collected by ValidationContext during the five-layer validation pipeline).

For an explanation of how the validation pipeline works and when each error phase runs, see cobre-io.


LoadError variants

LoadError is the top-level error type returned by load_case and by every individual file parser. There are 6 variants, ordered by the pipeline phase in which they typically occur.

IoError

When it occurs: A required file exists in the file manifest but cannot be read from disk — file not found, permission denied, or other OS-level I/O failure. Occurs in Layer 1 (structural) or Layer 2 (schema) when std::fs::read_to_string or a Parquet reader returns an error.

Display format:

I/O error reading {path}: {source}

Fields:

| Field | Type | Description |
|-------|------|-------------|
| path | PathBuf | Path to the file that could not be read |
| source | std::io::Error | Underlying OS I/O error |

Example:

I/O error reading system/hydros.json: No such file or directory (os error 2)

Resolution: Verify the file exists in the case directory. Check that the process has read permissions for the directory and file. For load_case, the case root must contain all 8 required files (see Case Format).


ParseError

When it occurs: A file is readable but its content is malformed — invalid JSON syntax, unexpected end of input, or an unreadable Parquet column header. Occurs in Layer 2 (schema) during initial deserialization before any field-level validation runs.

Display format:

parse error in {path}: {message}

Fields:

| Field | Type | Description |
|-------|------|-------------|
| path | PathBuf | Path to the file that failed to parse |
| message | String | Human-readable description of the parse failure |

Example:

parse error in stages.json: expected `:` at line 5 column 12

Resolution: Open the file in a JSON validator or Parquet viewer. The message contains the location of the syntax error. For JSON files, a trailing comma, missing closing brace, or unquoted key are common causes.


SchemaError

When it occurs: A file parses successfully but a field violates a schema constraint: a required field is missing, a value is outside its valid range, or an enum discriminator names an unknown variant. Occurs in Layer 2 (schema) during post-deserialization validation. Also returned by parse_config when training.forward_passes or training.stopping_rules is absent.

Display format:

schema error in {path}, field {field}: {message}

Fields:

| Field | Type | Description |
|-------|------|-------------|
| path | PathBuf | Path to the file containing the invalid entry |
| field | String | Dot-separated path to the offending field (e.g., "hydros[3].bus_id") |
| message | String | Human-readable description of the violation |

Example:

schema error in config.json, field training.forward_passes: required field is missing
schema error in system/buses.json, field buses[1].id: duplicate id 5 in buses array

Resolution: The field value identifies the exact location of the problem. Check that required fields are present and that values fall within documented ranges. For config.json, training.forward_passes and training.stopping_rules are mandatory and have no defaults.


CrossReferenceError

When it occurs: An entity ID field references an entity that does not exist in the expected registry. Occurs in Layer 3 (referential integrity). All broken references across all entity types are collected before returning.

Display format:

cross-reference error: {source_entity} in {source_file} references
non-existent {target_entity} in {target_collection}

Fields:

| Field | Type | Description |
|-------|------|-------------|
| source_file | PathBuf | Path to the file that contains the dangling reference |
| source_entity | String | String identifier of the entity that holds the broken reference (e.g., "Hydro 'H1'") |
| target_collection | String | Name of the registry that was expected to contain the target (e.g., "bus registry") |
| target_entity | String | String identifier of the entity that could not be found (e.g., "BUS_99") |

Example:

cross-reference error: Hydro 'FURNAS' in system/hydros.json references
non-existent BUS_99 in bus registry

Resolution: The target_entity does not exist in the target_collection. Either add the missing entity to its registry file, or correct the ID reference in source_file. Common causes: a bus was deleted from system/buses.json but a hydro, thermal, or line still references its old ID.


ConstraintError

When it occurs: The catch-all variant for validation diagnostics collected by ValidationContext in any of the five layers, and for SystemBuilder::build() rejections. The description field contains every collected error message joined by newlines, each line prefixed with its [ErrorKind], source file, optional entity identifier, and message text.

Display format:

constraint violation: {description}

Fields:

| Field | Type | Description |
|-------|------|-------------|
| description | String | All error messages joined by newlines |

Example:

constraint violation: [FileNotFound] system/hydros.json: required file 'system/hydros.json' not found in case directory
[SchemaViolation] system/buses.json (bus_42): missing field bus_id

Resolution: Read every line in description — each line is a separate problem. Address them all and re-run. The [ErrorKind] prefix identifies the category of each problem; see the ErrorKind catalog below for resolution guidance per category.
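
Because every line carries an [ErrorKind] prefix, a log post-processor can bucket the problems by category before deciding what to fix first. A stdlib sketch; the line shape follows the format shown above, but the parser itself is illustrative:

```python
import re
from collections import Counter

# One diagnostic per line: "[ErrorKind] rest-of-message"
LINE_RE = re.compile(r"^\[(?P<kind>\w+)\]\s+(?P<rest>.*)$")

def count_by_kind(description: str) -> Counter:
    """Count validation problems per ErrorKind in a ConstraintError description."""
    counts = Counter()
    for line in description.splitlines():
        m = LINE_RE.match(line)
        if m:
            counts[m.group("kind")] += 1
    return counts

description = (
    "[FileNotFound] system/hydros.json: required file 'system/hydros.json' "
    "not found in case directory\n"
    "[SchemaViolation] system/buses.json (bus_42): missing field bus_id"
)
print(count_by_kind(description))
```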


PolicyIncompatible

When it occurs: After all five validation layers pass, when policy.mode is "warm_start" or "resume" and the stored policy file is structurally incompatible with the current case. The four compatibility checks are: hydro count, stage count, cut dimension, and entity identity hash.

Display format:

policy incompatible: {check} mismatch — policy has {policy_value}, system has {system_value}

Fields:

| Field | Type | Description |
|-------|------|-------------|
| check | String | Name of the failing compatibility check (e.g., "hydro count") |
| policy_value | String | Value recorded in the policy file |
| system_value | String | Value present in the current system |

Example:

policy incompatible: hydro count mismatch — policy has 42, system has 43

Resolution: The stored policy was produced by a run with a different system configuration. Options:

  • Set policy.mode to "fresh" to start from scratch without loading the policy.
  • Revert the system change that caused the mismatch.
  • Delete the policy directory and start fresh.

ErrorKind values

ErrorKind categorises the validation problem within the ValidationContext diagnostic system. Every ValidationEntry carries one ErrorKind. When ValidationContext::into_result() produces a ConstraintError, each line in description is prefixed with the ErrorKind in debug format (e.g., [FileNotFound]).

There are 14 ErrorKind values. Two (UnusedEntity and ModelQuality) default to Severity::Warning — they are reported but do not block execution. All others default to Severity::Error and must be resolved before load_case succeeds.

FileNotFound

Default severity: Error

What triggers it: A file that is required by the case structure is missing from the case directory. Emitted by Layer 1 (structural validation) for each of the 8 required files that is not found on disk.

Example message: required file 'system/hydros.json' not found in case directory

Resolution: Create the missing file in the correct subdirectory. The 8 required files are: config.json, penalties.json, stages.json, initial_conditions.json, system/buses.json, system/lines.json, system/hydros.json, and system/thermals.json.


ParseError

Default severity: Error

What triggers it: A file exists and was read but could not be parsed — invalid JSON syntax, an unreadable Parquet header, or an unknown enum variant in a tagged JSON union. Emitted by Layer 2 (schema validation) when the initial deserialization of a file fails.

Example message: parse error in stages.json: expected : at line 5 column 12

Resolution: Fix the syntax error in the indicated file. Use a JSON linter or Parquet viewer to find the exact location. For JSON files, common causes are trailing commas, missing quotation marks, or mismatched braces.


SchemaViolation

Default severity: Error

What triggers it: A file parses successfully but a field fails a schema constraint: a required field is missing, a value is outside its valid range (e.g., negative capacity, non-positive penalty cost), or a field contains an unexpected type. Emitted by Layer 2 (schema validation) during post-deserialization validation.

Example message: schema error in system/buses.json, field buses[2].deficit_segments[0].cost: penalty value must be > 0.0, got -100.0

Resolution: Correct the value in the indicated field. Field paths use dot-notation and zero-based array indices. Consult the Case Format page for valid ranges and required fields.


InvalidReference

Default severity: Error

What triggers it: A cross-entity foreign-key reference points to an entity that does not exist in the expected registry. For example, a hydro plant’s bus_id references a bus that is not in system/buses.json. Emitted by Layer 3 (referential integrity).

Example message: Hydro 'FURNAS' references non-existent bus BUS_99 in bus registry

Resolution: Either add the referenced entity to its registry file, or correct the ID in the referencing file. Check all ID references: hydros.bus_id, thermals.bus_id, lines.source_bus_id, lines.target_bus_id, hydros.downstream_id.


DuplicateId

Default severity: Error

What triggers it: Two entities within the same registry share the same ID. IDs must be unique within each entity type. Emitted by Layer 2 (schema validation) when duplicate IDs are detected within a single file.

Example message: duplicate id 5 in buses array

Resolution: Assign a unique ID to each entity. IDs are integers; use any non-negative value as long as each is unique within its registry file.


InvalidValue

Default severity: Error

What triggers it: A field value falls outside its valid range or violates a value constraint that is specific to the field’s domain. Examples: a reservoir’s min_storage_hm3 exceeds max_storage_hm3, or a stage has num_scenarios: 0. Emitted by Layer 2 (schema validation).

Example message: min_storage_hm3 (8000.0) must be <= max_storage_hm3 (5000.0)

Resolution: Correct the field value to be within the valid range. Consult the Case Format page for documented constraints. For storage bounds, ensure min <= max. For scenario counts, ensure num_scenarios >= 1.


CycleDetected

Default severity: Error

What triggers it: A directed graph contains a cycle. The primary case is the hydro cascade: the downstream_id links among hydro plants must form a directed forest (no cycles). A cycle would mean plant A drains into plant B which drains back into plant A. Detected by topological sort in Layer 5 (semantic validation).

Example message: hydro cascade contains a cycle involving plants: [H1, H2, H3]

Resolution: Review the downstream_id chain for the listed plants and remove the cycle. Every hydro cascade must be a directed tree rooted at plants with no downstream (tailwater discharge).


DimensionMismatch

Default severity: Error

What triggers it: A cross-file coverage check fails. For example, when scenarios/inflow_seasonal_stats.parquet is present, every hydro plant must have at least one row of statistics. A mismatch means an optional per-entity file provides data for some entities but not all that require it. Emitted by Layer 4 (dimensional consistency).

Example message: hydro 'ITAIPU' has no inflow seasonal statistics

Resolution: Add the missing rows to the Parquet file. Every hydro plant that is active during the study must appear in inflow_seasonal_stats.parquet when that file is present.


BusinessRuleViolation

Default severity: Error

What triggers it: A domain-specific business rule is violated that cannot be expressed as a simple range constraint. Examples: penalty tiers must be monotonically ordered (lower-tier penalties may not exceed upper-tier penalties for the same entity), PAR model stationarity requirements are violated, or stage count is inconsistent across files. Emitted by Layer 5 (semantic validation).

Example message: penalty tier ordering violated for hydro 'FURNAS': spillage_cost (500.0) exceeds storage_violation_below_cost (100.0)

Resolution: Read the message carefully — it describes the specific rule that was violated and which entities are involved. For penalty ordering, ensure that costs increase from lower-priority to higher-priority tiers. For stationarity, verify that the PAR model parameters satisfy the required statistical properties.


WarmStartIncompatible

Default severity: Error

What triggers it: A warm-start policy is structurally incompatible with the current system. The four compatibility checks are: hydro count, stage count, cut dimension, and entity identity hash. The policy was produced by a run with a different system configuration. This ErrorKind is the ValidationContext counterpart to the LoadError::PolicyIncompatible variant.

Example message: warm-start policy has 42 hydros but current system has 43

Resolution: See PolicyIncompatible under LoadError above.


ResumeIncompatible

Default severity: Error

What triggers it: A resume state (checkpoint) is incompatible with the current run configuration. The checkpoint may have been produced by a run with a different config.json or a different system, making it impossible to resume from that state consistently.

Example message: resume checkpoint iteration 150 is beyond current iteration_limit 100

Resolution: Either adjust config.json to be consistent with the checkpoint (e.g., increase the iteration limit), or set policy.mode to "fresh" to discard the checkpoint and start a new run.


NotImplemented

Default severity: Error

What triggers it: A feature referenced in the input files is recognized by the schema but not yet implemented in the current version of Cobre. During development this surfaces requests for unimplemented features from otherwise valid input.

Example message: hydro production model 'fpha' is not yet implemented

Resolution: Avoid using the unimplemented feature until it is available. Check the project roadmap for the implementation timeline. Alternatively, use the currently supported alternatives (e.g., "constant_productivity" instead of "fpha" for hydro generation models).


UnusedEntity

Default severity: Warning (does not block execution)

What triggers it: An entity is defined in a registry file but appears to be inactive — for example, a thermal plant with max_generation_mw: 0.0 for all stages. The entity is valid but contributes nothing to the model. Reported as a warning to alert the user to possible input errors or unintentional inclusions.

Example message: thermal 'OLD_PLANT' has max_generation_mw = 0.0 and will contribute no generation

Resolution: Either remove the entity from the registry file or set a non-zero generation capacity if the omission was accidental. If the entity is intentionally inactive, this warning can be ignored.


ModelQuality

Default severity: Warning (does not block execution)

What triggers it: A statistical quality concern is detected in the input model. Examples: residual bias in the PAR model seasonal statistics, high autocorrelation residuals, or an AR order that is suspiciously large for the data. These do not prevent execution but may indicate that the model needs recalibration.

Example message: residual bias detected in inflow_seasonal_stats for hydro 'FURNAS' at stage 0: mean residual 45.2 m3/s

Resolution: Review the flagged model parameters. Consider recalibrating the PAR model for the affected hydro plants. Warnings of this type do not prevent the solver from running, but they may indicate that the stochastic model does not accurately represent historical inflows.


Severity reference

| Severity | Effect | ErrorKind values |
|----------|--------|------------------|
| Error | Prevents load_case from succeeding | All kinds except UnusedEntity and ModelQuality |
| Warning | Reported but does not block execution | UnusedEntity, ModelQuality |

To inspect warnings after a successful load_case, call ValidationContext::warnings() before calling into_result(). Warnings are not surfaced in the Result returned by load_case; they must be read from the context directly.

Roadmap

The Cobre post-v0.1.0 roadmap documents deferred features, HPC optimizations, and post-MVP crates planned for future releases. It covers implementation-level work items only; for methodology-level roadmap pages (algorithm theory, spec evolution), see the cobre-docs roadmap.

The roadmap is maintained in the repository at docs/ROADMAP.md.

Sections

The roadmap is organized into four areas:

  • Inflow Truncation Methods – Two additional non-negativity treatment methods (Truncation and TruncationWithPenalty) deferred from v0.1.0.
  • HPC Optimizations – Performance improvements beyond the rayon baseline, grouped into near-term (v0.1.x/v0.2.x) and longer-term (v0.3+) items.
  • Post-MVP Crates – Implementation plans for the three stubbed workspace crates: cobre-mcp, cobre-python, and cobre-tui.
  • Algorithm Extensions – Deferred solver variants: CVaR risk measure, multi-cut formulation, and infinite-horizon policy graphs.

JSON Schemas

The following JSON Schema files describe the structure of each JSON input file in a Cobre case directory. Download them and point your editor’s JSON Schema validation setting at the appropriate file to get autocompletion, hover documentation, and inline error highlighting while authoring case inputs.

For a complete description of each file’s fields and validation rules, see the Case Directory Format reference page.

Available schemas

| Schema file | Input file | Description |
|-------------|------------|-------------|
| config.schema.json | config.json | Study configuration: training parameters, stopping rules, cut selection, simulation settings, and export flags |
| penalties.schema.json | penalties.json | Global penalty cost defaults for bus deficit, line exchange, hydro violations, and non-controllable source curtailment |
| stages.schema.json | stages.json | Temporal structure of the study: stage sequence, load blocks, policy graph horizon, and scenario source configuration |
| buses.schema.json | system/buses.json | Electrical bus registry: bus identifiers, names, and optional entity-level deficit cost tiers |
| lines.schema.json | system/lines.json | Transmission line registry: line identifiers, source/target buses, and directional MW capacity bounds |
| hydros.schema.json | system/hydros.json | Hydro plant registry: reservoir bounds, outflow limits, generation model parameters, and cascade linkage |
| thermals.schema.json | system/thermals.json | Thermal plant registry: generation bounds and linear cost coefficients |
| energy_contracts.schema.json | system/energy_contracts.json | Bilateral energy contract registry (optional entities) |
| non_controllable_sources.schema.json | system/non_controllable_sources.json | Intermittent (non-dispatchable) generation source registry (optional entities) |
| pumping_stations.schema.json | system/pumping_stations.json | Pumping station registry (optional entities) |

Using schemas in your editor

VS Code

Add a json.schemas entry to your workspace .vscode/settings.json:

```json
{
  "json.schemas": [
    {
      "fileMatch": ["config.json"],
      "url": "https://cobre.dev/schemas/v2/config.schema.json"
    },
    {
      "fileMatch": ["system/hydros.json"],
      "url": "https://cobre.dev/schemas/v2/hydros.schema.json"
    }
  ]
}
```

Alternatively, add a $schema key directly inside each JSON file:

```json
{
  "$schema": "https://cobre.dev/schemas/v2/config.schema.json",
  "training": {
    "forward_passes": 192,
    "stopping_rules": [{ "type": "iteration_limit", "limit": 200 }]
  }
}
```

Neovim (via jsonls)

Configure json.schemas in your nvim-lspconfig setup for jsonls following the same URL pattern shown above.

JetBrains IDEs

Go to Preferences > Languages & Frameworks > Schemas and DTDs > JSON Schema Mappings, add a new mapping, paste the schema URL, and select the file pattern.

Regenerating schemas

The schema files in book/src/schemas/ are generated from the Rust type definitions in cobre-io. To regenerate them after modifying the input types, run:

```sh
cargo run -p cobre-cli -- schema export --output-dir book/src/schemas/
```

Contributing

See the CONTRIBUTING.md file in the repository root for complete guidelines on:

  • Prerequisites and building
  • Reporting bugs and suggesting features
  • Submitting code (branching, commit messages, CI checks)
  • Coding guidelines (per-crate rules, testing, dependencies)
  • Domain knowledge resources