Data Model

This section specifies every data structure that flows into, through, and out of the Cobre solver. On the input side, that means the JSON registries that define physical entities (hydro plants, thermal units, buses, transmission lines), the Parquet tables that carry stage-varying bounds and stochastic parameters, the penalty cascade that guarantees LP feasibility, and the config.json file that controls all solver behavior. On the output side, it covers the schemas for training convergence reports, simulation result tables, MPI-partitioned file manifests, and the binary persistence formats used for warm-starting and resuming interrupted runs.

The specs are organized around the data lifecycle. Input specs describe what the user provides and how the solver validates and loads it. Internal structure specs describe the in-memory model that the solver builds from those inputs. Output specs describe what the solver produces and how distributed results are assembled. Binary format specs describe the FlatBuffers schemas used for zero-copy serialization of policy data (cuts, states, basis vectors). Together, these 10 specs fully define the contract between the user, the I/O layer (cobre-io), the data model library (cobre-core), and the MPI communication layer (ferrompi).

Every file path, column name, JSON key, and Parquet schema documented here is normative. The solver rejects inputs that do not conform to these specs, and the output schemas are guaranteed stable within a major version. Where design choices exist (e.g., JSON vs. Parquet for a given file, sparse vs. dense override tables), the rationale is stated inline using the format selection criteria from the Design Principles spec.

Reading Order

The specs have extensive cross-references, so reading order matters. The following sequence builds concepts from the filesystem inward:

Input Directory Structure – Start here. Defines the case directory layout, the config.json schema, and the penalty summary that motivates the dedicated penalty spec.
Input System Entities – The seven entity registries in system/: buses, lines, hydros, thermals, non-controllable sources, pumping stations, and energy contracts. Required context for reading any other input spec.
Input Hydro Extensions – Hydro-specific extension files: geometry curves, stage-varying production model configuration, and precomputed FPHA hyperplanes.
Input Scenarios – The stages.json schema (seasons, policy graph, blocks, risk measures), inflow and load models, correlation profiles, and external scenario injection.
Input Constraints – Initial conditions, stage-varying entity bounds, exchange capacity factors, and generic linear constraints.
Penalty System – The three-tier penalty cascade (global defaults, entity overrides, stage overrides), the full penalty inventory, and resolution semantics that guarantee LP feasibility.
Internal Structures – The in-memory solver model built from input files: indexed entity tables, assembled LP matrices, and the runtime state that the forward and backward passes operate on.
Output Schemas – Training convergence tables, simulation result Parquet files, and per-iteration diagnostic exports.
Output Infrastructure – File manifests, MPI rank-partitioned output assembly, streaming vs. buffered write modes, and compression settings.
Binary Formats – FlatBuffers schemas for policy persistence: cut coefficients, visited states, inner approximation vertices, and solver basis snapshots.

Spec Index

Spec	Description	Math Reference
Input Directory Structure	Case directory layout, `config.json` schema, penalty summary	–
Input System Entities	Seven entity registries: buses, lines, hydros, thermals, NCS, pumping, contracts	System Elements
Input Hydro Extensions	Hydro geometry curves, production model config, FPHA hyperplanes	Hydro Production Models
Input Scenarios	Stage definitions, policy graph, inflow/load models, correlations, external scenarios	PAR Inflow Model, Risk Measures
Input Constraints	Initial conditions, stage-varying bounds, exchange factors, generic constraints	LP Formulation
Penalty System	Three-tier penalty cascade, full inventory, resolution semantics	LP Formulation
Internal Structures	In-memory solver model: indexed tables, LP matrices, runtime state	LP Formulation, SDDP Algorithm
Output Schemas	Training convergence, simulation results, diagnostic exports	Upper Bound Evaluation, Stopping Rules
Output Infrastructure	File manifests, MPI-partitioned assembly, streaming writes, compression	–
Binary Formats	FlatBuffers schemas for cuts, states, vertices, basis persistence	Cut Management

Conventions

All specs in this section follow the format selection criteria and declaration order invariance principle defined in Design Principles. In brief: JSON is used for human-editable structured objects with nested or optional fields (entity registries, configuration, correlation profiles); Parquet is used for typed columnar tabular data (stage-varying overrides, time series, scenario parameters); FlatBuffers is used for binary persistence where zero-copy deserialization matters (policy data). Declaration order within JSON arrays and Parquet rows never affects solver behavior – entities are identified by their id field, not by position.

The Notation Conventions spec establishes the index sets, symbols, and unit conventions used throughout. Column names in Parquet schemas and JSON keys are chosen to match the mathematical notation where practical (e.g., volume for \(v*{i,t}\), turbined for \(u*{i,t}\)).

Keyboard shortcuts

Cobre Methodology Reference

Data Model

Reading Order

Spec Index

Conventions