Understanding Results
After cobre run completes, the output directory contains three categories of
artifacts: training convergence data, a saved policy checkpoint, and simulation
dispatch results. This page explains how to read each category and how to query
the results programmatically using cobre report.
If you have not yet run the quickstart, complete Quickstart
first — this page references the my_first_study/results/ directory produced
by that walkthrough.
The Post-Run Summary
When cobre run finishes, it prints a summary block to stderr. The 1dtoy run
from the quickstart produces output similar to:
Training complete in 3.2s (128 iterations, iteration_limit)
Lower bound: 142.3 $/stage
Upper bound: 143.1 +/- 1.2 $/stage
Gap: 0.6%
Cuts: 384 active / 387 generated
LP solves: 512
Simulation complete (100 scenarios)
Completed: 100 Failed: 0
Output written to my_first_study/results/
Exact numerical values vary across runs because scenario sampling is stochastic. The values shown here are representative of the 1dtoy example; your run will differ slightly.
| Line | What it means |
|---|---|
| Training complete in 3.2s (128 iterations, iteration_limit) | Training ran for 128 iterations (the limit set in config.json) and stopped because the iteration limit was reached, not because a convergence criterion was met. |
| Lower bound: 142.3 $/stage | The optimizer’s best proven lower bound on the minimum expected cost per stage. As training progresses this value rises and stabilizes. |
| Upper bound: 143.1 +/- 1.2 $/stage | A statistical estimate of the true expected cost, computed from the forward-pass scenarios in the final iteration. The +/- 1.2 is the standard deviation across those scenarios. |
| Gap: 0.6% | The relative distance between the lower and upper bounds expressed as a percentage. A gap of 0.6% means the policy cost is within 0.6% of the best possible. Smaller is better. |
| Cuts: 384 active / 387 generated | The total number of optimality cuts in the policy pool. 384 are currently active; 3 were deactivated by the cut selection strategy. |
| LP solves: 512 | Total number of linear programs solved across all stages and iterations. |
| Simulation complete (100 scenarios) | The post-training simulation evaluated the trained policy over 100 independently sampled scenarios. |
| Completed: 100 Failed: 0 | All 100 scenarios completed without solver errors. |
| Output written to my_first_study/results/ | Root path of the output directory. |
Lower bound vs. upper bound. The lower bound is the optimizer’s proven best estimate of the minimum achievable cost. The upper bound is the average cost observed when running the current policy over sampled scenarios. When the gap is small, the policy is near-optimal. When the gap is large, running more iterations will typically narrow it further.
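As a rough check, the gap in the summary can be reproduced from the two bounds. The sketch below assumes the gap is normalized by the upper bound. This page does not state the exact denominator Cobre uses, so `relative_gap_percent` is a hypothetical helper for illustration, not the tool's own formula.

```python
def relative_gap_percent(lower_bound: float, upper_bound: float) -> float:
    """Relative distance between the bounds, as a percentage.

    Assumption: the gap is normalized by the upper bound. The methodology
    reference is the authority on the exact definition.
    """
    if upper_bound == 0:
        raise ValueError("upper bound must be non-zero")
    return 100.0 * (upper_bound - lower_bound) / abs(upper_bound)


# Values taken from the example summary above.
gap = relative_gap_percent(lower_bound=142.3, upper_bound=143.1)
print(f"Gap: {gap:.1f}%")  # prints "Gap: 0.6%"
```

With these example bounds, normalizing by the lower bound instead gives the same value to one decimal place, so the choice of denominator only matters when the gap is large.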
Termination reasons. The parenthetical after the iteration count explains why training stopped:
- iteration_limit: the maximum iteration count was reached (the 1dtoy default).
- converged at iter N: a convergence criterion was met at iteration N and training stopped early. This appears when you configure a bound_stalling or similar rule in config.json.
Theory reference: For the mathematical definition of lower and upper bounds, optimality gap, and stopping criteria, see Convergence in the methodology reference.
Output Directory Structure
All artifacts are written under the results directory you specified with --output.
The 1dtoy run produces:
my_first_study/results/
  training/
    _manifest.json            Completion manifest: status, iteration count, convergence, cut stats
    metadata.json             Run metadata: configuration snapshot, problem dimensions
    convergence.parquet       Per-iteration convergence metrics (lower bound, upper bound, gap)
    dictionaries/
      codes.json              Integer-to-string code mappings for entity categories
      state_dictionary.json   State variable definitions and units
      entities.csv            Entity registry (id, name, type)
      variables.csv           LP variable registry
      bounds.parquet          LP variable bound definitions
    timing/
      iterations.parquet      Per-iteration wall-clock timing broken down by phase
  policy/
    cuts/
      stage_000.bin           FlatBuffers-encoded optimality cuts for stage 0
      stage_001.bin           ... stage 1
      stage_002.bin           ... stage 2
      stage_003.bin           ... stage 3
    basis/
      stage_000.bin           LP basis checkpoints for warm-starting
      stage_001.bin
      stage_002.bin
      stage_003.bin
    metadata.json             Policy metadata: stage count, cut counts per stage
  simulation/
    _manifest.json            Completion manifest: scenario counts
    buses/
      scenario_id=0000/data.parquet
      scenario_id=0001/data.parquet
      ...                     One partition per scenario
    costs/
      scenario_id=0000/data.parquet
      ...
    hydros/
      scenario_id=0000/data.parquet
      ...
    thermals/
      scenario_id=0000/data.parquet
      ...
    inflow_lags/              Inflow lag state data used to initialize scenario chains
The three top-level subdirectories have distinct roles:
- training/: everything produced during the training loop: convergence history, timing, and the dictionaries needed to interpret LP variable indices.
- policy/: the trained policy checkpoint. These binary files encode the optimality cuts built during training. They can be used to resume or extend a study.
- simulation/: the dispatch results from evaluating the trained policy over 100 simulation scenarios.
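Before post-processing a run, it can be useful to confirm that the key artifacts are present. The following is a minimal sketch using only the Python standard library. The helper name `missing_artifacts` and the choice of which files to check are illustrative assumptions; the relative paths themselves come from the listing above.

```python
from pathlib import Path

# Paths taken from the directory listing above, relative to the results root.
# Which files count as "required" is an assumption made for this sketch.
EXPECTED_ARTIFACTS = [
    "training/_manifest.json",
    "training/convergence.parquet",
    "policy/metadata.json",
    "simulation/_manifest.json",
]


def missing_artifacts(results_dir: str) -> list[str]:
    """Return the expected artifacts that are absent under results_dir."""
    root = Path(results_dir)
    return [rel for rel in EXPECTED_ARTIFACTS if not (root / rel).is_file()]


missing = missing_artifacts("my_first_study/results")
print("missing:", missing or "none")
```

An empty list means all four files were found. Extend `EXPECTED_ARTIFACTS` if your workflow depends on other outputs.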
Training Results
Reading training/_manifest.json
The training manifest is the canonical summary of what happened during training. The 1dtoy run produces:
{
  "version": "2.0.0",
  "status": "complete",
  "started_at": null,
  "completed_at": null,
  "iterations": {
    "max_iterations": null,
    "completed": 128,
    "converged_at": null
  },
  "convergence": {
    "achieved": false,
    "final_gap_percent": 0.0,
    "termination_reason": "iteration_limit"
  },
  "cuts": {
    "total_generated": 387,
    "total_active": 384,
    "peak_active": 384
  },
  "checksum": null,
  "mpi_info": {
    "world_size": 1,
    "ranks_participated": 1
  }
}
Field-by-field explanation:
| Field | Meaning |
|---|---|
| status | "complete" when the training run finished normally. "failed" if a solver error aborted it. |
| iterations.completed | Number of training iterations that were executed. |
| iterations.converged_at | If training stopped early due to a convergence criterion, the iteration number where it stopped. null for an iteration-limit stop. |
| convergence.achieved | true if a convergence stopping rule was satisfied, false if the iteration limit was reached first. |
| convergence.final_gap_percent | The gap between lower and upper bounds at the end of training, as a percentage. A value of 0.0 here reflects that the 1dtoy case converged very tightly within its 128-iteration budget. |
| convergence.termination_reason | Machine-readable reason for stopping. Common values: "iteration_limit", "bound_stalling". |
| cuts.total_generated | Total optimality cuts created across all stages over the entire training run. |
| cuts.total_active | Cuts still active in the pool at the end of training (not deactivated by the cut selection strategy). |
| cuts.peak_active | Maximum number of active cuts at any point during training. |
| mpi_info.world_size | Number of MPI ranks involved in the run. 1 for single-process runs. |
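The fields above can also be read directly from the manifest file. The sketch below uses only the standard library. `load_manifest` and `summarize_manifest` are hypothetical helper names, and the inline `example` dictionary copies a subset of the 1dtoy manifest so that the snippet runs without files on disk.

```python
import json
from pathlib import Path


def load_manifest(results_dir: str) -> dict:
    """Read training/_manifest.json from a results directory."""
    path = Path(results_dir) / "training" / "_manifest.json"
    return json.loads(path.read_text(encoding="utf-8"))


def summarize_manifest(manifest: dict) -> dict:
    """Pick out the fields described in the table above."""
    cuts = manifest["cuts"]
    convergence = manifest["convergence"]
    return {
        "status": manifest["status"],
        "iterations": manifest["iterations"]["completed"],
        "converged": convergence["achieved"],
        "termination_reason": convergence["termination_reason"],
        # Cuts that were generated but are no longer active at the end of training.
        "cuts_deactivated": cuts["total_generated"] - cuts["total_active"],
    }


# Subset of the 1dtoy manifest shown above, inlined for illustration.
example = {
    "status": "complete",
    "iterations": {"max_iterations": None, "completed": 128, "converged_at": None},
    "convergence": {
        "achieved": False,
        "final_gap_percent": 0.0,
        "termination_reason": "iteration_limit",
    },
    "cuts": {"total_generated": 387, "total_active": 384, "peak_active": 384},
}
print(summarize_manifest(example))
```

For a real run, pass `load_manifest("my_first_study/results")` to `summarize_manifest` instead of the inline dictionary.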
What “converged” means in practice. A converged run (convergence.achieved: true) means a stopping rule determined that continuing would not meaningfully
improve the policy. For the 1dtoy case, the gap reaches near zero within the
128-iteration budget even without an explicit convergence rule, which is why
final_gap_percent is 0.0 despite achieved being false — the run hit
its iteration limit at a point where the policy was already very tight.
For larger studies, configure a bound_stalling or gap_threshold stopping
rule in config.json to stop automatically when the gap stabilizes, rather
than running a fixed number of iterations.
Simulation Results
Hive-Partitioned Layout
The simulation output uses Hive partitioning: results are split into one
data.parquet file per scenario, stored in a directory named
scenario_id=NNNN/. This layout is natively understood by Polars, Pandas
(via PyArrow), R’s arrow package, and DuckDB — they can read the entire
simulation/costs/ directory as a single table and filter by scenario_id
at the storage layer without loading all data into memory.
The four entity categories are:
| Directory | Contents |
|---|---|
| buses/ | Power balance results: load, generation injections, deficit, and excess at each bus per stage and block. |
| hydros/ | Hydro dispatch: turbined flow, spillage, reservoir storage levels, inflows, and generation per plant per stage and block. |
| thermals/ | Thermal dispatch: generation output per unit per cost segment per stage and block. |
| costs/ | Objective cost breakdown: total cost, thermal cost, hydro cost, penalty cost, and discount factor per stage. |
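Because the scenario identifier is encoded in the directory name rather than inside each file, the set of available scenarios can be discovered without opening any Parquet data. The following sketch uses only the standard library. `scenario_ids` is a hypothetical helper, and it assumes every partition directory follows the scenario_id=NNNN/data.parquet pattern described above.

```python
from pathlib import Path


def scenario_ids(entity_dir: str) -> list[int]:
    """List the scenario ids found as scenario_id=NNNN/ partitions."""
    ids = []
    for data_file in Path(entity_dir).glob("scenario_id=*/data.parquet"):
        # The partition directory is named "scenario_id=NNNN".
        _, _, value = data_file.parent.name.partition("=")
        ids.append(int(value))
    return sorted(ids)


ids = scenario_ids("my_first_study/results/simulation/costs")
print(f"{len(ids)} scenario partitions found")
```

Comparing this count with the scenario counts reported in simulation/_manifest.json is a quick way to spot a partially written output directory.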
Results are in Parquet format. To read them, use any columnar data tool:
# Polars — reads all 100 scenarios at once
import polars as pl
df = pl.read_parquet("my_first_study/results/simulation/costs/")
print(df.head())
# Pandas + PyArrow
import pandas as pd
df = pd.read_parquet("my_first_study/results/simulation/costs/")
print(df.head())
-- DuckDB: filter to a single scenario (scenario_id comes from the partition directory name)
SELECT * FROM read_parquet('my_first_study/results/simulation/costs/**/*.parquet', hive_partitioning = true)
WHERE scenario_id = 0;
# R with arrow
library(arrow)
ds <- open_dataset("my_first_study/results/simulation/costs/")
dplyr::collect(dplyr::filter(ds, scenario_id == 0))
Querying Results with cobre report
cobre report reads the JSON manifests and prints a structured JSON summary to
stdout. Use it with jq to extract specific metrics in scripts or CI pipelines.
# Print the full report
cobre report my_first_study/results
The output has this top-level shape:
{
  "output_directory": "/abs/path/to/results",
  "status": "complete",
  "training": { "iterations": {}, "convergence": {}, "cuts": {} },
  "simulation": { "scenarios": {} },
  "metadata": { "run_info": {}, "configuration_snapshot": {} }
}
Practical jq queries
# Extract the final convergence gap
cobre report my_first_study/results | jq '.training.convergence.final_gap_percent'
# Check how many iterations ran
cobre report my_first_study/results | jq '.training.iterations.completed'
# Check simulation scenario counts
cobre report my_first_study/results | jq '.simulation.scenarios'
# Use the status in a CI script: exit non-zero if training failed
status=$(cobre report my_first_study/results | jq -r '.status')
if [ "$status" != "complete" ]; then
  echo "Run did not complete successfully: $status" >&2
  exit 1
fi
# Check convergence was achieved (returns true or false)
cobre report my_first_study/results | jq '.training.convergence.achieved'
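The same checks can be scripted in Python when jq is not available. This is a minimal sketch, assuming `cobre` is on the PATH and that `cobre report` prints the JSON shape shown earlier. `load_report` and `ci_exit_code` are hypothetical helper names, and the `sample` dictionary is a stand-in used so that the snippet runs without a results directory.

```python
import json
import subprocess


def load_report(results_dir: str) -> dict:
    """Run `cobre report <results_dir>` and parse the JSON it prints to stdout."""
    completed = subprocess.run(
        ["cobre", "report", results_dir],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command itself fails
    )
    return json.loads(completed.stdout)


def ci_exit_code(report: dict) -> int:
    """Return 0 if the run completed, 1 otherwise (mirrors the shell check above)."""
    return 0 if report.get("status") == "complete" else 1


# Stand-in report with the documented top-level "status" field.
# In a pipeline, use load_report("my_first_study/results") instead.
sample = {"status": "complete", "training": {"convergence": {"achieved": False}}}
print(ci_exit_code(sample))  # prints 0
```

In a CI job, pass the result of `ci_exit_code` to `sys.exit` so the pipeline step fails when the run did not complete.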
For the complete cobre report documentation and all available JSON fields,
see CLI Reference.
For a detailed description of every field in every output file, see Output Format Reference.
What’s Next
You have now seen how to run a study and interpret its output. The next page collects pointers to everything you need to go further:
- Next Steps: further reading, guides, and community links
- CLI Reference: all flags, subcommands, and exit codes
- Configuration: every config.json field documented