Cost Estimation Mapping Across PostgreSQL and MySQL

This stage of the baseline pipeline translates the incompatible cost vocabularies of PostgreSQL and MySQL into a single dimensionless unit so every downstream comparison operates on mathematically aligned numbers rather than raw, engine-biased optimizer output.

PostgreSQL reports costs in abstract planner units derived from seq_page_cost, cpu_tuple_cost, and related GUCs, while MySQL’s optimizer expresses cost as a composite of row estimates, I/O operations, and CPU cycles emitted under cost_info in EXPLAIN FORMAT=JSON. A total_cost of 4200 in PostgreSQL and a query_cost of 4200 in MySQL describe entirely different physical work. The Cost Normalization and Calibration stage exists to bridge that semantic gap: it performs deterministic translation of engine-specific cost vectors into a comparable metric space, operating strictly downstream of plan capture and upstream of structural hashing. It is a component of the Core Architecture & Baselining Fundamentals reference architecture, and it is the layer that lets regression detection reason about drift without engine-specific branching.
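
To make the mismatch concrete, the sketch below pulls the headline cost scalar out of each engine's JSON plan. It assumes the usual document shapes: PostgreSQL's EXPLAIN (FORMAT JSON) returns a one-element array whose "Plan" object carries "Total Cost", and MySQL's EXPLAIN FORMAT=JSON nests query_cost under query_block.cost_info as a string. The function name extract_raw_cost and both sample documents are illustrative, not part of the pipeline.

```python
import json


def extract_raw_cost(engine: str, explain_json: str) -> float:
    """Return the engine's headline cost scalar from a JSON plan document."""
    doc = json.loads(explain_json)
    if engine == "postgresql":
        # EXPLAIN (FORMAT JSON) yields a one-element list wrapping the root plan node.
        return float(doc[0]["Plan"]["Total Cost"])
    if engine == "mysql":
        # MySQL serializes query_cost as a string, for example "4200.00".
        return float(doc["query_block"]["cost_info"]["query_cost"])
    raise ValueError(f"unsupported engine: {engine}")


# Hand-written sample documents, trimmed to the fields used here.
pg_doc = '[{"Plan": {"Node Type": "Seq Scan", "Startup Cost": 0.0, "Total Cost": 4200.0, "Plan Rows": 1000}}]'
my_doc = '{"query_block": {"select_id": 1, "cost_info": {"query_cost": "4200.00"}}}'

pg_cost = extract_raw_cost("postgresql", pg_doc)
my_cost = extract_raw_cost("mysql", my_doc)
print(pg_cost, my_cost)
```

Both calls return 4200.0, yet the two numbers are expressed in unrelated units, which is the gap the normalization stage closes.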

Architectural boundaries

Strict isolation is mandatory for deterministic baselining. This stage consumes normalized plan trees produced upstream by the capture layer — specifically the output of normalizing query plans for cross-engine comparison — and emits dimensionless cost vectors that feed the plan hashing algorithms for SQL engines module. It never computes hashes, never evaluates thresholds, and never triggers CI/CD gates; those responsibilities belong to distinct stages so that a fault in cost math cannot corrupt fingerprinting or gate verdicts.

The contract at each boundary is explicit:

Ingress: structured JSON payloads carrying an EXPLAIN/EXPLAIN ANALYZE tree, an engine identifier, an engine version, and a schema version. Payloads missing the total_cost or estimated_rows field, or lacking version metadata, are rejected synchronously before any math runs.
Processing: engine-specific normalization functions execute in a pure, stateless context. Side effects (metrics, traces) are decoupled through async emission so the hot path stays deterministic.
Egress: a standardized cost vector routes to the hashing queue. On failure the payload routes to a dead-letter queue (DLQ) with an explicit error code — ERR_MISSING_STATS, ERR_COST_OVERFLOW, or ERR_VERSION_DRIFT — never a silent fallback to stale coefficients.
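
The egress rule can be sketched as a small helper that stamps a rejected payload with one of the three error codes before it is published to the DLQ. The helper name build_dlq_record and the error_detail field are assumptions for illustration; only the error codes and the top-level error_code field come from this document.

```python
import json
from typing import Any

DLQ_ERROR_CODES = {"ERR_MISSING_STATS", "ERR_COST_OVERFLOW", "ERR_VERSION_DRIFT"}


def build_dlq_record(payload: dict[str, Any], error_code: str, detail: str) -> bytes:
    """Serialize a rejected payload with its explicit error code for the DLQ."""
    if error_code not in DLQ_ERROR_CODES:
        # Refuse to invent new codes at the call site; the contract lists exactly three.
        raise ValueError(f"unknown DLQ error code: {error_code}")
    # Keep the payload fields at the top level so consumers can filter on error_code
    # and still read engine, total_cost, and the other original fields directly.
    record = {**payload, "error_code": error_code, "error_detail": detail}
    # sort_keys keeps the serialized bytes stable for identical inputs.
    return json.dumps(record, sort_keys=True).encode("utf-8")
```

Rejecting unknown codes in the helper keeps the DLQ vocabulary closed, so alert rules keyed on the reason label do not miss a misspelled code.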

Because the emitted unit is engine-agnostic, identical logical plans executed on PostgreSQL and MySQL produce comparable baseline signatures, and the downstream regression threshold logic can apply uniform multipliers without knowing which engine produced a given plan.

Deterministic routing and schema enforcement

Every payload is validated against a strict field contract before it is admitted. The canonical JSON Schema pins the accepted engines, forbids unknown fields, and requires the version metadata that calibration lookups depend on:

JSON

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "CostNormalizationIngress",
  "type": "object",
  "additionalProperties": false,
  "required": ["engine", "engine_version", "schema_version", "total_cost", "estimated_rows"],
  "properties": {
    "engine": { "enum": ["postgresql", "mysql"] },
    "engine_version": { "type": "string", "pattern": "^\\d+\\.\\d+(\\.\\d+)?$" },
    "schema_version": { "type": "string", "pattern": "^v\\d+$" },
    "total_cost": { "type": "number", "exclusiveMinimum": 0 },
    "estimated_rows": { "type": "integer", "minimum": 0 },
    "startup_cost": { "type": "number", "minimum": 0 }
  }
}
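
As a reading aid, the same contract can be hand-checked with the standard library alone. This is a minimal sketch that mirrors the schema above; ingress_errors is a hypothetical helper, and a production worker would more likely use a JSON Schema validator or the pydantic model shown later in this section.

```python
import re

ENGINE_VERSION_RE = re.compile(r"\d+\.\d+(\.\d+)?")
SCHEMA_VERSION_RE = re.compile(r"v\d+")
REQUIRED = {"engine", "engine_version", "schema_version", "total_cost", "estimated_rows"}
ALLOWED = REQUIRED | {"startup_cost"}


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def ingress_errors(payload: dict) -> list[str]:
    """Return contract violations; an empty list means the payload is admitted."""
    errors = []
    missing = REQUIRED - set(payload)
    unknown = set(payload) - ALLOWED
    if missing:
        errors.append(f"missing fields: {sorted(missing)}")
    if unknown:
        errors.append(f"unknown fields: {sorted(unknown)}")
    if payload.get("engine") not in ("postgresql", "mysql"):
        errors.append("engine must be postgresql or mysql")
    # fullmatch anchors both ends, matching the ^...$ anchors in the schema.
    if not ENGINE_VERSION_RE.fullmatch(str(payload.get("engine_version", ""))):
        errors.append("engine_version must look like 16.2 or 8.0.36")
    if not SCHEMA_VERSION_RE.fullmatch(str(payload.get("schema_version", ""))):
        errors.append("schema_version must look like v3")
    cost = payload.get("total_cost")
    if not _is_number(cost) or cost <= 0:
        errors.append("total_cost must be a number greater than 0")
    rows = payload.get("estimated_rows")
    if not isinstance(rows, int) or isinstance(rows, bool) or rows < 0:
        errors.append("estimated_rows must be a non-negative integer")
    return errors
```

Note that a bare major version such as "16" fails the engine_version pattern, because the schema requires at least major.minor.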

Routing is deterministic and keyed so that all plans sharing a calibration profile land on the same partition. The partition key is a composite of engine and the major.minor version, hashed into a fixed ring:

partition = crc32(f"{engine}:{major}.{minor}") % PARTITION_COUNT

Pinning on major.minor, not the full patch string, keeps a plan on a stable partition across patch upgrades of a three-part version (MySQL 8.0.35 → 8.0.36) while still isolating a genuine cost-model change at a major bump (PostgreSQL 16.x → 17.x). PostgreSQL 10 and later use two-part versions, so 16.2 and 16.3 produce distinct keys under this scheme and each needs its own published coefficient set. Calibration coefficients are versioned by the same key and pulled from a signed configuration store at worker initialization, so a coefficient set can never drift silently underneath a running plan.
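
The partition expression above is pseudocode; a runnable sketch under the same rule looks like this. It assumes CRC-32 as implemented by zlib.crc32 and the default ring size of 64 from the configuration reference; the function names are illustrative.

```python
import zlib

PARTITION_COUNT = 64  # assumed to match the COST_NORM_PARTITION_COUNT default


def calibration_key(engine: str, engine_version: str) -> str:
    """Build the engine:major.minor key, dropping any third (patch) component."""
    parts = engine_version.split(".")
    major = parts[0]
    minor = parts[1] if len(parts) > 1 else "0"
    return f"{engine}:{major}.{minor}"


def partition_for(engine: str, engine_version: str, partitions: int = PARTITION_COUNT) -> int:
    """Map a plan to a partition so that plans sharing a key always co-locate."""
    key = calibration_key(engine, engine_version)
    # zlib.crc32 takes bytes and returns an unsigned 32-bit integer on Python 3.
    return zlib.crc32(key.encode("utf-8")) % partitions


# Patch upgrades of a three-part version keep the same key and partition.
assert calibration_key("mysql", "8.0.35") == calibration_key("mysql", "8.0.36")
assert partition_for("mysql", "8.0.35") == partition_for("mysql", "8.0.36")
```

Because only the first two components enter the key, a two-part version such as 16.2 keeps both of its components, so 16.2 and 16.3 map to different keys.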

The normalization itself maps both engines onto a single dimensionless baseline unit (DBU). PostgreSQL uses a linear model where total_cost = startup_cost + run_cost, scaled by hardware-derived constants; MySQL uses a row-driven model dominated by index selectivity and join buffering, with the scalar sourced from query_block.cost_info.query_cost:

PostgreSQL: DBU = (total_cost / baseline_seq_cost) * (1 + cpu_penalty_factor)
MySQL: DBU = (query_cost / baseline_io_cost) * selectivity * row_selectivity_weight, where selectivity = min(estimated_rows / 1,000,000, 1.0)

The scaling factors derive from empirical calibration against known workload profiles and are stored alongside schema metadata. How the resulting DBU correlates with observed wall-clock time is covered in depth in mapping EXPLAIN costs to real-world latency metrics.
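
A worked example helps show how two identical raw costs diverge once normalized. The arithmetic below follows the worker code later in this section, where the MySQL term also includes a selectivity factor capped at 1.0. Every coefficient value here is made up for illustration; real values come from the signed calibration store.

```python
def dbu_postgresql(total_cost: float, baseline_seq_cost: float, cpu_penalty_factor: float) -> float:
    """Linear PostgreSQL mapping: scale by the baseline, then apply the CPU penalty."""
    return (total_cost / baseline_seq_cost) * (1 + cpu_penalty_factor)


def dbu_mysql(
    query_cost: float,
    estimated_rows: int,
    baseline_io_cost: float,
    row_selectivity_weight: float,
) -> float:
    """Row-driven MySQL mapping with selectivity capped at 1.0."""
    selectivity = min(estimated_rows / 1_000_000, 1.0)
    return (query_cost / baseline_io_cost) * selectivity * row_selectivity_weight


# Hypothetical coefficients, chosen only to keep the arithmetic easy to follow.
pg_dbu = dbu_postgresql(4200.0, baseline_seq_cost=100.0, cpu_penalty_factor=0.05)
my_dbu = dbu_mysql(4200.0, estimated_rows=500_000, baseline_io_cost=250.0, row_selectivity_weight=1.2)

# (4200 / 100) * 1.05 is approximately 44.1
# (4200 / 250) * 0.5 * 1.2 is approximately 10.08
print(round(pg_dbu, 4), round(my_dbu, 4))
```

With these assumed coefficients, a raw cost of 4200 lands at roughly 44.1 DBU on PostgreSQL and roughly 10.08 DBU on MySQL, which is why raw costs cannot be compared directly.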

Production-ready implementation

The normalization service runs as an asyncio worker. It fetches versioned coefficients through a pooled asyncpg connection to the signed config store, validates every payload, emits OpenTelemetry spans and metrics, and logs structured events through structlog. All math is bounded to prevent floating-point overflow, and version-lookup failures raise immediately rather than falling back to stale values.

PYTHON

import asyncio
from dataclasses import dataclass
from typing import Literal, Optional

import asyncpg
import structlog
from opentelemetry import metrics, trace
from pydantic import BaseModel, Field, ValidationError

log = structlog.get_logger(__name__)
tracer = trace.get_tracer(__name__)
meter = metrics.get_meter(__name__)

normalization_duration = meter.create_histogram(
    "db.cost_normalization.duration_ms",
    description="Wall-clock time to normalize one plan's engine costs",
    unit="ms",
)
normalization_failures = meter.create_counter(
    "db.cost_normalization.failures_total",
    description="Total normalization routing failures by reason",
)

DBU_OVERFLOW_CEILING = 1_000_000_000.0  # 1e9: any DBU above this is a planner anomaly


@dataclass(frozen=True)
class CalibrationCoefficients:
    baseline_seq_cost: float
    cpu_penalty_factor: float
    baseline_io_cost: float
    row_selectivity_weight: float


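class PlanPayload(BaseModel):
    # Mirror the JSON Schema: reject unknown fields and malformed version strings.
    model_config = {"extra": "forbid"}

    engine: Literal["postgresql", "mysql"]
    engine_version: str = Field(pattern=r"^\d+\.\d+(\.\d+)?$")
    schema_version: str = Field(pattern=r"^v\d+$")
    total_cost: float = Field(gt=0)
    estimated_rows: int = Field(ge=0)
    startup_cost: Optional[float] = Field(default=None, ge=0)

    @property
    def calibration_key(self) -> str:
        # The pattern guarantees at least "major.minor"; any patch part is dropped.
        major, minor = self.engine_version.split(".")[:2]
        return f"{self.engine}:{major}.{minor}"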


class NormalizationError(Exception):
    def __init__(self, code: str, message: str) -> None:
        self.code = code
        super().__init__(f"{code}: {message}")


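async def load_calibration(
    pool: asyncpg.Pool, key: str
) -> CalibrationCoefficients:
    """Fetch versioned coefficients from the signed config store. Fails fast."""
    try:
        row = await pool.fetchrow(
            """
            SELECT baseline_seq_cost, cpu_penalty_factor,
                   baseline_io_cost, row_selectivity_weight
            FROM calibration_coefficients
            WHERE calibration_key = $1 AND signature_valid = true
            """,
            key,
        )
    except (asyncpg.PostgresError, asyncpg.InterfaceError, OSError, asyncio.TimeoutError) as exc:
        # Store errors and timeouts go to the DLQ with a code, never to default coefficients.
        raise NormalizationError(
            "ERR_VERSION_DRIFT", f"config store unavailable for {key}"
        ) from exc
    if row is None:
        raise NormalizationError("ERR_VERSION_DRIFT", f"no calibration for {key}")
    return CalibrationCoefficients(**dict(row))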


def _normalize(payload: PlanPayload, coeffs: CalibrationCoefficients) -> float:
    if payload.engine == "postgresql":
        dbu = (payload.total_cost / coeffs.baseline_seq_cost) * (
            1 + coeffs.cpu_penalty_factor
        )
    else:  # mysql — validated by the Literal type, so no unreachable branch
        if payload.estimated_rows == 0:
            raise NormalizationError("ERR_MISSING_STATS", "zero row estimate")
        selectivity = min(payload.estimated_rows / 1_000_000, 1.0)
        dbu = (payload.total_cost / coeffs.baseline_io_cost) * (
            selectivity * coeffs.row_selectivity_weight
        )

    if dbu > DBU_OVERFLOW_CEILING:
        raise NormalizationError("ERR_COST_OVERFLOW", f"dbu={dbu:.1f} exceeds ceiling")
    return round(dbu, 4)


async def normalize_plan(pool: asyncpg.Pool, raw: dict) -> float:
    with tracer.start_as_current_span("normalize_cost") as span:
        loop = asyncio.get_running_loop()
        started = loop.time()
        try:
            payload = PlanPayload.model_validate(raw)
        except ValidationError as exc:
            # Contract violations are counted and re-raised with a DLQ code; chain the cause.
            normalization_failures.add(1, {"reason": "ERR_MISSING_STATS"})
            log.warning("ingress_rejected", errors=exc.error_count())
            raise NormalizationError(
                "ERR_MISSING_STATS", "schema validation failed"
            ) from exc

        span.set_attribute("db.engine", payload.engine)
        span.set_attribute("db.engine_version", payload.engine_version)

        try:
            coeffs = await load_calibration(pool, payload.calibration_key)
            dbu = _normalize(payload, coeffs)
        except NormalizationError as err:
            normalization_failures.add(1, {"reason": err.code})
            span.set_attribute("error.code", err.code)
            log.error("normalization_failed", code=err.code, key=payload.calibration_key)
            raise
        finally:
            elapsed_ms = (loop.time() - started) * 1000
            normalization_duration.record(elapsed_ms, {"engine": payload.engine})

        log.info("normalized", key=payload.calibration_key, dbu=dbu)
        return dbu

The worker enforces type boundaries at ingress, records latency for every validated payload even when normalization fails (via finally), and turns every distinct fault into a labelled counter increment plus a DLQ-bound NormalizationError. Payloads rejected at ingress are counted but not timed, because the rejection is raised before the try/finally block that records duration. Because _normalize is pure and synchronous, it is trivially unit-testable against golden PostgreSQL and MySQL fixtures without a live database.

Threshold table

These SLOs govern the normalization stage in production. They are per-worker unless noted, measured over a rolling 5-minute window.

Metric	Pass	Warn	Block / page
`db.cost_normalization.duration_ms` p95	≤ 15 ms	> 15 ms	> 40 ms
`db.cost_normalization.duration_ms` p99	≤ 30 ms	> 30 ms	> 75 ms
Ingress rejection rate	≤ 0.5%	> 0.5%	> 2.0%
DLQ routing rate (all `ERR_*`)	≤ 0.1%	> 0.1%	> 1.0%
`ERR_VERSION_DRIFT` count	0	≥ 1	≥ 10 in 5 min
Calibration fetch p95	≤ 5 ms	> 5 ms	> 20 ms
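
For the rows that use strict greater-than bounds, the bands reduce to a two-threshold classifier. The sketch below encodes those rows; the Band class and the dictionary keys are illustrative names, and the ERR_VERSION_DRIFT row is left out because it uses count-based bounds (≥ 1, ≥ 10).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    warn_above: float   # values above this leave the Pass band
    block_above: float  # values above this enter the Block / page band

    def classify(self, value: float) -> str:
        # Check the stricter bound first so a blocking value is never reported as a warning.
        if value > self.block_above:
            return "block"
        if value > self.warn_above:
            return "warn"
        return "pass"


# Thresholds copied from the table above; latencies in ms, rates in percent.
BANDS = {
    "duration_ms_p95": Band(15.0, 40.0),
    "duration_ms_p99": Band(30.0, 75.0),
    "ingress_rejection_pct": Band(0.5, 2.0),
    "dlq_routing_pct": Band(0.1, 1.0),
    "calibration_fetch_ms_p95": Band(5.0, 20.0),
}

print(BANDS["duration_ms_p95"].classify(22.0))  # warn
```

A value exactly on a boundary stays in the lower band, matching the ≤ and > symbols in the table.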

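The alerting rules bind those bands to the emitted OpenTelemetry metrics (the duration histogram and the failure counter), whose dotted names appear with underscores once exported to Prometheus. A representative Prometheus rule group: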

YAML

groups:
  - name: cost_normalization
    rules:
      - alert: CostNormalizationLatencyHigh
        expr: histogram_quantile(0.95, sum(rate(db_cost_normalization_duration_ms_bucket[5m])) by (le)) > 40
        for: 10m
        labels: { severity: page }
        annotations:
          summary: "Cost normalization p95 above 40ms block band"
      - alert: CalibrationVersionDrift
        expr: increase(db_cost_normalization_failures_total{reason="ERR_VERSION_DRIFT"}[5m]) >= 10
        for: 0m
        labels: { severity: page }
        annotations:
          summary: "Missing calibration coefficients — plans routing to DLQ"
      - alert: CostOverflowSpike
        expr: increase(db_cost_normalization_failures_total{reason="ERR_COST_OVERFLOW"}[5m]) > 5
        for: 5m
        labels: { severity: ticket }
        annotations:
          summary: "Planner anomalies producing DBU above 1e9"

Failure scenarios and root cause analysis

1. Silent calibration drift after a major upgrade. A 16.x → 17.x bump changes PostgreSQL’s cost model, but no coefficient set is published for the new major.minor key. Symptom: a spike in ERR_VERSION_DRIFT and a flood of DLQ payloads. Diagnose by confirming that no signed row exists for the new key in the store:

SQL

SELECT calibration_key, signature_valid
FROM calibration_coefficients
WHERE calibration_key LIKE 'postgresql:17.%';

Mitigation: publish and sign the postgresql:17.0 coefficient set before draining the DLQ; never patch by copying 16.x values forward, as that reintroduces the drift the stage exists to prevent.

2. Zero-row MySQL estimates from stale table statistics. When information_schema statistics are stale, MySQL emits rows: 0 and the selectivity term collapses. Symptom: ERR_MISSING_STATS concentrated on one engine. Diagnose on the source instance:

SQL

SELECT table_name, update_time
FROM information_schema.tables
WHERE table_schema = 'app' AND update_time < NOW() - INTERVAL 7 DAY;

Mitigation: run ANALYZE TABLE on the affected tables upstream in the capture stage so downstream normalization receives non-zero cardinalities.

3. Cost overflow from a runaway cross join. A missing join predicate yields an astronomical total_cost, tripping the 1e9 ceiling. Symptom: ERR_COST_OVERFLOW for a specific query fingerprint. Diagnose by inspecting the offending plan tree in the DLQ:

BASH

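# KAFKA_BOOTSTRAP_SERVERS is a placeholder for your broker list, e.g. host:9092.
kafka-console-consumer --bootstrap-server "$KAFKA_BOOTSTRAP_SERVERS" \
  --topic cost_dlq --max-messages 1 \
  | jq 'select(.error_code=="ERR_COST_OVERFLOW") | {fingerprint, engine, total_cost}'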

Mitigation: quarantine the fingerprint, alert the query owner, and keep the payload out of the baseline so the anomaly cannot skew downstream cost-delta tracking across baseline versions.

4. Config-store saturation during mass migration. A fleet-wide schema migration fans out thousands of new major.minor lookups at once. Symptom: rising calibration-fetch p95 and connection-pool exhaustion on the store. Diagnose with pool telemetry:

BASH

otel-cli span list --service cost-normalizer \
  --filter 'name="load_calibration" AND duration_ms>20' --last 5m

Mitigation: the circuit breaker (below) opens per calibration key, routing excess payloads to a retry queue with exponential backoff instead of hammering the store.
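
The retry schedule can be sketched as capped exponential backoff. The base delay, the cap, and the use of full jitter are assumptions chosen for illustration; this document specifies only that the retry queue backs off exponentially.

```python
import random
from typing import Optional


def backoff_ceilings(attempts: int, base_s: float = 0.5, cap_s: float = 30.0) -> list[float]:
    """Upper bound on the wait before each retry: base * 2**n, capped at cap_s."""
    return [min(cap_s, base_s * (2 ** n)) for n in range(attempts)]


def backoff_delays(
    attempts: int,
    base_s: float = 0.5,
    cap_s: float = 30.0,
    rng: Optional[random.Random] = None,
) -> list[float]:
    """Full-jitter delays: uniform in [0, ceiling] so retries do not synchronize."""
    rng = rng or random.Random()
    return [rng.uniform(0.0, c) for c in backoff_ceilings(attempts, base_s, cap_s)]


print(backoff_ceilings(8))
# [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Jitter matters here because a fleet-wide migration makes many workers fail at the same moment; without it, their retries would hit the store in synchronized waves.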

Configuration reference

Key tuning knobs, supplied as environment variables at worker start:

Variable	Default	Purpose
`COST_NORM_PARTITION_COUNT`	`64`	Ring size for the engine:major.minor partition key
`COST_NORM_DBU_CEILING`	`1000000000`	Overflow guard; DBU above this quarantines the plan
`COST_NORM_CONFIG_POOL_MIN`	`4`	`asyncpg` min connections to the signed config store
`COST_NORM_CONFIG_POOL_MAX`	`16`	`asyncpg` max connections to the config store
`COST_NORM_CALIBRATION_TTL_S`	`300`	In-worker cache TTL for signed coefficient sets
`COST_NORM_BREAKER_RATE`	`500`	Token-bucket refill rate (lookups/sec/key) before the breaker opens
`COST_NORM_DLQ_TOPIC`	`cost_dlq`	Kafka topic for failed payloads
`COST_NORM_EGRESS_TOPIC`	`cost_normalized`	Kafka topic for successful DBU vectors
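
One way to read these variables at worker start is a frozen settings object built once from the environment. The Settings class and load_settings function are hypothetical names; the variable names and defaults are the ones in the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    partition_count: int
    dbu_ceiling: float
    config_pool_min: int
    config_pool_max: int
    calibration_ttl_s: int
    breaker_rate: int
    dlq_topic: str
    egress_topic: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Parse the COST_NORM_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    settings = Settings(
        partition_count=int(env.get("COST_NORM_PARTITION_COUNT", "64")),
        dbu_ceiling=float(env.get("COST_NORM_DBU_CEILING", "1000000000")),
        config_pool_min=int(env.get("COST_NORM_CONFIG_POOL_MIN", "4")),
        config_pool_max=int(env.get("COST_NORM_CONFIG_POOL_MAX", "16")),
        calibration_ttl_s=int(env.get("COST_NORM_CALIBRATION_TTL_S", "300")),
        breaker_rate=int(env.get("COST_NORM_BREAKER_RATE", "500")),
        dlq_topic=env.get("COST_NORM_DLQ_TOPIC", "cost_dlq"),
        egress_topic=env.get("COST_NORM_EGRESS_TOPIC", "cost_normalized"),
    )
    # Fail at startup rather than at the first pool acquisition.
    if settings.config_pool_min > settings.config_pool_max:
        raise ValueError("COST_NORM_CONFIG_POOL_MIN exceeds COST_NORM_CONFIG_POOL_MAX")
    if settings.partition_count <= 0:
        raise ValueError("COST_NORM_PARTITION_COUNT must be positive")
    return settings
```

Passing a plain dictionary as env makes the parser easy to exercise in tests without touching the process environment.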

Two safe-fallback protocols are non-negotiable. First, on a connection error or timeout from the config store the worker routes to the DLQ with ERR_VERSION_DRIFT rather than defaulting to hardcoded coefficients, because baseline integrity is prioritized over throughput. Second, the token-bucket breaker keyed by calibration key protects the store during migrations, deferring excess load to a retry queue instead of dropping it. Authoritative field definitions for the raw inputs live in the PostgreSQL EXPLAIN documentation and the MySQL EXPLAIN output format.
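
The second protocol can be sketched as one token bucket per calibration key. This is a minimal single-threaded illustration, not the production breaker: the class names are hypothetical, the bucket capacity is assumed equal to the refill rate, and the clock is injectable so the behavior can be tested deterministically.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock: Callable[[], float]) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._last = clock()

    def try_acquire(self) -> bool:
        now = self._clock()
        # Credit tokens for the elapsed time, never exceeding capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class KeyedBreaker:
    """One bucket per calibration key, so a hot key cannot starve the others."""

    def __init__(self, rate: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, calibration_key: str) -> bool:
        bucket = self._buckets.get(calibration_key)
        if bucket is None:
            bucket = TokenBucket(self._rate, self._rate, self._clock)
            self._buckets[calibration_key] = bucket
        # False means the caller should defer the payload to the retry queue.
        return bucket.try_acquire()
```

A caller would check allow(key) before each store lookup and enqueue the payload for retry when it returns False, which defers the load rather than dropping it.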

Related

Mapping EXPLAIN Costs to Real-World Latency Metrics: calibrating DBU against observed wall-clock time.
Plan Hashing Algorithms for SQL Engines: the downstream consumer of normalized cost vectors.
Defining Regression Thresholds for Query Plans: how aligned DBU values feed statistical gates.
Normalizing Query Plans for Cross-Engine Comparison: the upstream capture step this stage consumes.
Tracking Cost Deltas Across Baseline Versions: where normalized costs become regression signals.
