Routing EXPLAIN ANALYZE Output to Centralized Logs

Routing EXPLAIN ANALYZE output to centralized logs is the transport stage that moves fully materialized plan payloads from capture agents into durable, queryable logging infrastructure without touching query execution paths or performing any analytical work. This stage is deliberately narrow: it consumes signed plan envelopes from upstream collectors and emits schema-validated, correctly partitioned JSON events to a centralized log store. It does not parse execution trees, compute cost metrics, or raise regression alerts; those responsibilities belong to downstream stages. For platform teams running multi-tenant database fleets, a correctly scoped router delivers every admitted envelope or accounts for it in a drop or dead-letter counter, enforces a strict envelope contract, and stays isolated from the database host, so a logging outage does not degrade production latency.

Architectural Boundaries

The router sits between plan capture and storage in the Automated EXPLAIN Capture & Storage Workflows pipeline. It has exactly one upstream dependency and one downstream contract, and nothing flows backward across either edge.

Upstream (consumed): the router accepts only fully materialized EXPLAIN ANALYZE envelopes produced by the low-overhead capture layer described in Capturing EXPLAIN Plans Without Impacting Production Performance. Each envelope is already normalized and fingerprinted upstream; the router treats it as an immutable telemetry object and never re-derives the plan hash. When capture runs at fleet scale, envelopes arrive over the async transport detailed in Building Async Ingestion Pipelines for High-Throughput Queries.

Downstream (emitted): the router writes structured events to a centralized log index (Elasticsearch/OpenSearch, Loki, or a Kafka topic feeding object storage). Storage and analytical consumers — including the normalization engine in Normalizing Query Plans for Cross-Engine Comparison and the scorers in Regression Detection & Rule Engines — subscribe exclusively to the routed output stream. A malformed envelope must never reach them, so validation happens synchronously at the ingress boundary and failures divert to a dead-letter queue rather than mutating the payload in place.

The isolation boundary is the whole point: the router runs out-of-process from the database, buffers under backpressure, and degrades to local disk, so transport failures stay contained on the router and never cascade into either storage corruption or database latency.
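The "degrades to local disk" half of that boundary could look roughly like the sketch below. It is an assumption-laden illustration rather than the router's actual spill path: the function names `spill_batch` and `read_spilled`, the one-file-per-batch layout, and the `batch-*.jsonl` naming are all hypothetical. The only technique it relies on is writing to a temporary file in the spool directory and renaming it into place, so a replay process never sees a half-written batch.

```python
import json
import os
import tempfile
import uuid
from pathlib import Path


def spill_batch(batch: list[dict], spool_dir: Path) -> Path:
    """Persist one batch as JSONL; the file only becomes visible once complete."""
    spool_dir.mkdir(parents=True, exist_ok=True)
    # Temp file lives in the same directory so os.replace stays an atomic rename.
    fd, tmp_name = tempfile.mkstemp(dir=spool_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        for event in batch:
            fh.write(json.dumps(event) + "\n")
        fh.flush()
        os.fsync(fh.fileno())
    final = spool_dir / f"batch-{uuid.uuid4().hex}.jsonl"  # hypothetical naming
    os.replace(tmp_name, final)
    return final


def read_spilled(spool_dir: Path) -> list[dict]:
    """Load every completed batch file, i.e. what a replay watcher would resend."""
    events: list[dict] = []
    for path in sorted(spool_dir.glob("batch-*.jsonl")):
        with path.open(encoding="utf-8") as fh:
            events.extend(json.loads(line) for line in fh if line.strip())
    return events
```

The sketch is synchronous for brevity; inside an async router the file I/O would probably be pushed off the event loop, for example with `asyncio.to_thread`.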

Deterministic Routing and Schema Enforcement

Routing decisions must be stateless, reproducible, and driven by explicit envelope metadata. Dynamic routing on runtime heuristics introduces non-determinism that makes incident reconstruction impossible, so the dispatcher evaluates only static attributes present in the payload at ingestion time.

Envelope contract

Every event is validated against a fixed JSON Schema at the ingress boundary. The contract is shared with the metadata validation stage in Schema Validation for Baseline Metadata, so a change to required fields is a coordinated, versioned event rather than a silent drift.

JSON

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "ExplainEnvelope",
  "type": "object",
  "required": ["query_hash", "execution_timestamp", "cluster_id",
               "plan_text", "actual_rows", "total_time_ms"],
  "additionalProperties": false,
  "properties": {
    "query_hash":        { "type": "string", "pattern": "^[a-f0-9]{64}$" },
    "execution_timestamp": { "type": "number", "minimum": 0 },
    "cluster_id":        { "type": "string", "minLength": 1 },
    "region":            { "type": "string", "minLength": 1 },
    "query_type":        { "type": "string", "enum": ["DDL", "DML", "SELECT", "OTHER"] },
    "plan_text":         { "type": "string", "minLength": 1 },
    "actual_rows":       { "type": "integer", "minimum": 0 },
    "total_time_ms":     { "type": "number", "minimum": 0 }
  }
}

The query_hash is the 64-character SHA-256 fingerprint produced by the capture layer using the plan hashing approach; the router validates its shape but never recomputes it.

Routing dimensions

The dispatcher evaluates a fixed precedence chain over static envelope attributes:

Query classification: DDL payloads route to a separate index with its own retention policy, keeping schema-change plans out of the hot query workload stream.
Latency and cardinality thresholds: envelopes exceeding a critical execution window or row count are tagged critical and routed to a high-priority index for expedited delivery to SRE channels. These thresholds share their numeric definitions with Defining Regression Thresholds for Query Plans so a plan that trips the router’s critical band is also flagged downstream.
Cluster topology / data residency: region selects a region-scoped index so plan text — which can contain sensitive literals — stays within its jurisdiction, complementing the controls in Security Boundaries for Baseline Data Storage.
Sampling rate control: deterministic modulo hashing on query_hash caps ingestion volume during traffic spikes.
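The first three dimensions reduce to a small pure function over the envelope. The sketch below is illustrative only: it works on a plain dict, the default thresholds are copied from the configuration defaults, and the `{region}-{target}` index naming is an assumption rather than the pipeline's actual convention.

```python
def route(envelope: dict, critical_time_ms: float = 5000.0,
          critical_rows: int = 1_000_000) -> str:
    """Return a region-scoped index name for one envelope (hypothetical naming)."""
    # 1. Query classification wins first: DDL never enters the hot stream.
    if envelope.get("query_type") == "DDL":
        target = "ddl-plans"
    # 2. Latency and cardinality thresholds tag the plan as critical.
    elif (envelope["total_time_ms"] > critical_time_ms
          or envelope["actual_rows"] > critical_rows):
        target = "critical-plans"
    else:
        target = "baseline-plans"
    # 3. Data residency: scope the index by region so plan text stays local.
    region = envelope.get("region", "default")
    return f"{region}-{target}"


slow = {"query_type": "SELECT", "total_time_ms": 7200.0,
        "actual_rows": 10, "region": "eu-west-1"}
print(route(slow))  # eu-west-1-critical-plans
```

Because the function reads nothing but the envelope and two static thresholds, replaying the same payload during incident reconstruction yields the same index.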

Partition key and sampling formulas

Both formulas are pure functions of the envelope, so identical plans always route identically across process restarts — a prerequisite for reproducible incident reconstruction.

partition_key = f"{region}:{route_target}:{utc_date(execution_timestamp)}"
sample_admit  = int(sha256(query_hash), 16) % 100 < sample_rate_pct

Deterministic hashing on query_hash (rather than a random draw) guarantees that once a query is sampled in, every subsequent execution of the same fingerprint is also admitted for as long as sample_rate_pct is not lowered, keeping per-query plan history contiguous instead of riddled with sampling gaps.
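As a worked example, both formulas can be written as standalone functions using only the standard library. The fingerprint and region values here are made up for illustration.

```python
import hashlib
from datetime import datetime, timezone


def partition_key(region: str, route_target: str,
                  execution_timestamp: float) -> str:
    # The UTC date comes from the envelope, never from ingestion wall-clock time.
    day = datetime.fromtimestamp(execution_timestamp, tz=timezone.utc)
    return f"{region}:{route_target}:{day:%Y-%m-%d}"


def sample_admit(query_hash: str, sample_rate_pct: int) -> bool:
    # Bucket the fingerprint into 0..99 and admit buckets below the rate.
    bucket = int(hashlib.sha256(query_hash.encode()).hexdigest(), 16) % 100
    return bucket < sample_rate_pct


query_hash = "ab" * 32  # hypothetical 64-character fingerprint
print(partition_key("eu-west-1", "baseline-plans", 1_700_000_000))
# eu-west-1:baseline-plans:2023-11-14
print(sample_admit(query_hash, 100))  # True: a rate of 100 admits every bucket
```

Calling `sample_admit` twice with the same fingerprint and rate always returns the same answer, which is the property the routing stage depends on.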

Production-Ready Implementation

The following router is async end-to-end. It validates against the envelope contract, applies deterministic sampling, evaluates the route target, batches events, ships them over a bounded connection pool, and degrades to local disk behind a circuit breaker. It uses structlog for structured logs and OpenTelemetry for spans and metrics, so every routing decision is observable in the same telemetry backend the plans themselves flow into.

PYTHON

import asyncio
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional

import aiohttp
import structlog
from opentelemetry import metrics, trace
from pydantic import BaseModel, Field, ValidationError

log = structlog.get_logger("explain_router")
tracer = trace.get_tracer("explain_router")
meter = metrics.get_meter("explain_router")

INGESTED = meter.create_counter("explain_router_ingested_total")
DROPPED = meter.create_counter("explain_router_dropped_total")
DLQ = meter.create_counter("explain_router_dlq_total")
QUEUE_DEPTH = meter.create_up_down_counter("explain_router_queue_depth")
FLUSH_LATENCY = meter.create_histogram(
    "explain_router_batch_flush_ms", unit="ms")


class RouteTarget(str, Enum):
    CRITICAL = "critical-plans"
    BASELINE = "baseline-plans"
    DDL = "ddl-plans"


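class ExplainEnvelope(BaseModel):
    # Mirror "additionalProperties": false from the JSON Schema contract.
    model_config = {"extra": "forbid"}

    query_hash: str = Field(pattern=r"^[a-f0-9]{64}$")
    execution_timestamp: float = Field(ge=0)
    cluster_id: str = Field(min_length=1)
    region: str = Field(default="default", min_length=1)
    plan_text: str = Field(min_length=1)
    actual_rows: int = Field(ge=0)
    total_time_ms: float = Field(ge=0)
    # Same enum as the JSON Schema; None when the capture layer omits it.
    query_type: Optional[str] = Field(
        default=None, pattern=r"^(DDL|DML|SELECT|OTHER)$")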


@dataclass
class RoutingConfig:
    critical_time_ms: float = 5000.0
    critical_rows: int = 1_000_000
    sample_rate_pct: int = 100
    batch_size: int = 50
    flush_interval_s: float = 2.0
    queue_maxsize: int = 10_000
    circuit_cooldown_s: float = 30.0
    endpoint_timeout_s: float = 5.0


class ExplainRouter:
    def __init__(self, config: RoutingConfig, log_endpoint: str):
        self.config = config
        self.log_endpoint = log_endpoint
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=config.queue_maxsize)
        self._circuit_open = False
        self._session: Optional[aiohttp.ClientSession] = None

    async def _init_session(self) -> None:
        self._session = aiohttp.ClientSession(
            timeout=aiohttp.ClientTimeout(total=self.config.endpoint_timeout_s),
            connector=aiohttp.TCPConnector(limit=20),
        )

    def _evaluate_route(self, env: ExplainEnvelope) -> RouteTarget:
        if env.query_type == "DDL":
            return RouteTarget.DDL
        if (env.total_time_ms > self.config.critical_time_ms
                or env.actual_rows > self.config.critical_rows):
            return RouteTarget.CRITICAL
        return RouteTarget.BASELINE

    def _admit_sample(self, env: ExplainEnvelope) -> bool:
        if self.config.sample_rate_pct >= 100:
            return True
        digest = int(hashlib.sha256(env.query_hash.encode()).hexdigest(), 16)
        return (digest % 100) < self.config.sample_rate_pct

    @staticmethod
    def _partition_key(env: ExplainEnvelope, target: RouteTarget) -> str:
        day = datetime.fromtimestamp(
            env.execution_timestamp, tz=timezone.utc).strftime("%Y-%m-%d")
        return f"{env.region}:{target.value}:{day}"

    async def ingest(self, raw_json: str) -> bool:
        with tracer.start_as_current_span("router.ingest"):
            try:
                env = ExplainEnvelope.model_validate_json(raw_json)
            except ValidationError as exc:
                DLQ.add(1)
                log.warning("schema_violation", errors=exc.error_count())
                await self._send_to_dlq(raw_json)
                return False

            if not self._admit_sample(env):
                DROPPED.add(1, {"reason": "sampled_out"})
                return True

            target = self._evaluate_route(env)
            event = {
                **env.model_dump(),
                "route_target": target.value,
                "partition_key": self._partition_key(env, target),
                "ingestion_timestamp": datetime.now(timezone.utc).timestamp(),
            }
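            try:
                self._queue.put_nowait(event)
                QUEUE_DEPTH.add(1)
            except asyncio.QueueFull:
                # Drop-oldest: evict the queue head to make room for the new
                # event. No await sits between get and put, so no other
                # coroutine can interleave; queue depth is unchanged.
                evicted = self._queue.get_nowait()
                self._queue.put_nowait(event)
                DROPPED.add(1, {"reason": "queue_full"})
                log.error("queue_full_drop_oldest",
                          query_hash=evicted["query_hash"])
            INGESTED.add(1, {"route_target": target.value})
            return True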

    async def _drain_batch(self) -> list[dict]:
        batch: list[dict] = []
        while len(batch) < self.config.batch_size:
            try:
                batch.append(
                    await asyncio.wait_for(self._queue.get(), timeout=0.1))
                QUEUE_DEPTH.add(-1)
            except asyncio.TimeoutError:
                break
        return batch

    async def _flush_batch(self) -> None:
        batch = await self._drain_batch()
        if not batch:
            return
        if self._circuit_open:
            await self._spill_to_disk(batch)
            return

        start = asyncio.get_event_loop().time()
        with tracer.start_as_current_span("router.flush") as span:
            span.set_attribute("batch.size", len(batch))
            try:
                async with self._session.post(
                    self.log_endpoint,
                    data=json.dumps(batch),
                    headers={"Content-Type": "application/json"},
                ) as resp:
                    resp.raise_for_status()
            except (aiohttp.ClientError, asyncio.TimeoutError) as exc:
                self._circuit_open = True
                log.error("endpoint_failure_open_circuit", error=str(exc))
                await self._spill_to_disk(batch)
            finally:
                FLUSH_LATENCY.record(
                    (asyncio.get_event_loop().time() - start) * 1000)

    async def _send_to_dlq(self, raw: str) -> None:
        ...  # publish to Kafka DLQ topic or object-store fallback bucket

    async def _spill_to_disk(self, batch: list[dict]) -> None:
        ...  # append to local JSONL with atomic rename; watcher replays on recovery

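    async def run(self) -> None:
        await self._init_session()
        loop = asyncio.get_running_loop()
        retry_at: Optional[float] = None
        try:
            while True:
                await self._flush_batch()
                # Sleep only when less than one full batch is waiting, so a
                # saturated queue drains without idle gaps.
                if self._queue.qsize() < self.config.batch_size:
                    await asyncio.sleep(self.config.flush_interval_s)
                if not self._circuit_open:
                    retry_at = None
                elif retry_at is None:
                    # Circuit just opened: keep draining to disk until the
                    # cooldown ends instead of pausing the whole loop.
                    retry_at = loop.time() + self.config.circuit_cooldown_s
                elif loop.time() >= retry_at:
                    self._circuit_open = False
                    retry_at = None
                    log.info("circuit_half_open_retry")
        finally:
            if self._session:
                await self._session.close()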

The design prioritizes backpressure and graceful degradation over synchronous blocking calls: the bounded queue caps memory, the circuit breaker halts egress as soon as a flush to the endpoint fails, and disk spillover keeps in-flight batches durable until the endpoint recovers. When the DLQ and spillover are backed by a durable broker, the same partitioning discipline described in Using Kafka for Async Query Plan Ingestion at Scale applies to the replay path.
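The breaker in the router is a single boolean flag. For readers who want the cooldown logic in isolation, here is a minimal, hypothetical sketch of the same idea as a standalone class. `CooldownBreaker` and its method names are not part of the router, and the injectable clock exists only to make the behaviour easy to exercise.

```python
import time
from typing import Callable, Optional


class CooldownBreaker:
    """Opens on failure; allows retries again once the cooldown has elapsed."""

    def __init__(self, cooldown_s: float,
                 clock: Callable[[], float] = time.monotonic):
        self.cooldown_s = cooldown_s
        self._clock = clock
        self._opened_at: Optional[float] = None  # None means closed

    def record_failure(self) -> None:
        # Each failure restarts the cooldown window.
        self._opened_at = self._clock()

    def record_success(self) -> None:
        self._opened_at = None

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True
        # Half-open: permit a retry only after the cooldown window.
        return self._clock() - self._opened_at >= self.cooldown_s


breaker = CooldownBreaker(cooldown_s=30.0)
breaker.record_failure()
print(breaker.allow_request())  # False until 30 s have passed
```

While `allow_request()` returns False, a caller would divert batches to the disk spill path instead of posting to the endpoint.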

Threshold Table

These are the routing and transport SLOs the stage is held to. Latency budgets are measured at the router, not the log store, because the router’s contract is only to accept, decide, and hand off.

Metric	Pass	Warn	Block / Alert	Automation trigger
`route_evaluation_duration_ms` (p95)	≤ 2 ms	2–10 ms	> 10 ms	page on-call; dispatcher CPU-bound
`explain_router_batch_flush_ms` (p95)	≤ 250 ms	250–800 ms	> 800 ms	scale log index / open circuit early
`explain_router_queue_depth`	≤ 2 000	2 000–8 000	> 8 000 (of 10 000 cap)	throttle capture sampling
`explain_router_dropped_total{reason="queue_full"}` rate	0 /min	≤ 5 /min	> 5 /min	investigate queue saturation
`explain_router_dlq_total` rate	0 /min	≤ 1 /min	> 1 /min	envelope contract drift — check upstream
Critical-plan delivery lag	≤ 5 s	5–30 s	> 30 s	expedite `critical-plans` flush
Spillover disk utilization	≤ 50 %	50–70 %	> 70 %	replay backlog / expand volume

A minimal Prometheus alert wiring the queue-saturation and DLQ-drift breaches:

YAML

groups:
  - name: explain-router
    rules:
      - alert: ExplainRouterQueueSaturated
        expr: explain_router_queue_depth > 8000
        for: 2m
        labels: { severity: warning }
        annotations:
          summary: "Router queue >80% of cap"
          runbook: "throttle capture sample_rate_pct; check log endpoint p95"
      - alert: ExplainRouterEnvelopeDrift
        expr: rate(explain_router_dlq_total[5m]) > (1 / 60)
        for: 5m
        labels: { severity: critical }
        annotations:
          summary: "Envelopes failing schema validation — upstream contract drift"
          runbook: "diff ExplainEnvelope against schema-validation stage"
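The same bands can be checked outside Prometheus, for example in a dashboard script or a load-test assertion. The sketch below encodes the numeric rows of the threshold table; the dictionary keys for rows that have no metric name in the table (`critical_plan_delivery_lag_s`, `spillover_disk_utilization_pct`) are invented for illustration.

```python
from typing import Literal

Band = Literal["pass", "warn", "block"]

# (warn_above, block_above) per row of the threshold table.
THRESHOLDS: dict[str, tuple[float, float]] = {
    "route_evaluation_duration_ms_p95": (2, 10),
    "explain_router_batch_flush_ms_p95": (250, 800),
    "explain_router_queue_depth": (2_000, 8_000),
    "critical_plan_delivery_lag_s": (5, 30),        # hypothetical key
    "spillover_disk_utilization_pct": (50, 70),     # hypothetical key
}


def classify(metric: str, value: float) -> Band:
    warn_above, block_above = THRESHOLDS[metric]
    if value > block_above:
        return "block"
    if value > warn_above:
        return "warn"
    return "pass"


print(classify("explain_router_queue_depth", 8_500))  # block
```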

Failure Scenarios and Root Cause Analysis

1. Envelope contract drift. Symptom: explain_router_dlq_total climbs while ingested_total falls; downstream indices go stale for specific clusters. Root cause: the capture layer added or renamed a field without a coordinated schema version bump. Diagnostics: pull a rejected payload from the DLQ and diff it against the contract — jq 'keys' rejected.json and compare to the required list. Mitigation: pin the envelope schema version in both stages; never make required-field changes without updating Schema Validation for Baseline Metadata in the same release.

2. Log endpoint brownout. Symptom: batch_flush_ms p95 spikes past 800 ms, circuit opens, spillover disk fills. Root cause: the centralized index is under-provisioned for the current ingest rate or is rejecting oversized bulk requests. Diagnostics: curl -s $LOG_ENDPOINT/_cluster/health | jq .status for OpenSearch; check the router logs for endpoint_failure_open_circuit events. Mitigation: lower batch_size, add index shards, and confirm the circuit cooldown is long enough to avoid thundering-herd retries.

3. Queue saturation under capture spikes. Symptom: queue_depth pins near the 10 000 cap and dropped_total{reason="queue_full"} rises. Root cause: capture volume exceeds sustained flush throughput, often because of a traffic surge or a runaway query storm. Diagnostics: correlate queue_depth with upstream capture rate; check explain_router_dropped_total by reason label. Mitigation: lower sample_rate_pct so load is shed deterministically at admission rather than arbitrarily at the queue, and scale flush concurrency; the drop-oldest policy protects memory and logs each dropped query_hash for audit.

4. Sampling skew hiding a regression. Symptom: a known-slow query never appears in baseline-plans despite firing constantly. Root cause: sample_rate_pct combined with the modulo hash deterministically excludes that fingerprint. Diagnostics: compute int(sha256(query_hash),16) % 100 for the missing hash and compare to sample_rate_pct. Mitigation: exempt critical-tagged plans from sampling entirely, and route any envelope over the latency threshold before the sampling gate.

5. Non-deterministic partitioning after a config edit. Symptom: the same query’s plans scatter across multiple daily partitions. Root cause: region or the timestamp source changed between deploys, or timezone handling drifted from UTC. Diagnostics: verify execution_timestamp is UTC epoch seconds and that region is stable per cluster. Mitigation: derive partition_key purely from the envelope (as in the implementation) and never from wall-clock time at ingestion.
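Scenarios 1 and 4 both come down to a quick offline check on a single payload or fingerprint. The helpers below are a hypothetical sketch (none of these function names exist in the router): one diffs a rejected payload's keys against the contract, one reproduces the sampling bucket, and one shows the scenario 4 mitigation of letting critical plans bypass the sampling gate.

```python
import hashlib
import json

REQUIRED = {"query_hash", "execution_timestamp", "cluster_id",
            "plan_text", "actual_rows", "total_time_ms"}
OPTIONAL = {"region", "query_type"}


def diff_envelope_keys(raw_json: str) -> dict[str, list[str]]:
    """Scenario 1: list contract fields that are missing or unknown."""
    keys = set(json.loads(raw_json))
    return {"missing": sorted(REQUIRED - keys),
            "unexpected": sorted(keys - REQUIRED - OPTIONAL)}


def sampling_bucket(query_hash: str) -> int:
    """Scenario 4: the 0..99 bucket the router's modulo hash assigns."""
    return int(hashlib.sha256(query_hash.encode()).hexdigest(), 16) % 100


def admit(query_hash: str, is_critical: bool, sample_rate_pct: int) -> bool:
    """Scenario 4 mitigation: critical plans skip the sampling gate."""
    return is_critical or sampling_bucket(query_hash) < sample_rate_pct


rejected = json.dumps({"query_hash": "ab" * 32, "cluster_id": "c1",
                       "execution_timestamp": 1_700_000_000,
                       "plan_text": "Seq Scan", "actual_rows": 3,
                       "duration_ms": 12.5})  # field renamed upstream
print(diff_envelope_keys(rejected))
# {'missing': ['total_time_ms'], 'unexpected': ['duration_ms']}
```

A fingerprint is excluded whenever `sampling_bucket(query_hash) >= sample_rate_pct`, so comparing the two numbers explains a missing query without touching the live router.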

Configuration Reference

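Every knob is an environment variable. All but ROUTER_LOG_ENDPOINT map onto a RoutingConfig field; the endpoint is passed to the ExplainRouter constructor as log_endpoint. Defaults are production-safe for a single mid-sized fleet; tune sample_rate_pct and batch_size first when scaling.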

Env var	Config field	Default	Purpose
`ROUTER_CRITICAL_TIME_MS`	`critical_time_ms`	`5000.0`	Latency above which a plan routes to `critical-plans`
`ROUTER_CRITICAL_ROWS`	`critical_rows`	`1000000`	Row count above which a plan is tagged critical
`ROUTER_SAMPLE_RATE_PCT`	`sample_rate_pct`	`100`	Deterministic admission percentage on `query_hash`
`ROUTER_BATCH_SIZE`	`batch_size`	`50`	Events per bulk flush to the log endpoint
`ROUTER_FLUSH_INTERVAL_S`	`flush_interval_s`	`2.0`	Idle flush cadence when a batch is not full
`ROUTER_QUEUE_MAXSIZE`	`queue_maxsize`	`10000`	Bounded in-memory queue cap (drop-oldest on breach)
`ROUTER_CIRCUIT_COOLDOWN_S`	`circuit_cooldown_s`	`30.0`	Egress pause after the circuit opens
`ROUTER_ENDPOINT_TIMEOUT_S`	`endpoint_timeout_s`	`5.0`	Per-request timeout to the centralized log store
`ROUTER_LOG_ENDPOINT`	`log_endpoint`	—	Bulk ingest URL for the centralized index
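One possible way to wire the table to the dataclass is sketched below. It assumes each variable name is `ROUTER_` plus the upper-cased field name, which holds for every RoutingConfig row above; `load_config` itself is a hypothetical helper, and ROUTER_LOG_ENDPOINT would be read separately because it is not a RoutingConfig field.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping


@dataclass
class RoutingConfig:
    critical_time_ms: float = 5000.0
    critical_rows: int = 1_000_000
    sample_rate_pct: int = 100
    batch_size: int = 50
    flush_interval_s: float = 2.0
    queue_maxsize: int = 10_000
    circuit_cooldown_s: float = 30.0
    endpoint_timeout_s: float = 5.0


def load_config(env: Mapping[str, str] = os.environ) -> RoutingConfig:
    overrides = {}
    for f in fields(RoutingConfig):
        raw = env.get(f"ROUTER_{f.name.upper()}")
        if raw is not None:
            # Cast using the type of the field's default (int or float).
            overrides[f.name] = type(f.default)(raw)
    return RoutingConfig(**overrides)


cfg = load_config({"ROUTER_BATCH_SIZE": "200", "ROUTER_FLUSH_INTERVAL_S": "0.5"})
print(cfg.batch_size, cfg.flush_interval_s, cfg.queue_maxsize)  # 200 0.5 10000
```

A malformed value such as `ROUTER_BATCH_SIZE=fifty` raises `ValueError` at startup, which is usually preferable to silently falling back to a default.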

Related

Capturing EXPLAIN Plans Without Impacting Production Performance: the upstream capture layer that produces the envelopes this stage routes.
Building Async Ingestion Pipelines for High-Throughput Queries: backpressure and consumer-group patterns for the transport that feeds this router.
Normalizing Query Plans for Cross-Engine Comparison: the downstream consumer that reads the routed output stream.
Schema Validation for Baseline Metadata: the shared envelope contract this stage enforces at ingress.
Regression Detection & Rule Engines: the scoring stages that subscribe to routed plans.
