Operating Review: aggregation contract
Status: Authoritative (CHAOS-1755)
Modes
The GraphQL operatingReview(orgId, input) resolver supports two modes,
selected by whether input.teamId is provided.
| Mode | `input.teamId` | `OperatingReview.teamId` in response |
|---|---|---|
| Single team | `"team-3"` (any non-null string) | `"team-3"` |
| All teams (cross-team aggregate) | `null` / omitted | `null` |
Clients use the response teamId to decide labeling (e.g. render a team
name vs. an explicit "All Teams" badge). Do not infer aggregate mode
from the input alone; the response is the source of truth.
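A minimal client-side sketch of that rule, in Python for illustration. The function name `review_label` and the `team_names` lookup are hypothetical and not part of the API; only the null-versus-string check on the response `teamId` comes from the contract above.

```python
from __future__ import annotations


def review_label(response_team_id: str | None, team_names: dict[str, str]) -> str:
    """Choose the header label from the response's teamId, not the request input."""
    if response_team_id is None:
        # A null teamId in the response marks the cross-team aggregate.
        return "All Teams"
    # Hypothetical name lookup; fall back to the raw id when no name is known.
    return team_names.get(response_team_id, response_team_id)


team_names = {"team-3": "Platform"}  # made-up display name
single_label = review_label("team-3", team_names)
aggregate_label = review_label(None, team_names)
```

Here `single_label` is the team's display name and `aggregate_label` is the explicit "All Teams" badge text.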
Per-metric aggregation rules (All Teams mode)
When teamId is null, the ClickHouse queries built by
build_operating_review_queries(team_id=None) drop the
AND team_id = %(team_id)s predicate and add team_id to the inner
GROUP BY so per-team rows are not collapsed by argMax(..., computed_at)
mid-aggregation. The outer aggregation function then determines the
cross-team behavior per metric.
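The effect of the inner `GROUP BY` can be simulated in plain Python. This is only a sketch of the grouping logic, not the actual ClickHouse query: the row layout, provider name, and values are made up, and `latest_per_group` is a hypothetical stand-in for `argMax(value, computed_at)`.

```python
# Made-up rows: (day, provider, scope, team_id, computed_at, items_completed).
ROWS = [
    ("2024-01-01", "jira", "proj", "team-1", 1, 4),
    ("2024-01-01", "jira", "proj", "team-1", 2, 5),  # recomputed row; newer wins
    ("2024-01-01", "jira", "proj", "team-2", 1, 7),
]


def latest_per_group(rows, include_team):
    """Keep the value with the highest computed_at per group key (argMax-like)."""
    best = {}
    for day, provider, scope, team_id, computed_at, value in rows:
        if include_team:
            key = (day, provider, scope, team_id)
        else:
            key = (day, provider, scope)
        if key not in best or computed_at > best[key][0]:
            best[key] = (computed_at, value)
    return [value for _, value in best.values()]


# Without team_id in the key, one team's latest row hides the other team.
collapsed_total = sum(latest_per_group(ROWS, include_team=False))
# With team_id in the key, each team keeps its own latest row before the SUM.
per_team_total = sum(latest_per_group(ROWS, include_team=True))
```

With these rows, `collapsed_total` is 5 (team-2 is lost) while `per_team_total` is 12, which is why `team_id` joins the inner grouping key when the predicate is dropped.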
Delivery movement
| Metric key | Aggregation across teams | Notes |
|---|---|---|
| `throughput` (`items_completed`) | SUM | Total org throughput. |
| `cycle_time_p50_hours` | AVG (unweighted) | Average of per-(provider, scope, team) p50s. Approximation; see "Known limitations" below. |
| `wip_count` (`wip_count_end_of_day`) | MAX per day across (provider, scope, team), then weekly behavior | Peaks rather than totals; see limitations. |
Bottleneck
| Metric key | Aggregation across teams | Notes |
|---|---|---|
| `state_duration_hours` | Weighted AVG (weight = `items_touched`) | Done in Python at compute time; weights work correctly across teams because `items_touched` SUMs across them at the SQL layer. |
| `review_latency_hours` (`pr_first_review_p50_hours`) | AVG (unweighted) | Repo-scoped; team-agnostic at the row level. Unchanged across modes. |
| `wip_age_p90_hours` | AVG (unweighted) | Approximation; see limitations. |
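The weighted average for `state_duration_hours` can be sketched as follows. This is an illustration under assumptions, not the code in `compute_operating_review`: the function name, the `(hours, items_touched)` pair layout, and the zero-weight fallback are all hypothetical.

```python
def weighted_state_duration(rows):
    """Average state duration weighted by items_touched.

    rows: iterable of (state_duration_hours, items_touched) pairs.
    """
    rows = list(rows)  # allow generators; the data is walked twice
    total_weight = sum(items for _, items in rows)
    if total_weight == 0:
        # Assumed fallback when nothing was touched; the real behavior may differ.
        return 0.0
    return sum(hours * items for hours, items in rows) / total_weight


# Made-up rows: a small team at 10h and a larger one at 20h.
example = weighted_state_duration([(10.0, 1), (20.0, 3)])
```

For these inputs the weighted result is 17.5 hours, whereas an unweighted average would report 15.0 and overstate the influence of the smaller team.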
Risk
| Metric key | Aggregation across teams | Notes |
|---|---|---|
| `hotspot_risk_score` | AVG | Already repo-scoped; team-agnostic. |
| `ownership_concentration` | AVG | Already repo-scoped; team-agnostic. |
| `complexity_per_kloc` | AVG | Already repo-scoped; team-agnostic. |
| `bus_factor` | MIN | Already repo-scoped; team-agnostic. |
Reliability
| Metric key | Aggregation across teams | Notes |
|---|---|---|
| `deployments_count` | SUM | Repo-scoped; unchanged across modes. |
| `change_failure_rate` | AVG of repo `change_failure_rate` | Repo-scoped; unchanged across modes. |
| `incidents_count` | SUM | Repo-scoped; unchanged across modes. |
| `mttr_hours` | First non-zero of `incidents.mttr_p50_hours` then `repo_metrics.mttr_hours`, both AVG | Repo-scoped; unchanged across modes. |
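The "first non-zero" fallback for `mttr_hours` might look like the sketch below. The function and parameter names are hypothetical, and treating `None` as missing and returning 0.0 when both sources are empty are assumptions rather than documented behavior.

```python
def pick_mttr_hours(incidents_mttr_p50_avg, repo_metrics_mttr_avg):
    """Return the first non-zero candidate, preferring the incidents source."""
    for candidate in (incidents_mttr_p50_avg, repo_metrics_mttr_avg):
        if candidate:  # skips 0.0 and (by assumption) None
            return candidate
    return 0.0  # assumed result when neither source has a value


from_incidents = pick_mttr_hours(3.0, 6.5)
from_repo_metrics = pick_mttr_hours(0.0, 6.5)
```

Both inputs are expected to be already averaged, as the table states; the helper only decides which source wins.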
Investment
| Metric key | Aggregation across teams | Notes |
|---|---|---|
| `ktlo_units` / `new_value_units` / `security_units` / `infra_units` (`delivery_units`) | SUM | Total org investment per area. |
AI Workflow Intelligence
| Metric key | Aggregation across teams | Notes |
|---|---|---|
| `ai_adoption_ratio` | SUM AI-attributed PRs / SUM total PRs | Uses persisted AI attribution buckets. Unknown remains in the denominator. |
| `ai_cycle_time_delta_hours` | AVG | AI-attributed cycle-time delta from `ai_impact_metrics_daily`. Negative values mean lower AI-side cycle time. |
| `ai_review_amplification` | AVG | Review-pressure ratio from AI impact rollups. |
| `ai_risk_drag` | AVG of available risk rates | Combines rework, test-gap, and incident drag rates without creating a synthetic person score. |
| `ai_governance_coverage` | AVG coverage | Averages declaration, human-review, security-scan, and in-policy coverage from governance rollups. |
| `ai_opportunity_signals` | Rule count | Counts evidence-backed opportunity conditions for the week; clients should drill into `aiOpportunities` and `aiWorkflowDrilldown` for artifacts. |
The section is a weekly operating cadence for AI adoption: adoption mix, delivery impact, review pressure, risk drag, governance, and opportunities. It does not recompute attribution or inspect raw prompt/session data.
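To make the `ai_adoption_ratio` rule concrete, here is a small sketch that computes the ratio of sums and contrasts it with an average of per-team ratios. The bucket names (`ai`, `non_ai`, `unknown`) and the counts are made up for illustration; only the "SUM over SUM, unknown stays in the denominator" rule comes from the table above.

```python
# Made-up per-team PR counts by attribution bucket.
TEAM_BUCKETS = [
    {"ai": 8, "non_ai": 2, "unknown": 0},     # small team, heavy AI use
    {"ai": 10, "non_ai": 70, "unknown": 20},  # large team, light AI use
]


def ai_adoption_ratio(buckets):
    """SUM of AI-attributed PRs over SUM of all PRs, unknown included."""
    ai_total = sum(b["ai"] for b in buckets)
    all_total = sum(b["ai"] + b["non_ai"] + b["unknown"] for b in buckets)
    return ai_total / all_total if all_total else 0.0  # assumed empty-week result


ratio_of_sums = ai_adoption_ratio(TEAM_BUCKETS)
# For contrast only: averaging per-team ratios overweights the small team.
average_of_ratios = sum(ai_adoption_ratio([b]) for b in TEAM_BUCKETS) / len(TEAM_BUCKETS)
```

With these counts the ratio of sums is 18 / 110 (about 0.16), while the average of per-team ratios would be 0.45.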
Improved / Worsened / Changed callouts
Computed exactly as in single-team mode: per-metric, comparing the current week's aggregated value to the prior week's aggregated value. No special-cased thresholds for aggregate mode.
Known limitations
- Percentile approximations. `cycle_time_p50_hours` and `wip_age_p90_hours` are stored as already-aggregated per-team daily values. Averaging them across teams is an average-of-averages, not a true cross-team percentile. To compute a true cross-team percentile we would need raw item-level data, which is not currently materialised in `work_item_metrics_daily`.
- WIP `MAX` semantics. Aggregate WIP is the peak single `(provider, scope, team)` WIP per day, not the sum across teams. This was the original single-team behavior and is preserved for consistency, but it means the all-teams WIP can read lower than the true total work in flight.
- Inner `GROUP BY` extension. When `team_id` is null we keep `team_id` in the inner `GROUP BY` purely so `argMax(..., computed_at)` continues to pick one canonical row per team per (day, provider, scope). Outer aggregation then combines correctly across teams. This is intentional and load-tested at fixture scale (10 teams × 30 days); for very large orgs the inner cardinality may need a different shape.
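The first two limitations can be seen with made-up numbers. The sketch below uses hypothetical item-level cycle times and WIP counts (data the pipeline does not have at this layer) purely to show how far the reported aggregates can sit from the true cross-team values.

```python
from statistics import mean, median

# Made-up item-level cycle times in hours; not available in the real rollups.
CYCLE_TIMES_BY_TEAM = {
    "team-1": [1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 5.0],  # many fast items
    "team-2": [40.0],                                # one slow item
}

# What the contract reports: the unweighted average of per-team p50s.
avg_of_p50s = mean([median(v) for v in CYCLE_TIMES_BY_TEAM.values()])
# What a true cross-team p50 over all items would be.
all_items = [h for v in CYCLE_TIMES_BY_TEAM.values() for h in v]
true_p50 = median(all_items)

# Made-up end-of-day WIP per team for a single day.
WIP_BY_TEAM = {"team-1": 6, "team-2": 9}
reported_wip = max(WIP_BY_TEAM.values())  # MAX semantics: the single peak
total_wip = sum(WIP_BY_TEAM.values())     # the true work in flight
```

Here the reported p50 is 21.5 hours against a true cross-team median of 3.0, and the reported WIP is 9 against 15 items actually in flight.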
Files
- `src/dev_health_ops/metrics/operating_review.py`: `build_operating_review_queries(team_id=...)` and `compute_operating_review`.
- `src/dev_health_ops/api/graphql/models/inputs.py`: `OperatingReviewInput.team_id: str | None`.
- `src/dev_health_ops/api/graphql/models/outputs.py`: `OperatingReview.team_id: str | None`.
- `src/dev_health_ops/api/graphql/resolvers/operating_review.py`: threads optional `team_id` through `_fetch_period_rows`.
- `tests/metrics/test_operating_review.py`: covers both modes.
History
- CHAOS-1755: Introduced "All Teams" mode by making `team_id` optional and documenting per-metric aggregation rules.
- CHAOS-1722: Added the AI Workflow Intelligence section to the weekly operating review and documented evidence handoff to AI recommendations / Work Graph drilldowns.
- CHAOS-1751: Established `teams` as the source of truth for the TEAM dimension catalog (separate from this contract).