GraphQL security posture
The /graphql endpoint is locked down by default. Production environments disable the browser IDE, disable schema introspection, apply validation limits, and reject oversized request bodies before Strawberry parses the query.
1. Introspection and GraphiQL IDE controls
Production environments restrict access to GraphQL schema information and the interactive interface.
- Introspection: Schema discovery via introspection fields like __schema is disabled outside development by default.
  - Environment variable: GRAPHQL_INTROSPECTION_ENABLED=true explicitly allows introspection. If unset, introspection is disabled in production.
  - Default posture: Disabled outside local development.
  - Implementation: Enforced by the is_graphql_introspection_enabled helper (src/dev_health_ops/api/graphql/security.py:88) and the custom rule NoSchemaIntrospectionCustomRule from graphql.validation.rules.custom.no_schema_introspection (registered on src/dev_health_ops/api/graphql/security.py:162).
- GraphiQL IDE: The interactive in-browser developer tool is hidden in production.
  - Environment variable: GRAPHQL_IDE_ENABLED=true explicitly exposes GraphiQL.
  - Default posture: Disabled outside local development.
  - Implementation: Controlled via is_graphql_ide_enabled (src/dev_health_ops/api/graphql/security.py:79) and passed to GraphQLRouter (src/dev_health_ops/api/graphql/app.py:134).
Local development environments are identified by checking the ENVIRONMENT, APP_ENV, or ENV environment variables against the set {"development", "dev", "local"} (src/dev_health_ops/api/graphql/security.py:67). If these variables are unset, the system defaults to production.
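The environment check and the flag defaults described above can be sketched roughly as follows. This is an illustration, not the code in security.py: the names is_local_environment and flag_enabled are hypothetical, and the precedence among ENVIRONMENT, APP_ENV, and ENV (first variable that is set wins), the case-insensitive comparison, and the treatment of any explicit value other than "true" as disabled are all assumptions.

```python
from typing import Mapping

LOCAL_ENVIRONMENTS = {"development", "dev", "local"}
ENVIRONMENT_VARIABLES = ("ENVIRONMENT", "APP_ENV", "ENV")


def is_local_environment(env: Mapping[str, str]) -> bool:
    """Return True only when a recognized variable names a local environment."""
    for name in ENVIRONMENT_VARIABLES:
        value = env.get(name, "").strip().lower()
        if value:
            # Assumption: the first variable that is set decides the outcome.
            return value in LOCAL_ENVIRONMENTS
    # Nothing set: fall back to the production posture.
    return False


def flag_enabled(env: Mapping[str, str], name: str) -> bool:
    """An explicit value wins; an unset flag follows the environment default."""
    raw = env.get(name)
    if raw is not None:
        # Assumption: only the literal "true" (any case) enables the flag.
        return raw.strip().lower() == "true"
    return is_local_environment(env)
```

Passing os.environ as the mapping gives the live result, for example flag_enabled(os.environ, "GRAPHQL_IDE_ENABLED"). Taking the mapping as a parameter keeps the logic easy to test without mutating the process environment.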
2. Request body size limit
Oversized payloads are rejected at the ASGI layer before any query parsing occurs.
- Hard body limit: Request payloads are limited to a maximum of 16 KB by default.
  - Environment variable: GRAPHQL_MAX_QUERY_BYTES sets the HTTP request body size limit in bytes.
  - Default value: 16384 bytes (16 KB), defined as DEFAULT_GRAPHQL_MAX_QUERY_BYTES in src/dev_health_ops/api/graphql/security.py:26.
  - Implementation: Enforced by the custom GraphQLQuerySizeLimitMiddleware class (src/dev_health_ops/api/graphql/security.py:179). This middleware is registered in src/dev_health_ops/api/_middleware.py:60.
  - Failure mode: Requests whose body exceeds the byte limit are rejected immediately. The server returns HTTP status 413 Payload Too Large with the JSON response: { "detail": { "message": "GraphQL request body exceeds size limit", "limit_bytes": 16384 } }
- Probe rejection: If a GET request is sent to /graphql and GraphiQL is disabled, the size limit middleware intercepts the request and responds with a standard HTTP 404 Not Found to reject scans (src/dev_health_ops/api/graphql/security.py:225).
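To make the ASGI-layer rejection concrete, here is a minimal pure-ASGI sketch of a body size limit. It is not GraphQLQuerySizeLimitMiddleware itself: the class name BodySizeLimitSketch is hypothetical, and buffering the body chunk by chunk (rather than, say, trusting Content-Length) is an assumption about one reasonable way to enforce the limit. The GET probe handling is left out.

```python
import json

DEFAULT_LIMIT_BYTES = 16384  # mirrors the documented 16 KB default


class BodySizeLimitSketch:
    """Pure-ASGI wrapper that rejects oversized request bodies on one path."""

    def __init__(self, app, path="/graphql", limit_bytes=DEFAULT_LIMIT_BYTES):
        self.app = app
        self.path = path
        self.limit_bytes = limit_bytes

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http" or scope.get("path") != self.path:
            await self.app(scope, receive, send)
            return

        # Buffer the body, stopping as soon as the limit is crossed.
        chunks, total = [], 0
        while True:
            message = await receive()
            if message["type"] == "http.disconnect":
                return  # client went away; nothing to forward
            chunk = message.get("body", b"")
            total += len(chunk)
            if total > self.limit_bytes:
                await self._reject(send)
                return
            chunks.append(chunk)
            if not message.get("more_body", False):
                break

        body = b"".join(chunks)
        replayed = False

        async def replay():
            # Hand the buffered body to the wrapped app exactly once.
            nonlocal replayed
            if not replayed:
                replayed = True
                return {"type": "http.request", "body": body, "more_body": False}
            return await receive()

        await self.app(scope, replay, send)

    async def _reject(self, send):
        payload = json.dumps({
            "detail": {
                "message": "GraphQL request body exceeds size limit",
                "limit_bytes": self.limit_bytes,
            }
        }).encode()
        await send({
            "type": "http.response.start",
            "status": 413,
            "headers": [
                (b"content-type", b"application/json"),
                (b"content-length", str(len(payload)).encode()),
            ],
        })
        await send({"type": "http.response.body", "body": payload})
```

Because the wrapper answers with 413 before calling the inner application, the GraphQL layer never sees the oversized body, which matches the ordering this section describes.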
3. Query validation limits
To prevent query complexity attacks and resource exhaustion, the API registers two AST validation rules during schema compilation.
- Query depth validation:
  - Limit: Maximum query selection depth of 12.
  - Implementation: Evaluated by the MaxDepthLimit AST validation rule (src/dev_health_ops/api/graphql/security.py:97).
  - Failure mode: Standard GraphQL validation error with message: GraphQL query depth exceeds limit of 12.
- Field alias validation:
  - Limit: Maximum of 15 aliases per operation.
  - Implementation: Checked by the MaxAliasLimit AST validation rule (src/dev_health_ops/api/graphql/security.py:127).
  - Failure mode: Standard GraphQL validation error with message: GraphQL alias count exceeds limit of 15.
Hardening rules can be disabled for local diagnostics by setting GRAPHQL_SECURITY_ENABLED=false (src/dev_health_ops/api/graphql/security.py:73). If unset, hardening remains active in production. These validation rules are registered on the Strawberry schema instance (src/dev_health_ops/api/graphql/schema.py:299).
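The two checks can be illustrated on a toy selection tree. This sketch does not use graphql-core and is not the MaxDepthLimit or MaxAliasLimit implementation: FieldNode here is a hypothetical stand-in for a real AST node, fragments are ignored, and how the real rules count depth (for example whether the root selection counts as depth 1) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional

MAX_DEPTH = 12
MAX_ALIASES = 15


@dataclass
class FieldNode:
    """Toy stand-in for a GraphQL field node (not graphql-core's FieldNode)."""
    name: str
    alias: Optional[str] = None
    selections: List["FieldNode"] = field(default_factory=list)


def selection_depth(selections: List[FieldNode]) -> int:
    """Depth of the deepest nested selection; an empty selection set is 0."""
    if not selections:
        return 0
    return 1 + max(selection_depth(node.selections) for node in selections)


def alias_count(selections: List[FieldNode]) -> int:
    """Number of aliased fields anywhere in the operation."""
    return sum(
        (1 if node.alias else 0) + alias_count(node.selections)
        for node in selections
    )


def validate_operation(selections: List[FieldNode]) -> List[str]:
    """Return validation error messages, mirroring the documented wording."""
    errors = []
    if selection_depth(selections) > MAX_DEPTH:
        errors.append(f"GraphQL query depth exceeds limit of {MAX_DEPTH}")
    if alias_count(selections) > MAX_ALIASES:
        errors.append(f"GraphQL alias count exceeds limit of {MAX_ALIASES}")
    return errors
```

Both checks walk the tree without executing anything, which is why they can run at validation time, before any resolver is invoked.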
4. Resolver-level cost limits
Queries that pass validation but carry a high analytical cost are checked against cost budgets before ClickHouse query compilation. This layer guards against database abuse.
- Limit definitions: Defined in src/dev_health_ops/api/graphql/cost.py:12 through the CostLimits class.
  - Maximum date range: 3650 days (10 years)
  - Maximum timeseries buckets: 100 buckets
  - Maximum breakdown items (top_n): 100 items
  - Maximum Sankey nodes: 100 nodes
  - Maximum Sankey edges: 500 edges
  - Maximum sub-requests in a single batch: 10 sub-requests
  - Query timeout: 30 seconds
- Validation functions:
  - validate_date_range(start_date, end_date) checks the date range length.
  - validate_top_n(top_n) limits top_n query requests.
  - validate_sankey_limits(max_nodes, max_edges) checks Sankey node and edge caps.
  - validate_sub_request_count(timeseries_count, breakdowns_count, has_sankey, has_flow_matrix) restricts batch sizes.
  - validate_buckets(start_date, end_date, interval) estimates bucket counts before execution.
- Failure mode: On limit violation, resolvers raise a CostLimitExceededError. This surfaces as a standard GraphQL error block: { "errors": [ { "message": "Date range of 4000 days exceeds limit of 3650", "extensions": { "code": "COST_LIMIT_EXCEEDED", "limit_name": "max_days", "limit_value": 3650, "requested_value": 4000 } } ] }
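A date-range check of this kind might look like the sketch below. The class names CostLimitsSketch and CostLimitExceededSketch are hypothetical stand-ins; only max_days is taken from the documented error payload, the remaining field names are invented for illustration, and computing the range as a plain day difference (not inclusive of the end date) is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CostLimitsSketch:
    """Documented budget values; field names other than max_days are assumed."""
    max_days: int = 3650
    max_buckets: int = 100
    max_top_n: int = 100
    max_sankey_nodes: int = 100
    max_sankey_edges: int = 500
    max_sub_requests: int = 10
    query_timeout_seconds: int = 30


class CostLimitExceededSketch(Exception):
    """Carries the extensions block shown in the documented error payload."""

    def __init__(self, message, limit_name, limit_value, requested_value):
        super().__init__(message)
        self.extensions = {
            "code": "COST_LIMIT_EXCEEDED",
            "limit_name": limit_name,
            "limit_value": limit_value,
            "requested_value": requested_value,
        }


def validate_date_range(start_date: date, end_date: date,
                        limits: CostLimitsSketch = CostLimitsSketch()) -> None:
    """Raise when the requested range is longer than the day budget."""
    days = (end_date - start_date).days  # assumption: end date not inclusive
    if days > limits.max_days:
        raise CostLimitExceededSketch(
            f"Date range of {days} days exceeds limit of {limits.max_days}",
            "max_days",
            limits.max_days,
            days,
        )
```

Running the check before query compilation means an over-budget request fails with a structured error and never reaches the database.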
5. Field-level authorization and scoping
Scoping constraints and permissions are enforced at both execution and resolver boundaries.
- Authentication requirements:
  - Users must present a valid JWT in the Authorization header.
  - Override: GRAPHQL_AUTH_REQUIRED=false bypasses the auth check in local dev environments (src/dev_health_ops/api/graphql/app.py:46).
  - If the request is unauthenticated, the request context setup raises a FastAPI HTTPException with status 401 Unauthorized (src/dev_health_ops/api/graphql/app.py:78).
- Org-ID boundary scoping:
  - Consistently scopes database queries to prevent cross-tenant data leaks.
  - Enforced at the execution layer by the OrgIdAuthExtension class (src/dev_health_ops/api/graphql/extensions.py:19).
  - If the variables map supplies a requested org_id, the extension validates it against the authenticated user's JWT org_id context before resolving any fields. A mismatch raises an AuthorizationError (src/dev_health_ops/api/graphql/extensions.py:63).
- Resolver-level permissions:
  - Resolvers can be annotated with the @require_permission(*permissions) decorator (src/dev_health_ops/api/graphql/authz.py:15).
  - This extracts user context and verifies role privileges via has_any_permission or has_all_permissions in the permissions service.
  - A permissions violation raises an AuthorizationError.
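The org-scope check and the permission decorator can be sketched together. Everything here is illustrative: AuthorizationErrorSketch, check_org_scope, and require_permission_sketch are hypothetical names, the context is reduced to a plain dict with a "permissions" set, and the require_all switch is an assumption about how an any-versus-all choice could be exposed.

```python
from functools import wraps
from typing import Any, Callable, Dict


class AuthorizationErrorSketch(Exception):
    """Stand-in for the AuthorizationError named above."""


def check_org_scope(variables: Dict[str, Any], jwt_org_id: str) -> None:
    """Reject a request whose org_id variable differs from the token's org."""
    requested = variables.get("org_id")
    if requested is not None and str(requested) != str(jwt_org_id):
        raise AuthorizationErrorSketch("org_id does not match authenticated org")


def require_permission_sketch(*permissions: str, require_all: bool = False):
    """Decorator factory guarding a resolver that takes a context dict first."""
    needed = set(permissions)

    def decorator(resolver: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(resolver)
        def wrapper(context: Dict[str, Any], *args: Any, **kwargs: Any) -> Any:
            held = set(context.get("permissions", ()))
            # Any-of by default; all-of when require_all is set (assumption).
            granted = needed <= held if require_all else bool(needed & held)
            if not granted:
                raise AuthorizationErrorSketch(
                    "missing permission: " + ", ".join(sorted(needed))
                )
            return resolver(context, *args, **kwargs)

        return wrapper

    return decorator
```

A resolver wrapped this way fails before doing any work when the caller lacks the permission, while the org-scope check runs once per operation rather than per field.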
6. Rate limiting
The API uses slowapi for general endpoint protection.
- GraphQL middleware coverage:
  - The GraphQL endpoint /graphql is mounted as a standard router on the FastAPI application (src/dev_health_ops/api/main.py:189).
  - The SlowAPIMiddleware wraps the FastAPI app to support rate limits (src/dev_health_ops/api/_middleware.py:73).
  - Because /graphql does not carry explicit @limiter.limit decorators, it currently bypasses rate limit constraints. This avoids introducing query-level friction on complex dashboard loads.
  - Rate-limit configuration and Redis storage setup reside in src/dev_health_ops/api/middleware/rate_limit.py.
7. Performance and N+1 prevention
Complex resolvers use batching loaders to eliminate N+1 query patterns.
- Context DataLoaders:
  - Scoped DataLoaders are constructed per request (src/dev_health_ops/api/graphql/context.py:98).
  - Core loaders:
    - team_loader and team_by_name_loader batch team retrieval.
    - repo_loader and repo_by_name_loader batch repository lookup.
    - loaders (DataLoaders.create(client)) batches analytics queries, grouping timeseries and breakdown queries by dimension, measure, and interval.
  - High-frequency resolvers, including analytics and work_graph, load through these interfaces to keep database roundtrips minimal.
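The batching idea behind these loaders can be shown with a small asyncio sketch. TinyBatchLoader is a hypothetical, simplified loader written for illustration; it is not the Strawberry DataLoader or the project's loader classes, and it omits caching across ticks. Keys requested in the same event-loop tick are collected and resolved with a single call to the batch function.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Hashable, List, Optional


class TinyBatchLoader:
    """Collects keys requested in one event-loop tick into a single batch."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], Awaitable[List[object]]]):
        self._batch_fn = batch_fn
        self._pending: Dict[Hashable, asyncio.Future] = {}
        self._dispatch_task: Optional[asyncio.Task] = None

    def load(self, key: Hashable) -> "asyncio.Future":
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._dispatch_task is None:
            # The task starts on the next loop iteration, after sibling loads.
            self._dispatch_task = loop.create_task(self._dispatch())
        return self._pending[key]

    async def _dispatch(self) -> None:
        pending, self._pending = self._pending, {}
        self._dispatch_task = None
        keys = list(pending)
        try:
            # batch_fn must return one value per key, in the same order.
            values = await self._batch_fn(keys)
        except Exception as exc:
            for future in pending.values():
                future.set_exception(exc)
            return
        for key, value in zip(keys, values):
            pending[key].set_result(value)


async def demo():
    calls = []

    async def fetch_teams(ids):
        calls.append(list(ids))  # one entry per database round trip
        return [f"team-{i}" for i in ids]

    loader = TinyBatchLoader(fetch_teams)
    names = await asyncio.gather(loader.load(1), loader.load(2), loader.load(1))
    return names, calls
```

In demo, three load calls produce one fetch_teams call with the deduplicated keys, which is the property that removes the N+1 pattern when many sibling fields resolve the same kind of entity.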
8. Audit logging and tracing
To support audits, GraphQL operations emit telemetry.
- Access logging:
  - Every GraphQL request flows through the FastAPI CorrelationIdMiddleware (src/dev_health_ops/api/_middleware.py:76).
  - This middleware injects a unique trace identifier (X-Request-ID) into logging contexts.
  - The context factory emits debug info upon context initialization (src/dev_health_ops/api/graphql/app.py:57).
- Distributed tracing:
  - OpenTelemetry: Integrated via instrument_fastapi_app(app) in src/dev_health_ops/api/_observability.py:32.
  - OpenTelemetry tracks individual resolver execution spans and database execution times.
- Error tracking:
  - Sentry: Captures unexpected execution errors via init_sentry() (src/dev_health_ops/api/main.py:22).
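How a request identifier reaches log lines can be sketched with the standard library alone. This is not the CorrelationIdMiddleware used by the application: bind_request_id and RequestIdFilter are hypothetical names, and reusing an incoming X-Request-ID header when present (generating one otherwise) is an assumption.

```python
import contextvars
import logging
import uuid
from typing import Mapping

# Holds the identifier for the request currently being handled.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copies the current request identifier onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


def bind_request_id(headers: Mapping[str, str]) -> str:
    """Reuse the caller's X-Request-ID if given, otherwise mint a new one."""
    request_id = headers.get("x-request-id") or uuid.uuid4().hex
    request_id_var.set(request_id)
    return request_id
```

With the filter attached to a handler whose format string includes %(request_id)s, every log line written while handling a request carries the same identifier, which is what makes a single GraphQL operation traceable across log entries.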
9. How to audit this in code
Review these canonical source files to verify the security posture details:
- src/dev_health_ops/api/graphql/security.py: Validation rules, depth and alias validators, request size limits, and environment checkers.
- src/dev_health_ops/api/graphql/app.py: Request context setup, GraphiQL mounting configuration, and authorization controls.
- src/dev_health_ops/api/graphql/cost.py: Cost budget ranges, batch constraints, and query validations.
- src/dev_health_ops/api/graphql/authz.py: Resolver permission decorators and error rules.
- src/dev_health_ops/api/graphql/extensions.py: Org scoping validation hook.
- src/dev_health_ops/api/graphql/context.py: Batching loader registrations.
- src/dev_health_ops/api/_middleware.py: Middleware stack order and custom HTTP size middleware mounting.