Monitoring & Observability

Integra provides built-in observability via Prometheus metrics and OpenTelemetry tracing.


Prometheus Metrics

Metrics are exposed at /metrics in the standard Prometheus text format.

Key Metrics

Metric                             Type       Description
http_requests_total                Counter    Total requests by status, method, and path.
http_request_duration_seconds      Histogram  Request latency distribution.
integra_parsing_duration_seconds   Histogram  Time spent parsing AL3 (excluding network/JSON overhead).
integra_validation_errors_total    Counter    Count of validation failures.
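
For ad-hoc checks outside Prometheus, the /metrics output can be read with a few lines of Python. The sketch below parses text in the Prometheus format and totals http_requests_total per status. The sample payload and its label values are invented for illustration; only the metric names come from the list above, and the parser deliberately ignores less common parts of the format.

```python
import re
from collections import defaultdict

# Hypothetical scrape output; label values are illustrative, not real Integra data.
SAMPLE = '''\
# HELP http_requests_total Total requests by status/method/path.
# TYPE http_requests_total counter
http_requests_total{status="200",method="POST",path="/v1/parse"} 120
http_requests_total{status="400",method="POST",path="/v1/parse"} 7
http_requests_total{status="200",method="GET",path="/health"} 30
integra_validation_errors_total 7
'''

# name, optional {labels}, then the sample value (a trailing timestamp is ignored).
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (name, labels, value) for each sample line, skipping comments."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels, float(match.group("value"))


def requests_by_status(text):
    """Sum http_requests_total across all series, grouped by the status label."""
    totals = defaultdict(float)
    for name, labels, value in parse_samples(text):
        if name == "http_requests_total":
            totals[labels.get("status", "unknown")] += value
    return dict(totals)


if __name__ == "__main__":
    print(requests_by_status(SAMPLE))
```

Counters only ever increase, so a one-off total like this is mainly useful as a sanity check; rates over time are better left to Prometheus.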

Dashboard (Grafana)

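Recommended panels:

  1. Request Rate: rate(http_requests_total[1m])
  2. Error Rate (4xx/5xx): rate(http_requests_total{status=~"4..|5.."}[1m])
  3. P99 Latency: histogram_quantile(0.99, rate(http_request_duration_seconds_bucket[5m]))

Each query returns one series per label combination. Wrap the rate in sum(...), or sum by (le) (...) for the quantile, to get a single service-wide line.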
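
To make the P99 panel less of a black box, here is a simplified Python sketch of the estimate histogram_quantile produces: it finds the first cumulative bucket that contains the target rank and interpolates linearly inside it. The bucket bounds and counts are made up, and edge cases that Prometheus handles (such as negative bounds or NaN) are left out.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Simplified version of the linear interpolation Prometheus applies.
    Requires 0 < q <= 1 and a +Inf bucket holding the total count.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("buckets must include a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # lowest bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            if bound == math.inf:
                # Target falls in the +Inf bucket: report the highest finite bound.
                return prev_bound
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical cumulative bucket counts for http_request_duration_seconds.
BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 98), (2.5, 100), (math.inf, 100)]

if __name__ == "__main__":
    print(histogram_quantile(0.99, BUCKETS))
```

With these sample buckets the 99th request out of 100 lands halfway through the 1.0 to 2.5 second bucket, so the estimate is about 1.75 seconds. The result is only as precise as the bucket boundaries allow.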


OpenTelemetry Tracing

Integra exports traces via OTLP (gRPC).

Configuration

export INTEGRA_TELEMETRY_ENABLED=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://otel-collector:4317
export OTEL_SERVICE_NAME=integra-prod
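
A small preflight check can catch typos in these settings before the service starts. The Python sketch below is only an illustration: treating all three variables as required is an assumption, and check_telemetry_env is a hypothetical helper, not part of Integra.

```python
from urllib.parse import urlsplit

# Assumption: all three variables from the configuration above must be set.
REQUIRED = (
    "INTEGRA_TELEMETRY_ENABLED",
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_SERVICE_NAME",
)


def check_telemetry_env(env):
    """Return a list of problems found; an empty list means the settings look sane."""
    problems = [f"{name} is not set" for name in REQUIRED if not env.get(name)]

    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT", "")
    if endpoint:
        parts = urlsplit(endpoint)
        if parts.scheme not in ("http", "https"):
            problems.append("endpoint should start with http:// or https://")
        try:
            port = parts.port
        except ValueError:
            port = None
        if port is None:
            problems.append("endpoint has no valid explicit port (OTLP/gRPC commonly uses 4317)")
    return problems


if __name__ == "__main__":
    import os

    for problem in check_telemetry_env(os.environ):
        print("telemetry config:", problem)
```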

Trace Structure

  • Root Span: POST /v1/parse
  • Child: al3.Parse (The core parsing logic)
  • Child: json.Marshal (Response generation)

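Use traces to find where time goes when a specific, complex AL3 file is slow: open the trace for that request and compare the al3.Parse and json.Marshal child spans against the root span's duration.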
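
The comparison can be sketched in Python as follows. The Span type and the timings are hypothetical stand-ins for data exported from a tracing backend; only the span names come from the structure above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    name: str
    parent: Optional[str]  # name of the parent span, None for the root
    start_ms: float
    end_ms: float

    @property
    def duration_ms(self):
        return self.end_ms - self.start_ms


def slowest_child(spans, root_name):
    """Return (child name, share of root duration) for the longest direct child."""
    root = next(s for s in spans if s.name == root_name and s.parent is None)
    children = [s for s in spans if s.parent == root_name]
    worst = max(children, key=lambda s: s.duration_ms)
    return worst.name, worst.duration_ms / root.duration_ms


# Hypothetical timings for one slow request.
SPANS = [
    Span("POST /v1/parse", None, 0.0, 200.0),
    Span("al3.Parse", "POST /v1/parse", 5.0, 165.0),
    Span("json.Marshal", "POST /v1/parse", 165.0, 195.0),
]

if __name__ == "__main__":
    name, share = slowest_child(SPANS, "POST /v1/parse")
    print(f"{name} takes {share:.0%} of the request")
```

In this made-up trace al3.Parse accounts for 160 of the 200 ms, which would point at the parser rather than response generation.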


Health Checks

Orchestrators should monitor /health.

  • 200 OK: Service is ready.
  • Fail: Service is stuck or shutting down.
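
Where an orchestrator needs a custom probe, the check can be as small as the Python sketch below. It treats anything other than a timely 200 as a failure, which is one reading of "Fail" above; the helper name and the two-second timeout are illustrative choices.

```python
import http.client
import urllib.request


def is_healthy(base_url, timeout=2.0):
    """Return True only if GET <base_url>/health answers 200 within the timeout."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and non-2xx responses
        # (urllib raises HTTPError, an OSError subclass, for those).
        return False
```

A probe script could call is_healthy with the service's base URL and exit non-zero on False so the orchestrator restarts or drains the instance.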