Skip to content

OpenTelemetry Integration

Haiway provides seamless integration with OpenTelemetry for distributed tracing, metrics collection, and structured logging. This integration allows you to observe your applications with industry-standard tooling while maintaining Haiway's functional programming principles.

Note: OpenTelemetry is supported in Haiway ONLY through GRPC. We do not support HTTP. Use the gRPC port in your collector (for example port 4317 in OtelCollector).

Overview

The OpenTelemetry integration in Haiway bridges the framework's observability abstractions with the OpenTelemetry SDK, enabling:

  • Distributed Tracing: Automatic span creation and context propagation across async operations
  • Metrics Collection: Counter, histogram, and gauge metrics with custom attributes
  • Structured Logging: Context-aware logs correlated with traces
  • External Trace Linking: Connect to existing distributed traces from other services

Quick Start

1. Installation

The OpenTelemetry integration requires additional dependencies:

pip install haiway[opentelemetry]
# or manually:
pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp

2. Configuration

Configure OpenTelemetry once at application startup before creating any observability scopes:

from haiway.opentelemetry import OpenTelemetry

# Configure for local development (console output)...
OpenTelemetry.configure(
    service="my-service",
    version="1.0.0",
    environment="development"
)

# ...or for production (OTLP export)
OpenTelemetry.configure(
    service="my-service",
    version="1.0.0",
    environment="production",
    otlp_endpoint="http://jaeger:4317",
    insecure=True,
    export_interval_millis=5000,
    attributes={
        "team": "backend",
        "component": "api"
    }
)

Configuration Options

Basic Configuration

Parameter Type Description
service str Name of your service
version str Version of your service
environment str Deployment environment (e.g., "production", "staging")

OTLP Export Configuration

Parameter Type Default Description
otlp_endpoint str \| None None OTLP endpoint URL. If None, uses console exporters
insecure bool True Whether to use insecure connections
export_interval_millis int 5000 Metrics export interval in milliseconds
attributes Mapping[str, Any] \| None None Additional resource attributes

3. Usage with Context

Use the configured OpenTelemetry observability in your Haiway contexts:

from haiway import ctx
from haiway.opentelemetry import OpenTelemetry
import asyncio

async def main():
    # Use in context scope
    async with ctx.scope(
        "application",
        # Create an OpenTelemetry-backed observability adapter
        observability=OpenTelemetry.observability()
    ):
        await process_requests()

async def process_requests():
    # Automatic span creation and context propagation
    async with ctx.scope("request-processing"):
        ctx.log_info("Processing batch of requests")

        # Record custom metrics
        ctx.record_info(
            metric="requests.processed",
            value=10,
            kind="counter",
        )

        # Record custom events
        ctx.record_info(
            event="batch.started",
            attributes={
                "batch_size": 10,
                "priority": "high",
            },
        )

        await process_individual_requests()

async def process_individual_requests():
    # Nested spans are automatically created
    async with ctx.scope("individual-request"):
        ctx.record_info(
            attributes={
                "request.id": "req-123",
                "user.id": "user-456",
            },
        )

        # Simulated work
        await asyncio.sleep(0.1)

        ctx.record_info(
            metric="request.duration",
            value=100,
            unit="ms",
            kind="histogram",
        )

Console vs OTLP Export

Console Export: When no OTLP endpoint is specified, OpenTelemetry console exporters are used.

OpenTelemetry.configure(
    service="my-service",
    version="1.0.0",
    environment="development"
    # otlp_endpoint=None (default) uses console exporters
)

OTLP Export: With an OTLP endpoint provided, telemetry is sent to that endpoint and not mirrored to the console.

OpenTelemetry.configure(
    service="my-service",
    version="1.0.0",
    environment="production",
    otlp_endpoint="http://collector:4317"
)

Distributed Tracing

Automatic Span Creation

Haiway automatically creates OpenTelemetry spans for each context scope that uses an OpenTelemetry-backed observability instance:

async def handle_request():
    async with ctx.scope("http-request"):  # Creates span "http-request"
        async with ctx.scope("database-query"):  # Creates child span "database-query"
            await query_database()

        async with ctx.scope("external-api"):  # Creates child span "external-api"
            await call_external_service()

External Trace Linking

Connect to existing distributed traces from other services:

# Link to an external trace using a W3C traceparent header
incoming_traceparent = request.headers.get("traceparent")

observability = OpenTelemetry.observability(
    traceparent=incoming_traceparent
)

async with ctx.scope("service-handler", observability=observability):
    # This root span gets a link to the external trace/span context
    await handle_service_request()

traceparent= creates an OpenTelemetry Link on the root span. It does not install the remote span as the direct parent.

To propagate the current active span to another system, use:

traceparent = OpenTelemetry.traceparent()

Trace Context Propagation

Traces automatically propagate across async operations:

async def parent_operation():
    async with ctx.scope("parent"):
        # Start concurrent operations - they inherit trace context
        tasks = [
            asyncio.create_task(child_operation(i))
            for i in range(3)
        ]
        await asyncio.gather(*tasks)

async def child_operation(task_id: int):
    async with ctx.scope(f"child-{task_id}"):
        # Each child gets its own span under the parent trace
        await asyncio.sleep(0.1)

Metrics Collection

Metric Types

Haiway supports three metric kinds through the level-specific ctx.record_* helpers:

# Counter: Monotonically increasing values
ctx.record_info(metric="requests.total", value=1, kind="counter")

# Histogram: Distribution of values (e.g., latencies, sizes)
ctx.record_info(metric="request.duration", value=150, unit="ms", kind="histogram")

# Gauge: Point-in-time values that can go up or down
ctx.record_info(metric="active_connections", value=42, kind="gauge")

Metric Attributes

Add dimensional data to metrics:

ctx.record_info(
    metric="requests.processed",
    value=1,
    kind="counter",
    attributes={
        "method": "POST",
        "endpoint": "/api/users",
        "status": "success",
    },
)

Custom Units

Specify units for better observability:

ctx.record_info(metric="response.size", value=1024, unit="byte", kind="histogram")
ctx.record_info(metric="cpu.usage", value=75.5, unit="percent", kind="gauge")
ctx.record_info(metric="request.rate", value=150, unit="1/s", kind="gauge")

Structured Logging

Context-Aware Logging

Logs are automatically correlated with active spans:

async with ctx.scope("user-service"):
    ctx.log_info("Processing user request")  # Correlated with span

    try:
        user = await fetch_user(user_id)
        ctx.log_debug("User fetched successfully: %s", user_id)
    except UserNotFound as e:
        ctx.log_error("User not found: %s", user_id, exception=e)

Log Levels

Control the minimum observability level by setting level=. The threshold applies to logs, events, metrics, and attributes:

from haiway.context import ObservabilityLevel

# Only log warnings and errors
observability = OpenTelemetry.observability(level=ObservabilityLevel.WARNING)

async with ctx.scope("critical-operation", observability=observability):
    ctx.log_debug("This won't be recorded")  # Below threshold
    ctx.log_warning("This will be recorded")  # At or above threshold

Event Recording

Custom Events

Record significant events with structured attributes:

ctx.record_info(
    event="user.login",
    attributes={
        "user_id": "user-123",
        "login_method": "oauth",
        "client_ip": "192.168.1.100",
        "success": True,
    },
)

ctx.record_info(
    event="cache.miss",
    attributes={
        "cache_key": "user:profile:123",
        "ttl": 3600,
    },
)

Business Events

Track business-relevant events:

ctx.record_info(
    event="order.created",
    attributes={
        "order_id": "ord-789",
        "customer_id": "cust-456",
        "total_amount": 99.99,
        "currency": "USD",
        "items_count": 3,
    },
)

Advanced Usage

Custom Resource Attributes

Add service-specific metadata:

OpenTelemetry.configure(
    service="payment-service",
    version="2.1.0",
    environment="production",
    otlp_endpoint="http://collector:4317",
    attributes={
        "service.namespace": "payments",
        "service.instance.id": os.environ.get("INSTANCE_ID"),
        "deployment.version": "v2.1.0-rc.1",
        "team": "payments-team",
        "region": "us-east-1"
    }
)

Error Handling and Status

On scope exit, spans are marked ERROR when the scope exits with an exception and OK otherwise. If you also want the exception attached to the span, log it with exception=...:

async with ctx.scope("risky-operation"):
    try:
        await potentially_failing_operation()
        # Span status: OK
    except Exception as e:
        # Span status: ERROR
        # Exception details attached to the active span:
        ctx.log_error("Operation failed", exception=e)
        raise

Attribute Normalization

Before sending data to OpenTelemetry, Haiway normalizes observability attributes:

  • None and MISSING values are skipped
  • sequences are filtered and exported as tuples
  • mapping values are flattened into dotted keys such as http.request_id

Example:

ctx.record_info(
    attributes={
        "request": {
            "id": "req-123",
            "method": "POST",
        },
        "tags": ["api", "critical"],
        "optional": None,
    },
)

This produces span attributes equivalent to:

request.id=req-123
request.method=POST
tags=("api", "critical")

Jaeger

OpenTelemetry.configure(
    service="my-service",
    version="1.0.0",
    environment="production",
    otlp_endpoint="http://jaeger-collector:14250",
    insecure=True,
)

Prometheus + Grafana

OpenTelemetry.configure(
    service="my-service",
    version="1.0.0",
    environment="production",
    otlp_endpoint="http://otel-collector:4317",
    export_interval_millis=10000,  # 10 second export interval
)

SigNoz

For self-hosted SigNoz:

OpenTelemetry.configure(
    service="my-service",
    version="1.0.0",
    environment="production",
    otlp_endpoint="http://signoz-otel-collector:4317",
    insecure=True,
    export_interval_millis=5000,
)

Best Practices

1. Service Naming

Use consistent service names across your organization:

# Good: Consistent with service discovery
OpenTelemetry.configure(service="user-service", ...)

# Avoid: Inconsistent naming
OpenTelemetry.configure(service="userSvc", ...)

2. Meaningful Span Names

Use descriptive span names that indicate the operation:

# Good: Descriptive operation names
async with ctx.scope("validate-user-permissions"):
    ...

async with ctx.scope("fetch-user-profile"):
    ...

# Avoid: Generic names
async with ctx.scope("operation"):
    ...

3. Attribute Consistency

Use consistent attribute names across your services:

# Good: Consistent attribute naming
ctx.record_info(
    attributes={
        "user.id": user_id,
        "user.role": user_role,
        "request.id": request_id,
    },
)

# Establish naming conventions:
# - Use dots for namespacing
# - Use snake_case for attribute names
# - Use consistent prefixes (user., request., etc.)

Troubleshooting

Common Issues

1. No telemetry data appearing

  • Verify OTLP endpoint is reachable
  • Check if OpenTelemetry.configure() was called before creating observability
  • Check that the haiway[opentelemetry] extra is installed
  • Ensure proper network connectivity to your observability backend

2. High memory usage

  • Consider increasing export intervals
  • Check if you're creating too many unique metric label combinations
  • Review span attribute cardinality

3. Missing trace correlation

  • Ensure observability is properly passed through context scopes
  • Verify the supplied traceparent value is a valid W3C traceparent string
  • Check that async context is properly propagated

Further Reading