Skip to content

ADR-0019: Unified Service Infrastructure — SilvaService Pattern

Status: Accepted • Date: 2026-02-21

AS-IS Update (2026-03-31): The architecture has since evolved into a dual-pattern: - SilvaService: Canonical base class for background workers and immutable services. - ServiceContext: Modern primitive for HTTP/FastAPI services utilizing FastAPI lifespans directly without subclassing SilvaService.

Mentions of "current 3 services" reflect the historical state at the time of writing. Furthermore, core modules like silvasonic.core.redis are now fully encapsulated by the Context.

1. Context & Problem

Silvasonic consists of 13 services across two tiers. Each service needs:

  • Health monitoring (HTTP endpoint for Podman probes)
  • Status reporting (Redis heartbeat for Web-Interface)
  • Structured logging
  • Graceful shutdown (SIGTERM/SIGINT)
  • Pydantic-based configuration

Without a unified pattern, each service implements these concerns independently, leading to inconsistency, code duplication, and higher maintenance cost. The risk grows as the service count increases from the current 3 (database, controller, recorder) to the planned 13.

2. Decision

We chose: A SilvaService base class in silvasonic.core.service that provides the canonical lifecycle for background workers and immutable Python services.

2.1. Service Classification

Category Services Behavior at Runtime
Immutable Recorder, Uploader, BirdNET, BatDetect, Weather, Processor Config injected at start (env vars / DB read on init). No runtime commands. Restart to reconfigure.
Mutable Controller, Web-Interface Maintain and change state at runtime. React to events and user input.
Infrastructure Database, Redis, Gateway, Icecast, Tailscale External services managed by Compose/Quadlets. Not Python services.

[!NOTE] The Processor is classified as immutable despite being Tier 1. Its behavior (polling interval, retention thresholds) is configured at startup. Changing these parameters requires a container restart — identical to the Recorder pattern.

2.2. Unified Lifecycle

Every Python service follows this exact sequence:

async def main() -> None:
    service = SilvaService(
        name="recorder",
        instance_id="ultramic-01",  # Singletons: instance_id = name
        port=9500,
    )

    # 1. Logging — MUST be first
    service.configure_logging()

    # 2. Health Server — HTTP /healthy on :port (Podman/Compose probes)
    service.start_health_server()

    # 3. Redis Connection — best-effort, non-blocking
    #    If Redis is unreachable: logs warning, continues without heartbeat
    await service.connect_redis()

    # 4. Heartbeat Loop — fire-and-forget, periodic
    #    Two Redis operations per heartbeat (both with 50ms timeout):
    #      SET silvasonic:status:<instance_id> <payload> EX <TTL>
    #      PUBLISH silvasonic:status <payload>
    service.start_heartbeat()

    # 5. Service-specific logic (override)
    await record_audio()

    # 6. Graceful Shutdown — SIGTERM + SIGINT
    await service.wait_for_shutdown()

2.3. Two Independent Health Channels

Every service exposes health via two channels that read from the same HealthMonitor singleton:

Channel Transport Consumer Purpose
HTTP /healthy HTTP (synchronous) Podman/Compose Container orchestration: "should this container restart?"
Redis Heartbeat Redis (async, fire-and-forget) Web-Interface Application status: "what is this service doing?"

[!IMPORTANT] HTTP health is mandatory — Podman healthchecks require HTTP. Redis heartbeats are the best-effort complement for rich, push-based, aggregated status. These are not redundant channels; they serve different consumers with different requirements.

2.4. Heartbeat Payload Schema

All heartbeats use the same JSON schema:

{
  "service": "recorder",
  "instance_id": "ultramic-01",
  "timestamp": 1706612400.123,
  "health": {
    "status": "ok",
    "components": {
      "recording": { "healthy": true, "details": "" },
      "disk_space": { "healthy": true, "details": "82% free" }
    }
  },
  "activity": "recording",
  "meta": {
    "resources": {
      "cpu_percent": 12.3,
      "memory_mb": 87.2,
      "num_threads": 4,
      "storage_used_gb": 142.7,
      "storage_total_gb": 476.9,
      "storage_percent": 30.0
    },
    "db_level": -45.2
  }
}

The meta.resources block is automatically populated by the ResourceCollector (part of SilvaService). Services that have a workspace_path also get storage metrics. Service-specific custom fields (e.g. db_level) are added by the service itself.

[!NOTE] The Controller additionally includes host-level metrics in its heartbeat under meta.host_resources (total CPU, total RAM, total disk). This enables the Web-Interface dashboard to show system-wide resource utilization in addition to per-service metrics.

{
  "service": "controller",
  "meta": {
    "resources": { "cpu_percent": 3.1, "memory_mb": 45.0, "num_threads": 2 },
    "host_resources": {
      "cpu_percent": 23.5,
      "cpu_count": 4,
      "memory_used_mb": 2048.0,
      "memory_total_mb": 8192.0,
      "memory_percent": 25.0,
      "storage_used_gb": 142.7,
      "storage_total_gb": 476.9,
      "storage_percent": 30.0
    }
  }
}

2.5. New Core Modules

Module Purpose Used By
silvasonic.core.service SilvaService base class — canonical lifecycle All services
silvasonic.core.heartbeat HeartbeatPublisher — async fire-and-forget Redis heartbeats All services
silvasonic.core.redis get_redis_connection() — best-effort connect, auto-reconnect All services

These extend the existing shared modules:

Module (existing) Purpose
silvasonic.core.health.HealthMonitor Thread-safe singleton for component status
silvasonic.core.health.start_health_server Background HTTP server on /healthy
silvasonic.core.logging.configure_logging Structured logging (Rich in dev, JSON in prod)
silvasonic.core.settings.DatabaseSettings Pydantic-based config from env vars

3. Options Considered

  • No base class (copy-paste lifecycle): Rejected. Already causing drift between controller and recorder implementations. Maintenance cost grows with each new service.
  • Framework-based approach (e.g., Nameko, FastStream): Rejected. Adds a heavy runtime dependency for a simple lifecycle pattern. Silvasonic services are not complex enough to warrant a framework.
  • Recorder without Redis (Controller proxies status): Rejected. Creates a non-uniform pattern. The Controller would need to poll Recorder health via HTTP and re-publish to Redis — adding latency, code, and a different status path for Tier 2 vs. Tier 1 services.

4. Consequences

  • Positive:
    • Uniform: Every service follows the same lifecycle, reports status the same way, and shuts down the same way.
    • Minimal code per service: New services only implement their domain logic — health, heartbeat, logging, and shutdown are inherited.
    • Testable: The SilvaService base class can be unit-tested once; individual services test only their overrides.
    • Redis dependency is lightweight: redis-py is ~60 KB pure Python. Fire-and-forget heartbeats add negligible overhead.
  • Negative:
    • redis-py becomes a transitive dependency for all services, including the Recorder.
    • Services cannot be tested in complete isolation from Redis (though the heartbeat silently degrades if Redis is unavailable).
    • The SilvaService abstraction must remain thin — feature creep in the base class would affect all services.