Uploader¶
Status: planned - Not implemented · Tier: 2 · Instances: Multi-instance: one per remote storage target
[!WARNING] Drift Warning: This is a highly detailed TO-BE specification. Implementation details like
uploaderservice internals and the exact database orchestration are highly susceptible to drift once development begins.
TO-BE: Data exfiltration service responsible for compressing Raw recordings to FLAC and synchronizing them to remote storage providers. Ensures the field device never depends on network connectivity — recordings are safely stored locally first (Store & Forward).
1. The Problem / The Gap¶
- No Off-Device Backup: Without an Uploader, all recordings exist only on the local NVMe. Hardware failure, theft, or SD card corruption means permanent data loss.
- Storage Limits: Even with the Janitor managing retention, local NVMe capacity is finite. Exfiltrating data to the cloud frees local space for continued recording.
2. User Benefit¶
- Data Safety: Recordings are automatically archived to remote storage (Nextcloud, S3, SFTP).
- Indefinite Recording: Once a recording has been successfully uploaded to all currently active remotes, the Janitor may treat the local copy as deletable under the normal retention rules.
- Multi-Target: Different storage providers for different purposes (e.g., Nextcloud for sharing, S3 for long-term archive).
3. Core Responsibilities¶
Inputs¶
- Database (Polling): Reads
recordingstable to find files not yet present in theuploadstable for this specific remote target (per-remote audit log). - Filesystem (Read-Only): Resolves local source files by joining the mounted recorder workspace root with the relative path stored in
recordings.file_raw.sensor_idis used for identity and remote path building, not for reconstructing local filesystem paths. - Database (Config): Reads
storage_remotestable for connection parameters (endpoint, credentials, path).
Processing¶
- FLAC Compression: Converts Raw WAV to FLAC for bandwidth-efficient upload (lossless, ~50% size reduction).
- Upload: Transfers FLAC files to the configured remote storage provider.
- Audit Logging: Records upload attempts in the
uploadstable (immutable audit log).
Outputs¶
- FLAC Files uploaded to remote storage.
- Database Rows: INSERTs into
uploadstable (success/failure, file size, errors). - Database Updates: Sets
uploaded=trueanduploaded_aton therecordingsrow ONLY after a confirmed upload to ALL currently active remotes.uploaded_atrecords the timestamp at which the recording first reached the “all currently active remotes uploaded successfully” state. Inactive remotes are ignored. This serves as the Janitor deletion signal.
[!CAUTION] Contract Test Required: The
uploaded=truelogic is extremely load-bearing because the Processor Janitor relies on it to safely delete files. Once implemented, this MUST be backed by a strong integration/contract test verifying that a file is never marked as uploaded if even one active remote target failed.
- Redis Heartbeats: Via
SilvaServicebase class (ADR-0019).
4. Operational Constraints & Rules¶
| Aspect | Value / Rule |
|---|---|
| Immutable | Yes — config at startup, restart to reconfigure (ADR-0019) |
| DB Access | Yes — reads recordings, storage_remotes; writes uploads |
| Concurrency | Async event loop, sequential uploads per instance |
| State | Stateless (runtime) — upload progress not persisted |
| Privileges | Rootless (ADR-0007) |
| Resources | Medium — FLAC encoding is CPU-intensive for large files |
| QoS Priority | oom_score_adj=250 — Low Priority, below recording priority (ADR-0020) |
[!NOTE] Each Uploader instance is managed by the Controller as a Tier 2 container. The Controller creates one Uploader per active
storage_remotesentry. If the number of active remotes exceedsmax_uploaders(configured insystem_config), the Controller schedules instances deterministically ordered bycreated_at ASC, slug ASC.
5. Configuration & Environment¶
Static Configuration (Environment Variables & Mounts)¶
| Variable / Mount | Description | Default / Example |
|---|---|---|
| Health port | Internal health endpoint | 9500 |
SILVASONIC_STORAGE_REMOTE_SLUG |
Identifier for the storage target (DB lookup key) | nextcloud-main |
SILVASONIC_REDIS_URL |
Redis connection URL | redis://redis:6379/0 |
SILVASONIC_INSTANCE_ID |
Instance identifier for heartbeats | nextcloud-main |
POSTGRES_HOST, POSTGRES_PORT, etc. |
DB connection (reads storage_remotes, recordings, system_config) |
(from Controller injection) |
Recorder workspace :ro |
Recorder workspace (read-only, Consumer Principle, ADR-0009) | (bind mount) |
[!NOTE] Remote type, endpoint, and credentials are stored in the
storage_remotes.configJSONB column and read from the database on startup — not via environment variables. The Uploader generates a temporaryrclone.confat runtime from this data (see Milestone v0.6.0 Phase 4).
Dynamic Configuration (Database)¶
Runtime-tunable settings stored in the system_config table under key uploader. As an Immutable Container (ADR-0019), the Uploader reads these settings once on startup.
| Setting | Description | Default / Example |
|---|---|---|
enabled |
Global toggle for upload activity | true |
poll_interval |
Seconds between checking DB for work | 30 |
bandwidth_limit |
Rclone --bwlimit parameter String |
"1M" |
schedule_start_hour |
Opt-in: hour to start daily upload window | null (24/7) |
schedule_end_hour |
Opt-in: hour to end daily upload window | null (24/7) |
Update Mechanism (State Reconciliation):
1. User changes settings in Web UI.
2. Frontend updates system_config in DB and publishes a silvasonic:nudge event to the Controller (per ADR-0017).
3. The Controller restarts the Uploader container(s).
4. The Uploader reads the new UploaderSettings from the database upon startup.
6. Technology Stack¶
- Base Image:
python:3.13-slim-bookworm(with rclone installed) - FLAC Encoding:
ffmpeg(via Pythonsubprocess). Highly optimized, stable for large files, and streams data without blowing up container memory. - Upload Protocols:
rclone(system binary via Python wrapper). Serves as a universal backend for WebDAV, S3, SFTP, and dozens of other protocols without needing protocol-specific Python libraries. - Database:
sqlalchemy(2.0+ async),asyncpg
7. Open Questions & Future Ideas¶
- Parallel uploads: Multiple files concurrently for faster throughput.
- Resume support: Track partial uploads to avoid re-uploading on interruption.
- Bandwidth throttling: Avoid saturating the network link during peak hours.
8. Out of Scope¶
- Does NOT record audio (Recorder's job).
- Does NOT delete local files (Janitor's job in Processor).
- Does NOT analyze audio (BirdNET / BatDetect's job).
- Does NOT manage its own lifecycle (Controller starts/stops Uploader instances).
9. References¶
- Database Schema (DDL) — authoritative definition of
uploadsandstorage_remotestables - ADR-0011 — Raw → FLAC for cloud, Retention Policy
- ADR-0013 — Tier 2 lifecycle management
- ADR-0019 — Immutable Container, SilvaService
- ADR-0023 — Configuration Management (upload settings)
- Glossary: Uploader, Store & Forward, Raw Artifact, Storage Remote
- ROADMAP.md — milestone (v0.6.0)