Quick facts

Profile: No-ZK, single-node ClickHouse
State: Functional; cosmetic OutOfSync in Argo CD pending an ignoreDifferences rule

What it is

SigNoz collects traces and metrics from in-cluster workloads. Argo CD currently reports the app as OutOfSync, but the drift is cosmetic: the Kubernetes API server adds default fields (schedulerName, dnsPolicy, terminationGracePeriodSeconds, etc.) that aren't in the source manifests. Functionally the install is healthy.
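
A minimal sketch of what the pending ignoreDifferences rule could look like, assuming the install is tracked by an Argo CD Application and that the defaulted fields sit under the pod template of the Deployments and the clickhouse StatefulSet. Only the relevant part of the Application spec is shown, and the pointer list simply mirrors the fields named above; it should be checked against the actual Argo diff before use.

```yaml
# Fragment of an Argo CD Application spec (assumed layout, not the lab's actual file).
spec:
  ignoreDifferences:
    # Pod-template fields that the API server fills in with defaults.
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/template/spec/schedulerName
        - /spec/template/spec/dnsPolicy
        - /spec/template/spec/terminationGracePeriodSeconds
    - group: apps
      kind: StatefulSet
      jsonPointers:
        - /spec/template/spec/schedulerName
        - /spec/template/spec/dnsPolicy
        - /spec/template/spec/terminationGracePeriodSeconds
```

Keeping the pointers this narrow means real drift elsewhere in the pod spec would still show up as OutOfSync.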

Architecture

Per-cluster install. Components: frontend (the SigNoz web UI; nginx serving the React SPA), query-service (Go backend reading from ClickHouse), otel-collector + otel-collector-metrics (OpenTelemetry receivers + processors writing into ClickHouse), and a single-node clickhouse StatefulSet on Longhorn.
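
For orientation, a single-node ClickHouse StatefulSet backed by Longhorn might look roughly like the sketch below. This is not the lab's manifest: the labels, the untagged image, the StorageClass name "longhorn", and the volume size are all assumptions for illustration. The namespace is inferred from the collector's in-cluster address.

```yaml
# Sketch only: labels, image tag, StorageClass name, and size are assumptions.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: clickhouse
  namespace: signoz
spec:
  serviceName: clickhouse
  replicas: 1                        # single node, so no ZooKeeper coordination
  selector:
    matchLabels:
      app: clickhouse
  template:
    metadata:
      labels:
        app: clickhouse
    spec:
      containers:
        - name: clickhouse
          image: clickhouse/clickhouse-server   # pin a tag in practice
          ports:
            - name: http
              containerPort: 8123
            - name: native
              containerPort: 9000
          volumeMounts:
            - name: data
              mountPath: /var/lib/clickhouse    # ClickHouse data directory
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: longhorn              # assumed StorageClass name
        resources:
          requests:
            storage: 20Gi                       # placeholder size
```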

The "no ZK" profile drops the ZooKeeper dependency that older SigNoz layouts required for ClickHouse coordination; single-node ClickHouse doesn't need it. A previous deploy with ZK left orphan ConfigMaps behind, and Argo's diff against those is another source of the cosmetic OutOfSync (MR #7-#8).

Configuration

Source: clusters/<cluster>/manifests/signoz/ — raw manifests adapted from the SigNoz Helm chart values, simplified for the lab's single-node ClickHouse.

Receivers exposed: OTLP gRPC (4317), OTLP HTTP (4318), Jaeger Thrift (14268). Apps in the cluster send traces and metrics by setting OTEL_EXPORTER_OTLP_ENDPOINT to http://otel-collector.signoz:4317 (the OTLP gRPC port).
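
As an illustration, a workload could point its OpenTelemetry SDK at the collector through its pod template, roughly as sketched below. The app name and image are hypothetical. OTEL_EXPORTER_OTLP_PROTOCOL and OTEL_SERVICE_NAME are standard OpenTelemetry SDK variables, though support for them varies by language SDK.

```yaml
# Fragment of a workload's Deployment; "my-app" and the image are hypothetical.
spec:
  template:
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:latest   # placeholder image
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://otel-collector.signoz:4317
            - name: OTEL_EXPORTER_OTLP_PROTOCOL
              value: grpc          # 4317 is the OTLP gRPC receiver
            - name: OTEL_SERVICE_NAME
              value: my-app        # service name shown on traces
```

An SDK that only speaks OTLP over HTTP would presumably target port 4318 instead, with the protocol set to http/protobuf.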

No OIDC integration today — the SigNoz UI uses its own user database. Auth-via-Keycloak is on the wishlist.

Operations

Failover

Per-cluster — no cross-cluster trace/metric replication. If DC dies, the trace history living in DC's ClickHouse goes with it (until VM-level recovery). DR's SigNoz only has traces from DR-side workloads.

No edge HAProxy backend is wired for SigNoz yet. Plan-09 candidate.

References