Redis (RedisFailover)
3 Redis + 3 Sentinel pods per cluster — the platform's KV store, fed by the Kafka WAL.
Overview
The platform's KV store. Each cluster runs an independent Redis HA group (one primary, two replicas) fronted by three Sentinels for in-cluster failover. DC and DR are not directly replicated at the Redis layer — instead, Kafka acts as a write-ahead log and the redis-applier replays the WAL into the local Redis primary on each side. The result is two eventually-consistent copies of the same dataset, decoupled at the storage layer but coupled at the streaming layer.
This pattern (ADR-0018) was chosen because Redis-native replication across heterogeneous regions is fragile (split-brain, cascading auth, network blips), but Kafka MM2 between two RKE2 clusters is well-understood and already needed for other replication paths.
The Redis layer (per cluster)
Topology
Per-cluster, the Redis layout is the standard Spotahome RedisFailover arrangement: one primary, two replicas, three Sentinels watching them, plus the local redis-applier consumer (covered in the next section).
┌── one cluster (DC or DR) ────────────────────────────────┐
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ sentinel │ │ sentinel │ │ sentinel │ │
│ │ pod │ │ pod │ │ pod │ quorum=2 │
│ └─────┬────┘ └─────┬────┘ └─────┬────┘ │
│ └──────────────┼──────────────┘ │
│ │ monitors │
│ ┌──────────────┼──────────────┐ │
│ ┌─────▼────┐ ┌─────▼────┐ ┌─────▼────┐ │
│ │ redis │ │ redis │ │ redis │ │
│ │ primary │ │ replica │ │ replica │ PVC each │
│ └──────────┘ └──────────┘ └──────────┘ │
│ │
│ ┌────────────────┐ │
│ │ redis-applier │ ← consumes redis-writes from local │
│ │ (3 pods) │ Kafka and writes to primary │
│ └────────────────┘ │
│ │
└──────────────────────────────────────────────────────────┘
Setup & configuration
Deployed via Argo CD. Each cluster's app definition under clusters/<dc|dr>/apps/redis/ renders a RedisFailover CR that the Spotahome operator reconciles.
- Storage: Longhorn StorageClass; one PVC per Redis pod. Sentinels are stateless.
- Resources: Sized for POC traffic; no eviction policy enforced beyond the operator default. maxmemory is left to the OS / PVC capacity for now (see Improvements).
- AOF / RDB: Default operator settings. AOF on the primary catches recent writes that have arrived via redis-applier but haven't yet been replicated to replicas.
- Auth: No password is enforced inside the namespace today (POC scope). Network reachability is constrained by NetworkPolicies and the cluster ingress posture. Production-equivalent would put a Sentinel-known password in a Secret and reference it in the RedisFailover CR.
- Cross-namespace plumbing: redis-applier lives in the redis namespace; it consumes from Kafka in the kafka namespace via SCRAM. Two secrets are bridged across namespaces today (kafka-cluster-ca-cert, redis-applier SCRAM creds) — imperative copy at bootstrap time; ExternalSecrets via Vault is on the roadmap.
In-cluster failover
Sentinel quorum (2 of 3) elects a new primary in seconds. Sentinel-aware clients receive a +switch-master notification, re-resolve, and continue. The redis-applier reconnects via Sentinel and resumes consuming from its committed offset, so no Kafka messages are lost during a primary swap.
Day-to-day operations
- Inspect cluster state:
  kubectl -n redis get redisfailover
  kubectl -n redis get pods -l 'app.kubernetes.io/component in (redis, sentinel)'
  # Find the current primary:
  kubectl -n redis exec -it rfs-redis-0 -- redis-cli -p 26379 sentinel get-master-addr-by-name mymaster
- Inspect applier consumer lag:
  kubectl -n kafka exec -it kafka-broker-0 -- /opt/kafka/bin/kafka-consumer-groups.sh \
    --bootstrap-server localhost:9092 \
    --describe --group redis-applier
- Backup: Longhorn snapshots cover the AOF/RDB on disk. Logical backup via BGSAVE + Longhorn snapshot is the recommended belt-and-braces approach. There is no scheduled cronjob for Redis backups today.
- Restore: The authoritative source is the Kafka WAL. To restore Redis from scratch, wipe the PVC, let the operator recreate the pods, and replay the topic with --reset-offsets --to-earliest on the redis-applier consumer group. Retention on the topic governs how far back you can restore — see Improvements.
- Dashboards: RedisInsight (tool page) for ad-hoc inspection. Per-pod metrics will flow into SigNoz once the operator's metrics exporter is wired (currently not).
Cross-cluster: the Kafka WAL pattern
Flow
Across clusters, the Kafka WAL stitches the two independent Redis instances into one eventually-consistent dataset. App writes go into Kafka rather than directly into Redis; on each side, the local redis-applier consumes the same topic and materialises it into the local Redis primary.
┌──── DC cluster ────┐ ┌──── DR cluster ────┐
│ │ │ │
│ app ─────write──▶│ Kafka │ │
│ │ redis-writes │ │
│ │ topic │ │
│ │ │ │ │
│ redis-applier ◀───┘ │ │ │
│ │ │ │ │
│ ▼ apply │ │ │
│ Redis primary │ │ │
│ │ │ │ │
│ Sentinels │ │ │
│ ▼ │ │
│ MM2 ─────────▶│ Kafka │
│ │ redis-writes topic │
│ │ │ │
│ │ ▼ │
│ │ redis-applier │
│ │ │ │
│ │ ▼ apply │
│ │ Redis primary │
│ │ Sentinels │
└─────────────────────────────────────┴────────────────────┘
Two properties make this work:
- Ordering is preserved per key. The topic is partitioned by Redis key, so all writes for the same key land on the same partition and are consumed and replayed in the same order on both clusters (see the partitioner sketch after this list).
- Each cluster is self-contained at read time. The Sentinels and Redis pods on one side never talk to the other side. A network partition between DC and DR doesn't affect read availability on either side; it only stretches the replication lag.
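To illustrate why the per-key ordering holds, a minimal Java sketch of what Kafka's default partitioner does when the Redis key is used as the record key. The partition count and the example key are assumptions; only the hashing scheme (murmur2 of the key bytes, modulo partition count) reflects the default partitioner's behaviour for keyed records.

import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.utils.Utils;

public class PartitionForKey {
    public static void main(String[] args) {
        int numPartitions = 6; // assumed partition count of redis-writes; the real value is topic config

        // With the Redis key as the Kafka record key, the default partitioner hashes the key bytes
        // and takes the result modulo the partition count, so every op for "orders:42" lands on the
        // same partition on DC, and (keys preserved by MM2) on DR as well.
        byte[] keyBytes = "orders:42".getBytes(StandardCharsets.UTF_8);
        int partition = Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;

        System.out.println("orders:42 -> partition " + partition);
    }
}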
How redis-applier consumes Kafka
The redis-applier is a small Go consumer (16.9 MB distroless, v0.1.1) running 3 replicas per cluster. Its contract:
- Consumes from local Kafka, topic redis-writes, consumer group redis-applier.
- Authenticates via SCRAM-SHA-512 as KafkaUser redis-applier on the internal listener (scram, port 9095, TLS).
- Decodes each message into a Redis op (key, command, args, optional TTL); an illustrative envelope is sketched after this list.
- Applies the op against the local Redis primary (resolved via Sentinel).
- Reports ready on /readyz only after the consumer group has joined and partitions have been assigned (the v0.1.1 fix — earlier v0.1.0 only flipped readyz after the first message, which broke rolling restarts on quiet topics).
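The applier itself is Go; purely to illustrate the contract above, here is a hedged Java sketch of the op envelope and the apply step. The field names (cmd, key, value, ttlSeconds) and the JSON-style encoding are assumptions, not the actual wire format (which is whatever producers and the applier currently agree on).

import io.lettuce.core.SetArgs;
import io.lettuce.core.api.sync.RedisCommands;

// Assumed envelope: one Kafka message decodes into one Redis op. Field names are illustrative only.
record RedisOp(String cmd, String key, String value, Long ttlSeconds) {}

class OpApplier {
    // Applies a decoded op against the local primary; the command handle is obtained via Sentinel,
    // as in the Lettuce example further down this page.
    static void apply(RedisCommands<String, String> redis, RedisOp op) {
        switch (op.cmd()) {
            case "SET" -> {
                if (op.ttlSeconds() != null) {
                    redis.set(op.key(), op.value(), SetArgs.Builder.ex(op.ttlSeconds()));
                } else {
                    redis.set(op.key(), op.value());
                }
            }
            case "DEL" -> redis.del(op.key());
            default -> throw new IllegalArgumentException("unsupported op: " + op.cmd());
        }
    }
}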
On DR, MirrorMaker 2 mirrors redis-writes from DC Kafka into DR Kafka with the original partition keys preserved. The DR redis-applier sees the same messages in the same per-key order and produces the same Redis state, modulo replication lag. End-to-end was validated on 2026-05-05 (11:21 UTC).
DC ⟷ DR cutover semantics
Today there is no edge HAProxy frontend for Redis itself, so DC/DR cutover for Redis is implicit: the application's Kafka client (which is failover-aware via the kafka-rke2-be backup backend) starts producing into DR Kafka when DC drops. The DR redis-applier — which has been steadily replaying the mirrored topic the whole time — keeps writing into DR Redis. The application's read path needs to switch to DR Redis at the same time; this is currently the application's responsibility.
Lag during a cutover is the sum of:
- Producer round-trip when DC drops (one TCP retry to DR Kafka).
- MM2 lag on messages already in DC Kafka that hadn't yet replicated (typically sub-second; bounded by MM2 commit interval).
- DR redis-applier consumer lag on its local topic (typically near-zero in steady state).
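In steady state each of these terms is small, so a cutover typically leaves no more than a couple of seconds of writes in flight before DR Redis has caught up.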
Client guidance
There are two distinct client roles. Pick the right one for what you're building.
1. Application writes (durable, replicated)
Don't write to Redis directly from the application. Instead, produce a structured Redis op message into Kafka topic redis-writes. The redis-applier on the local side will replay it into Redis; MM2 + the DR applier will replay it on DR.
Recommended client setup (a minimal producer sketch follows the list):
- Bootstrap: bootstrap.kafka.apps.sub.comptech-lab.com:443 (edge-HAProxy SNI-routed, DC-primary + DR-backup).
- Security: SASL_SSL + SCRAM-SHA-512, identity scoped per app (e.g. jboss-client with shared SCRAM password across DC/DR; pattern documented on the Kafka page).
- Topic: redis-writes. Partition key = Redis key (so all ops for one key are ordered).
- Truststore: the combined PEM at ~/cloud-init/kafka-client/kafka-combined-ca.pem covers both cluster CAs, so post-failover TLS still validates.
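Putting those settings together, a minimal producer sketch. The truststore mount path, the environment variable for the SCRAM password, and the JSON op envelope are assumptions; align the envelope with whatever format the redis-applier expects.

import java.util.Properties;
import org.apache.kafka.clients.CommonClientConfigs;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.config.SaslConfigs;
import org.apache.kafka.common.config.SslConfigs;
import org.apache.kafka.common.serialization.StringSerializer;

public class RedisWriteProducer {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "bootstrap.kafka.apps.sub.comptech-lab.com:443");
        props.put(CommonClientConfigs.SECURITY_PROTOCOL_CONFIG, "SASL_SSL");
        props.put(SaslConfigs.SASL_MECHANISM, "SCRAM-SHA-512");
        props.put(SaslConfigs.SASL_JAAS_CONFIG,
            "org.apache.kafka.common.security.scram.ScramLoginModule required "
            + "username=\"jboss-client\" password=\"" + System.getenv("KAFKA_SCRAM_PASSWORD") + "\";");
        // Combined CA bundle so TLS still validates after a DC -> DR failover (assumed mount path).
        props.put(SslConfigs.SSL_TRUSTSTORE_TYPE_CONFIG, "PEM");
        props.put(SslConfigs.SSL_TRUSTSTORE_LOCATION_CONFIG, "/etc/kafka/kafka-combined-ca.pem");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String redisKey = "orders:42";
            // Assumed op envelope; the record key is the Redis key so per-key order is preserved.
            String op = "{\"cmd\":\"SET\",\"key\":\"orders:42\",\"value\":\"pending\",\"ttlSeconds\":3600}";
            producer.send(new ProducerRecord<>("redis-writes", redisKey, op)).get();
        }
    }
}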
2. Application reads (low-latency, local)
Reads go directly to the local Redis primary, via Sentinel discovery. Use a Sentinel-aware client; do not hard-code a Redis pod address.
- Sentinel service (in-cluster): rfs-redis.redis.svc.cluster.local:26379
- Master name: the value of spec.sentinel.customConfig's monitor entry on the RedisFailover CR (Spotahome default is mymaster if not overridden — confirm with kubectl -n redis get redisfailover redis -o yaml).
- Reconnect on Sentinel notification: Sentinel pushes +switch-master events; the client must subscribe and re-resolve. All mainstream Sentinel clients (Lettuce, Jedis, Redisson, ioredis, redis-py) handle this automatically when configured for Sentinel mode.
Java / JBoss example (Lettuce, Sentinel)
import java.time.Duration;
import io.lettuce.core.RedisClient;
import io.lettuce.core.RedisURI;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.api.sync.RedisCommands;

RedisURI uri = RedisURI.Builder
    .sentinel("rfs-redis.redis.svc.cluster.local", 26379, "mymaster")
    .withTimeout(Duration.ofSeconds(2))
    .build();
RedisClient client = RedisClient.create(uri);
StatefulRedisConnection<String,String> conn = client.connect();
RedisCommands<String,String> redis = conn.sync();
// Reads only — writes go through Kafka redis-writes
String v = redis.get("orders:42");
Off-cluster clients
Redis is not currently exposed via the edge HAProxy. Off-cluster reads aren't supported today — if you need them, the right path is a TCP backend on HAProxy with DC-primary/DR-backup and a Sentinel-aware client pointing at the public hostname. Until that's wired, off-cluster traffic should go via an in-cluster service (HTTP API, gateway, etc.) that fronts Redis.
Evaluation
Strengths
- Decoupled durability. Redis can lose any pod (or its whole PVC) and the data is recoverable from Kafka — no separate Redis backup tier needed.
- Standard Redis on the read path. Any Sentinel-aware client library works. Apps don't need a custom SDK.
- Clean DC/DR semantics. Cross-cluster replication is the same Kafka mechanism we use elsewhere — no Redis-specific replication topology, no split-brain to reason about.
- Audit log for free. Every Redis-mutating op is in Kafka with retention, partitioned and ordered per key. Replay, debug, or fork into a separate consumer at will.
- Per-cluster blast radius. A bad apply on DC doesn't immediately propagate to DR — there's a built-in delay (and the option to pause MM2) where a runaway can be contained.
- Operator-driven lifecycle. Spotahome handles primary election, rolling restarts, replica catch-up. We don't run Sentinel by hand.
Weaknesses & known limitations
- Eventual consistency. DR Redis is always slightly behind DC. Read-after-write only holds if the read goes to the same cluster the write went through.
- Application contract is load-bearing. Every Redis-mutating call must produce a well-formed message to the topic. A buggy producer that writes directly to Redis (bypassing Kafka) creates a divergence that will not be repaired by replay.
- No transactional cross-key ops. Redis MULTI/EXEC across keys isn't preserved if the keys hash to different partitions — they'll arrive on different consumer threads and the atomicity is lost. Either keep transactional sets under a single key, or pin them to a single partition manually.
- TTL semantics need care. EXPIRE replays as relative time, so a TTL set in DC at T and replayed in DR at T+lag has a slightly different absolute expiry. Long-lived TTLs are fine; short ones can drift.
- No schema enforcement on the topic. Today the message format is whatever producers agree to. A bad producer can push malformed messages and stall the applier on the dead-letter case.
- No Redis password. POC scope. Anyone with NetworkPolicy access to the namespace has full Redis access.
- No off-cluster Redis exposure. No edge HAProxy backend for Sentinel/Redis. External readers have to go through an in-cluster service that fronts Redis.
- No metrics → SigNoz pipeline yet. Operator metrics exporter is available but not wired into SigNoz, so applier lag and Redis hit-ratio aren't on a dashboard.
- Replay capacity is finite. Topic retention bounds how far back you can rebuild from scratch. If retention is shorter than your worst-case data loss window, you have a gap.
- Cross-namespace secret bridging is imperative. The applier's SCRAM creds + Kafka cluster CA are copied from kafka to redis by hand at bootstrap; ExternalSecrets via Vault is queued in Phase R-G.
How it should be improved
- Schema registry on redis-writes. Avro or Protobuf with a Schema Registry (Apicurio or Confluent SR), so producers and the applier agree on the wire format and bad messages are rejected at produce time.
- Idempotency keys. Add a per-op idempotency token to the message. The applier dedupes on it, so a re-delivery (after consumer rebalance) doesn't double-apply non-idempotent ops like INCR (see the sketch after this list).
- Compacted "latest state" topic alongside the WAL. A second topic, log-compacted, holds the latest value per key. Bootstrapping a fresh Redis from scratch reads the compacted topic instead of replaying the entire WAL.
- Lag SLO + alerts. Wire the consumer-group lag (against the topic high-water-mark) into SigNoz and define an SLO: e.g., "DR applier lag < 5 s for 99.9 % of one-minute buckets". Alert on breach.
- Apply-side metrics. Expose ops/sec, error rate, parse failures, Redis OOM events from the applier as Prometheus metrics; ship into SigNoz.
- Authenticated Redis. Move to RequirePass + ACL via Secret reference; tighten NetworkPolicies to the explicit set of namespaces that need Redis.
- Off-cluster client path. Add a TCP backend on the edge HAProxy for Sentinel + Redis with DC-primary/DR-backup, so off-cluster Java/JBoss apps can read directly when needed.
- ExternalSecrets via Vault. Replace the hand-bridged kafka-cluster-ca-cert + redis-applier secrets with ExternalSecrets sourced from Vault (Phase R-G).
- TTL replication semantics. Standardise on PEXPIREAT at message-time + applier clock skew correction, instead of relative EXPIRE on apply, so absolute expiries match across clusters (also covered in the sketch after this list).
- Cross-key transaction support. If business needs cross-key atomicity, encode the transaction as a single message (e.g., a "Lua script + args" envelope) so the applier runs it atomically against Redis.
- Active/active. If DR ever needs to accept writes concurrently with DC, the WAL pattern needs CRDTs or per-key ownership to avoid last-writer-wins surprises. Active/standby is the only safe topology today.
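To make the idempotency and absolute-TTL ideas concrete, a hedged apply-side sketch, written in Java with Lettuce for consistency with this page (the real applier is Go); the dedupe-key prefix, the token field, and the 24-hour dedupe window are assumptions:

import io.lettuce.core.SetArgs;
import io.lettuce.core.api.sync.RedisCommands;

class IdempotentApply {
    // Sketch: dedupe on a per-op idempotency token, then set an absolute expiry with PEXPIREAT
    // derived from the message timestamp, so DC and DR expire the key at the same instant.
    static void applyIncr(RedisCommands<String, String> redis,
                          String idempotencyToken, String key, long delta,
                          long expireAtEpochMillis) {
        // SET NX on a dedupe key: only the first delivery of this token is applied.
        String firstDelivery = redis.set("applied:" + idempotencyToken, "1",
                SetArgs.Builder.nx().ex(86_400)); // 24 h dedupe window (assumed)
        if (!"OK".equals(firstDelivery)) {
            return; // redelivery after a consumer rebalance: skip the non-idempotent INCR
        }
        redis.incrby(key, delta);
        redis.pexpireat(key, expireAtEpochMillis);
    }
}

A production version would fold the dedupe check and the apply into a single Lua script (the same envelope suggested for cross-key transactions) so the pair executes atomically.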
References
- Spotahome redis-operator on GitHub (RedisFailover CR docs).
- ADR-0018 — Redis-via-Kafka WAL (in ~/cloud-init/adr/).
- Cross-references in this catalogue:
- Redis operator (Spotahome) — the operator that reconciles the RedisFailover CR.
- redis-applier — the Kafka → Redis consumer.
- Apache Kafka (KRaft) — the WAL substrate.
- MirrorMaker 2 — the DC → DR carrier for redis-writes.
- RedisInsight — admin UI.