Name: Redis Cluster
Author: srijan-at-qwertystars

Aspect	Sentinel	Cluster
Sharding	None — single dataset	Hash-slot partitioning across N masters
Scaling	Vertical only (read replicas)	Horizontal — add/remove shards live
Failover	External Sentinel process promotes replica	Built-in — replica auto-promoted per slot group
Multi-key ops	Unrestricted	Same-slot only (use hash tags)
Client requirement	Standard client	Cluster-aware client required
Use when	Dataset fits one node, <100 GB	Dataset exceeds one node, high throughput needed

port 7000
cluster-enabled yes
cluster-config-file nodes-7000.conf
cluster-node-timeout 5000
appendonly yes
# Require all slots covered to serve traffic (default yes):
cluster-require-full-coverage yes
# Let a replica keep answering reads (possibly stale) while its link to the master is down:
replica-serve-stale-data yes
# Cluster bus port (auto = port + 10000):
# cluster-port 17000

# 6 nodes: 3 masters + 3 replicas (--cluster-replicas 1)
redis-cli --cluster create \
  192.168.1.1:7000 192.168.1.2:7000 192.168.1.3:7000 \
  192.168.1.4:7000 192.168.1.5:7000 192.168.1.6:7000 \
  --cluster-replicas 1

>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
...
[OK] All 16384 slots covered.
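
The ranges printed above come from splitting the 16384 slots as evenly as possible across the masters. A minimal Python sketch of that arithmetic follows. The function name `split_slots` is hypothetical, and the rounding approach is an assumption chosen because it reproduces the ranges shown; it is not a copy of the tool's code.

```python
TOTAL_SLOTS = 16384

def split_slots(masters: int) -> list:
    """Return inclusive (first, last) slot ranges, one per master."""
    if masters < 1:
        raise ValueError("need at least one master")
    per_node = TOTAL_SLOTS / masters
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(masters):
        # Round the fractional boundary; the last master always ends at 16383.
        last = round(cursor + per_node - 1)
        if i == masters - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        last = max(last, first)
        ranges.append((first, last))
        first = last + 1
        cursor += per_node
    return ranges

for i, (lo, hi) in enumerate(split_slots(3)):
    print(f"Master[{i}] -> Slots {lo} - {hi}")
```

With three masters this yields the same three ranges as the output above; other master counts are handled by the same loop.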

redis-cli -c -p 7000 CLUSTER INFO
# cluster_state:ok
# cluster_slots_assigned:16384
# cluster_slots_ok:16384
# cluster_known_nodes:6
# cluster_size:3

redis-cli -c -p 7000 CLUSTER NODES
# <id> 192.168.1.1:7000@17000 myself,master - 0 0 1 connected 0-5460
# <id> 192.168.1.4:7000@17000 slave <master-id> 0 0 1 connected
# ...
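
For scripting against the CLUSTER NODES output, each line can be split on whitespace into its fields. The sketch below assumes the field order shown above (id, address, flags, master id, ping-sent, pong-recv, config epoch, link state, slots). The class name `NodeLine`, the function name, and the short node IDs in the sample are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class NodeLine:
    node_id: str
    host: str
    port: int
    bus_port: Optional[int]
    flags: List[str]
    master_id: Optional[str]
    link_state: str
    slots: List[Tuple[int, int]] = field(default_factory=list)

def parse_node_line(line: str) -> NodeLine:
    parts = line.split()
    # Address looks like ip:port@bus-port, optionally followed by ",hostname".
    endpoint = parts[1].split(",")[0]
    hostport, _, bus = endpoint.partition("@")
    host, _, port = hostport.rpartition(":")
    slots = []
    for token in parts[8:]:
        if token.startswith("["):
            continue  # skip migrating/importing slot markers
        start, _, end = token.partition("-")
        slots.append((int(start), int(end or start)))
    return NodeLine(
        node_id=parts[0],
        host=host,
        port=int(port),
        bus_port=int(bus) if bus else None,
        flags=parts[2].split(","),
        master_id=None if parts[3] == "-" else parts[3],
        link_state=parts[7],
        slots=slots,
    )

# Sample lines with made-up node IDs:
master = parse_node_line("aaa1 192.168.1.1:7000@17000 myself,master - 0 0 1 connected 0-5460")
replica = parse_node_line("bbb2 192.168.1.4:7000@17000 slave aaa1 0 0 1 connected")
print(master.slots, replica.master_id)
```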

# Connect to the REPLICA you want to promote (here the replica on 192.168.1.4):
redis-cli -h 192.168.1.4 -p 7000 CLUSTER FAILOVER
# OK
# The replica becomes master; the old master becomes its replica.

# These keys all hash to the same slot (hashing "user:1000"):
SET {user:1000}:name "Alice"
SET {user:1000}:email "alice@example.com"
SET {user:1000}:prefs '{"theme":"dark"}'

# Multi-key operation succeeds because same slot:
MGET {user:1000}:name {user:1000}:email
# 1) "Alice"
# 2) "alice@example.com"

# WITHOUT hash tags, the whole key is hashed, so the keys usually land in different slots and the call FAILS:
MGET user:1000:name user:1000:email
# (error) CROSSSLOT Keys in request don't hash to the same slot

# Lua script: all keys must share a slot and be passed via KEYS:
EVAL "return redis.call('GET', KEYS[1]) .. redis.call('GET', KEYS[2])" \
  2 {user:1000}:name {user:1000}:email
# "Alicealice@example.com"
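
To check which slot a key maps to without a running node, the slot computation can be reproduced locally. The sketch below follows the rule from the Redis Cluster specification: CRC16 (XMODEM variant) of the key modulo 16384, hashing only the text between the first `{` and the next `}` when that text is non-empty. The function names are hypothetical.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that is actually hashed."""
    start = key.find(b"{")
    if start == -1:
        return key
    end = key.find(b"}", start + 1)
    if end == -1 or end == start + 1:
        return key  # no closing brace, or empty tag "{}": hash the whole key
    return key[start + 1:end]

def key_slot(key: str) -> int:
    return crc16_xmodem(hash_tag(key.encode())) % 16384

for k in ("{user:1000}:name", "{user:1000}:email", "user:1000"):
    print(k, key_slot(k))
```

All three keys print the same slot, because only `user:1000` is hashed in each case. On a live cluster, `CLUSTER KEYSLOT <key>` returns the authoritative value.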

# Add as empty master:
redis-cli --cluster add-node 192.168.1.7:7000 192.168.1.1:7000

# Add as replica of a specific master:
redis-cli --cluster add-node 192.168.1.8:7000 192.168.1.1:7000 \
  --cluster-slave --cluster-master-id <master-node-id>

# Move 1000 slots from source to new node:
redis-cli --cluster reshard 192.168.1.1:7000 \
  --cluster-from <source-node-id> \
  --cluster-to <target-node-id> \
  --cluster-slots 1000 \
  --cluster-yes

# Or rebalance automatically across all masters:
redis-cli --cluster rebalance 192.168.1.1:7000 --cluster-use-empty-masters
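
To estimate how many slots a rebalance has to move after adding an empty master, the even-split arithmetic is enough. This sketch covers only that arithmetic and is not the algorithm the tool uses; `rebalance_plan` and the master names are hypothetical.

```python
def rebalance_plan(owned: dict) -> dict:
    """Map each master to the slots it should gain (+) or give up (-)."""
    total = sum(owned.values())
    base, extra = divmod(total, len(owned))
    plan = {}
    # Remainder slots go to the first `extra` masters in sorted order (arbitrary tie-break).
    for i, name in enumerate(sorted(owned)):
        target = base + (1 if i < extra else 0)
        plan[name] = target - owned[name]
    return plan

# Three existing masters plus one freshly added, empty master:
current = {"master-a": 5461, "master-b": 5462, "master-c": 5461, "master-d": 0}
print(rebalance_plan(current))
```

The deltas always sum to zero, which is a quick sanity check before running an actual reshard.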

redis-cli --cluster fix 192.168.1.1:7000
# Repairs open slots, stuck migrations, uncovered slots.

> GET mykey
(error) MOVED 3999 192.168.1.2:7000
# Client updates: slot 3999 → 192.168.1.2:7000, retries GET mykey there.

> GET mykey
(error) ASK 3999 192.168.1.3:7000
# Client: connect to 192.168.1.3:7000, send ASKING, then GET mykey.
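
The difference between the two redirects is what the client does with its slot cache: MOVED is permanent and updates the cache, ASK applies to one request and leaves the cache alone. A minimal sketch of that decision, with no network code; `Redirect`, `parse_redirect`, and `apply_redirect` are hypothetical names, not part of any client library.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass(frozen=True)
class Redirect:
    kind: str   # "MOVED" or "ASK"
    slot: int
    host: str
    port: int

def parse_redirect(message: str) -> Optional[Redirect]:
    """Parse an error such as 'MOVED 3999 192.168.1.2:7000'; return None otherwise."""
    parts = message.split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    host, _, port = parts[2].rpartition(":")
    return Redirect(parts[0], int(parts[1]), host, int(port))

def apply_redirect(slot_cache: Dict[int, str], r: Redirect) -> Tuple[str, bool]:
    """Return (address to retry on, whether to send ASKING first)."""
    address = f"{r.host}:{r.port}"
    if r.kind == "MOVED":
        slot_cache[r.slot] = address  # permanent: remember the new owner
        return address, False
    return address, True              # ASK: one-off, cache stays unchanged

cache = {3999: "192.168.1.1:7000"}
print(apply_redirect(cache, parse_redirect("MOVED 3999 192.168.1.2:7000")), cache)
print(apply_redirect(cache, parse_redirect("ASK 3999 192.168.1.3:7000")), cache)
```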

from redis.cluster import ClusterNode, RedisCluster

# redis-py expects ClusterNode objects in startup_nodes, not plain dicts.
rc = RedisCluster(
    startup_nodes=[ClusterNode("192.168.1.1", 7000)],
    decode_responses=True,
    require_full_coverage=True,   # fail at startup if any slot is uncovered
    retry_on_timeout=True,
)
rc.set("{user:1}:name", "Alice")
print(rc.get("{user:1}:name"))  # "Alice"

const Redis = require("ioredis");
const cluster = new Redis.Cluster([
  { host: "192.168.1.1", port: 7000 },
  { host: "192.168.1.2", port: 7000 },
], {
  redisOptions: { password: "secret" },
  scaleReads: "slave",          // read from replicas
  natMap: {},                    // for NAT/Docker port mapping
  retryDelayOnFailover: 300,
  retryDelayOnClusterDown: 1000,
});

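import java.util.HashSet;
import java.util.Set;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;
// Arguments: connection timeout (ms), socket timeout (ms), max attempts, password, pool config
Set<HostAndPort> nodes = new HashSet<>();
nodes.add(new HostAndPort("192.168.1.1", 7000));
nodes.add(new HostAndPort("192.168.1.2", 7000));
JedisCluster jc = new JedisCluster(nodes, 5000, 5000, 3, "password",
    new GenericObjectPoolConfig<>());
jc.set("{user:1}:name", "Alice");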

# redis.conf does not accept trailing comments; keep each comment on its own line.
# Snapshot if ≥1 write in 900s, ≥10 writes in 300s, or ≥10000 writes in 60s:
save 900 1
save 300 10
save 60 10000
dbfilename dump-7000.rdb
dir /var/lib/redis/7000/
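
The three `save` rules are independent: a snapshot starts as soon as any one of them is satisfied. A small sketch of that check, assuming a rule fires when at least the given number of writes have accumulated and more than the given number of seconds have passed since the last save. The function name is hypothetical.

```python
# (seconds, changes) pairs matching the save lines above
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]

def should_snapshot(seconds_since_save: int, writes_since_save: int,
                    save_points=SAVE_POINTS) -> bool:
    """True if any save rule is satisfied."""
    return any(
        writes_since_save >= changes and seconds_since_save > seconds
        for seconds, changes in save_points
    )

print(should_snapshot(61, 10000))   # busy node: the 60s rule applies
print(should_snapshot(61, 500))     # too few writes for 60s, too early for the other rules
print(should_snapshot(901, 1))      # quiet node: the 900s rule applies
```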

appendonly yes
appendfilename "appendonly-7000.aof"
# fsync once per second: balances durability and performance
appendfsync everysec
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# Hybrid AOF (RDB preamble) for faster restarts:
aof-use-rdb-preamble yes
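
The two `auto-aof-rewrite-*` settings work together: a rewrite is considered only once the file exceeds the minimum size, and then only if it has grown by the configured percentage relative to its size after the previous rewrite. A sketch of that condition under those assumptions; the function name and the sample sizes are hypothetical.

```python
MB = 1024 * 1024

def should_rewrite_aof(current_size: int, base_size: int,
                       percentage: int = 100, min_size: int = 64 * MB) -> bool:
    """Decide whether an automatic AOF rewrite would be triggered."""
    if percentage == 0:
        return False               # a percentage of 0 disables automatic rewrites
    if current_size <= min_size:
        return False               # file still too small to bother
    base = base_size or 1          # avoid dividing by zero before the first rewrite
    growth = current_size * 100 // base - 100
    return growth >= percentage

print(should_rewrite_aof(128 * MB, 64 * MB))  # doubled since last rewrite
print(should_rewrite_aof(100 * MB, 64 * MB))  # grew by about 56% only
print(should_rewrite_aof(60 * MB, 10 * MB))   # below the 64mb floor
```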

# Cluster health:
redis-cli -c -p 7000 CLUSTER INFO

# Node topology:
redis-cli -c -p 7000 CLUSTER NODES

# Slot distribution:
redis-cli --cluster check 192.168.1.1:7000

# Per-node memory:
redis-cli -p 7000 INFO memory
# used_memory_human:1.2G
# maxmemory_human:4G

# Slow queries:
redis-cli -p 7000 SLOWLOG GET 10

# Client connections per node:
redis-cli -p 7000 INFO clients
# connected_clients:142

# Keyspace stats:
redis-cli -p 7000 INFO keyspace
# db0:keys=1543210,expires=320100,avg_ttl=86400000

Metric	Source	Alert threshold
`cluster_state`	CLUSTER INFO	!= ok
`cluster_slots_fail`	CLUSTER INFO	> 0
`connected_clients`	INFO clients	> 80% of maxclients
`used_memory`	INFO memory	> 80% of maxmemory
`instantaneous_ops_per_sec`	INFO stats	deviation > 50% from baseline
`rejected_connections`	INFO stats	> 0
`master_link_status`	INFO replication (replicas)	!= up
Replication offset lag	INFO replication	replica offset falling behind `master_repl_offset`
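
Several of the thresholds in the table can be checked with a few lines of scripting over the `key:value` text that CLUSTER INFO and INFO return. The sketch below parses that text and applies the fixed thresholds; the baseline-relative ops/sec check and the offset lag check are left out. Function names and the sample values are hypothetical.

```python
def parse_info(text: str) -> dict:
    """Turn 'key:value' lines into a dict, skipping blanks and '# Section' headers."""
    out = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            out[key] = value
    return out

def check_alerts(m: dict) -> list:
    alerts = []
    if m.get("cluster_state", "ok") != "ok":
        alerts.append(f"cluster_state is {m['cluster_state']}")
    if int(m.get("cluster_slots_fail", 0)) > 0:
        alerts.append(f"cluster_slots_fail = {m['cluster_slots_fail']}")
    if int(m.get("rejected_connections", 0)) > 0:
        alerts.append(f"rejected_connections = {m['rejected_connections']}")
    if m.get("master_link_status", "up") != "up":
        alerts.append(f"master_link_status is {m['master_link_status']}")
    for used_key, limit_key in (("connected_clients", "maxclients"),
                                ("used_memory", "maxmemory")):
        used, limit = int(m.get(used_key, 0)), int(m.get(limit_key, 0))
        if limit > 0 and used > 0.8 * limit:   # a limit of 0 means "not set"
            alerts.append(f"{used_key} above 80% of {limit_key}")
    return alerts

# Made-up sample combining CLUSTER INFO and INFO fields:
SAMPLE = """
cluster_state:fail
cluster_slots_fail:12
# Clients
connected_clients:9000
maxclients:10000
# Memory
used_memory:3000000000
maxmemory:4000000000
# Stats
rejected_connections:0
"""
for alert in check_alerts(parse_info(SAMPLE)):
    print(alert)
```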

redis-cli -p 7000 --latency-history -i 5
# min: 0, max: 3, avg: 0.45 (100 samples) -- 5.00 seconds range

redis-cli -p 7000 LATENCY LATEST
redis-cli -p 7000 LATENCY HISTORY event-name

redis-cli -p 7000 --bigkeys
# Biggest string: user:megalist — 45.2 MB
# Biggest hash: session:abc — 12340 fields

cluster-announce-ip 203.0.113.10
cluster-announce-port 7000
cluster-announce-bus-port 17000

Document	Path	Covers
Advanced Patterns	`references/advanced-patterns.md`	Cluster-aware Lua scripting, cross-slot transactions with hash tags, sharded pub/sub (Redis 7+), Streams in cluster mode, client-side caching with RESP3 tracking, cluster-aware connection pooling, standalone→cluster migration, ACL management
Troubleshooting	`references/troubleshooting.md`	Split-brain recovery, slot migration failures, node join/leave issues, memory fragmentation, redirect storms, cluster state inconsistency, replication buffer overflow, slow log analysis, latency diagnosis, network partition recovery
Operations Guide	`references/operations-guide.md`	Rolling upgrades, capacity planning, backup strategies (RDB/AOF), adding/removing nodes, rebalancing, monitoring with redis-cli, Prometheus/Grafana dashboards, alerting thresholds, maintenance windows

Script	Path	Purpose
Setup Cluster	`scripts/setup-cluster.sh`	Bootstrap a Redis cluster (Docker or bare-metal). Supports configurable masters/replicas, production mode with AUTH, and cleanup.
Health Check	`scripts/health-check.sh`	Comprehensive cluster health check: node states, slot coverage, replication, memory, latency, persistence. JSON output option.
Resharding	`scripts/resharding.sh`	Automated slot migration with progress tracking, batch control, dry-run mode, audit logging, and rollback capability.

# Bootstrap a 6-node dev cluster with Docker:
./scripts/setup-cluster.sh --mode docker --masters 3 --replicas 1

# Check cluster health:
./scripts/health-check.sh 127.0.0.1:7001 --verbose

# Migrate 1000 slots between nodes:
./scripts/resharding.sh --host 127.0.0.1:7001 \
  --from <source-id> --to <target-id> --slots 1000

# Clean up dev cluster:
./scripts/setup-cluster.sh --cleanup --mode docker

Asset	Path	Purpose
Docker Compose	`assets/docker-compose.yaml`	6-node Redis Cluster (3 masters + 3 replicas) with health checks, volumes, and auto-initialization.
Cluster Config	`assets/redis-cluster.conf`	Production `redis.conf` template with cluster, memory, persistence, security, replication, and performance settings.
Sentinel Config	`assets/sentinel.conf`	Production Sentinel configuration template. Note: Sentinel is for non-cluster HA only — do NOT combine with Redis Cluster.

# Quick start with Docker Compose:
cd assets/
docker compose up -d
# Cluster auto-initializes via the redis-cluster-init service.

# For bare-metal, copy and customize the config template:
cp assets/redis-cluster.conf /etc/redis/redis-7000.conf
# Edit: port, cluster-announce-ip, requirepass, maxmemory

Redis Cluster

Architecture

Hash Slot Assignment

Gossip Protocol

Minimum Topology

Sentinel vs Cluster Mode

Setup and Configuration

Node Configuration (redis.conf)

Create Cluster

Verify

Replication and Failover

How Failover Works

Manual Failover

Replica Migration

Hash Tags for Multi-Key Operations

Hash Tag Rules

Lua Scripts and Transactions

Resharding and Scaling

Add a New Node

Reshard Slots

Remove a Node

Fix Broken State

Client-Side Configuration

MOVED Redirection

ASK Redirection

Client Library Configuration

Topology Refresh

Persistence in Cluster Mode

RDB Snapshots

AOF (Append Only File)

Monitoring and Diagnostics

Essential Commands

Key Metrics to Alert On

Latency Diagnostics

Common Pitfalls

1. Cross-Slot Errors

2. Hotspot Nodes

3. Large Keys (Big Keys)

4. Cluster Bus Port Blocked

5. Docker/NAT Issues

6. Full Coverage Requirement

7. Stale Client Topology

8. Memory Limit Without Eviction Policy

9. Unbalanced Slot Distribution

10. Replica Divergence During Network Partition

References

Scripts

Assets
