Redis feature store

Serve pre-computed ML features on the request path under tight latency budgets, with batch and streaming features kept fresh in the same store.

When to use Redis as a feature store

Use Redis as the online layer of a feature store when production models — fraud scoring, recommendations, dynamic pricing — need dozens of pre-computed features per prediction on every request, with sub-millisecond reads, mixed batch-and-streaming freshness, and high write throughput from concurrent ingestion pipelines.

Why the problem is hard

An online feature store has to serve dozens of features per inference call inside a request budget measured in milliseconds, while batch jobs and streaming pipelines update those same features at very different cadences. Some of the obvious workarounds have real drawbacks:

  • Querying the offline warehouse directly adds hundreds of milliseconds per inference call, which makes real-time serving impossible.
  • A bespoke cache in front of the warehouse solves latency but introduces training-serving skew: the features served at inference drift from what the model trained on, silently degrading accuracy whenever a transform changes on one side and not the other.
  • Disk-backed online stores hit a throughput wall when every user action has to update a dozen features simultaneously across millions of entities — the I/O mix of small concurrent writes is exactly what they are slowest at.
  • Single-TTL stores can't handle mixed staleness: batch features refreshed nightly coexist with streaming features updated every few seconds, and a single per-key expiry can't express both. Worse, when an ingestion pipeline fails, its features need to expire on their own; otherwise the store keeps serving stale values silently.

A workable online feature store needs sub-millisecond reads at request rate, high concurrent write throughput from mixed batch and streaming ingestion, independent freshness controls per feature, and self-cleaning behavior when an upstream pipeline fails — without standing up a dedicated piece of infrastructure beside the rest of the model-serving stack.

What you can expect from a Redis solution

You can:

  • Serve feature vectors to inference endpoints under 1 ms P99 (99% of requests have a latency of 1 ms or less) at hundreds of thousands of reads per second from a single shard, and scale horizontally beyond that with Redis Cluster.
  • Run batch and streaming ingestion concurrently against the same entities without locking or version columns. Each shard executes commands one at a time, so every individual HSET is applied atomically.
  • Apply different freshness guarantees to individual features within the same entity hash: seconds for real-time signals, hours for batch aggregates, with per-field TTL via HEXPIRE.
  • Let stale streaming features self-expire when their ingestion pipeline fails, so models receive missing features rather than silently outdated ones.
  • Retrieve features for hundreds of entities in a single round trip for batch scoring, using pipelined HMGET.
  • Plug into Redis Feature Form — Redis's own materialize / serve layer — or Feast with a connection-string change, so no bespoke serving code is required.
  • Co-locate the online feature store on the same Redis instance already handling cache, sessions, or rate limiting in the stack — no additional infrastructure.
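Because expired features come back as missing rather than stale, the serving code has to treat every feature as optional. The following is a minimal sketch of that handling, not a prescribed implementation: build_feature_vector, the feature names, and the default values are hypothetical, and it assumes all features are numeric and arrive in the shape of an HMGET reply (a list with None for each absent field).

```python
from typing import List, Mapping, Optional, Sequence, Tuple


def build_feature_vector(
    names: Sequence[str],
    raw_values: Sequence[Optional[str]],
    defaults: Mapping[str, float],
) -> Tuple[List[float], List[str]]:
    """Turn an HMGET-style reply into a numeric vector.

    A None entry means the field is absent (never written, or expired),
    so the assumed per-feature default is used and the name is reported.
    """
    vector: List[float] = []
    missing: List[str] = []
    for name, raw in zip(names, raw_values):
        if raw is None:
            missing.append(name)
            vector.append(defaults[name])
        else:
            vector.append(float(raw))
    return vector, missing


# Example: the streaming feature has expired, the batch feature is present.
vector, missing = build_feature_vector(
    ["avg_txn_30d", "txn_count_5m"],
    ["81.5", None],
    {"avg_txn_30d": 0.0, "txn_count_5m": 0.0},
)
# vector == [81.5, 0.0], missing == ["txn_count_5m"]
```

Returning the list of missing names alongside the vector lets the model handler decide whether to score with defaults, switch to a fallback model, or log the gap as a signal that an ingestion pipeline has stalled.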

How Redis supports the solution

In practice, each entity (a user, an account, an item) is a single Hash at a deterministic key like fs:user:{id}. The hash holds every feature for that entity as one field per feature — batch-materialized aggregates alongside streaming-updated signals — so one HMGET call returns whatever subset the model needs in one round trip. A key-level EXPIRE aligns with the batch materialization cycle so a whole entity self-cleans when its pipeline stops refreshing it, and per-field HEXPIRE lets each streaming feature carry its own shorter expiry independent of the rest of the hash.
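The layout described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the helper names, the feature names, and the two TTL constants (24 hours for batch, 5 minutes for streaming) are hypothetical, and client is assumed to be a redis-py client such as redis.Redis(host="localhost", port=6379, decode_responses=True). HEXPIRE is sent through execute_command so the sketch does not depend on a particular redis-py version exposing a dedicated method.

```python
from typing import Dict, Mapping, Optional, Sequence

BATCH_TTL_SECONDS = 24 * 60 * 60  # assumed: one nightly materialization cycle
STREAM_TTL_SECONDS = 5 * 60       # assumed: 5-minute streaming freshness window


def entity_key(user_id: str) -> str:
    """Deterministic key for one entity, following the fs:user:{id} layout."""
    return f"fs:user:{user_id}"


def write_batch_features(client, user_id: str, features: Mapping[str, str]) -> None:
    """Batch path: write the aggregates, then refresh the key-level expiry."""
    key = entity_key(user_id)
    client.hset(key, mapping=dict(features))
    client.expire(key, BATCH_TTL_SECONDS)


def write_stream_features(client, user_id: str, features: Mapping[str, str]) -> None:
    """Streaming path: write the signals, then give each field its own TTL."""
    key = entity_key(user_id)
    fields = list(features)
    client.hset(key, mapping=dict(features))
    # Command syntax: HEXPIRE key seconds FIELDS numfields field [field ...]
    client.execute_command(
        "HEXPIRE", key, STREAM_TTL_SECONDS, "FIELDS", len(fields), *fields
    )


def read_features(client, user_id: str, names: Sequence[str]) -> Dict[str, Optional[str]]:
    """Fetch a subset of features in one HMGET; absent fields map to None."""
    values = client.hmget(entity_key(user_id), list(names))
    return dict(zip(names, values))
```

Each writer issues two commands (HSET, then the expiry), so there is a brief window in which a field exists without its TTL. If that matters, the two calls could be wrapped in a MULTI/EXEC transaction; the sketch leaves that out for brevity.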

Redis provides the following features that make it a good fit for an online feature store:

  • Hashes group every feature for an entity under one key, so retrieval reads everything the model needs in a single network round trip with HMGET, and small hashes use listpack encoding for compact in-memory representation.
  • HSET writes any subset of fields atomically, so batch and streaming pipelines can update overlapping or disjoint features on the same entity concurrently without locks or version columns.
  • HEXPIRE and HTTL (Redis 7.4+) give per-field TTLs, so streaming features (5-minute freshness) and batch features (24-hour freshness) can live in the same hash with independent expiry — the mixed-staleness problem becomes a one-line server-side guarantee.
  • EXPIRE at the key level lets an entity disappear entirely if its batch refresher fails, so inference sees a missing entity (which the model handler can detect and handle with a fallback) rather than silently outdated values.
  • Pipelining bundles HMGET calls for many entities into one round trip, which is the right primitive for batch scoring where the model needs features for hundreds of entities at once.
  • Sub-millisecond reads and writes from memory keep feature retrieval a small fraction of the inference latency budget, so the model server's request budget is spent on the model rather than on fetching features.
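The pipelined batch read mentioned in the list above might look like the sketch below. It assumes a redis-py client and the fs:user:{id} key layout; read_features_batch and its parameters are hypothetical names chosen for illustration.

```python
from typing import Dict, Optional, Sequence


def read_features_batch(
    client,
    user_ids: Sequence[str],
    names: Sequence[str],
) -> Dict[str, Dict[str, Optional[str]]]:
    """Fetch the same feature subset for many entities in one round trip."""
    names = list(names)
    # transaction=False gives plain pipelining without wrapping in MULTI/EXEC.
    pipe = client.pipeline(transaction=False)
    for user_id in user_ids:
        pipe.hmget(f"fs:user:{user_id}", names)
    # execute() sends the queued commands together and returns one reply
    # per HMGET, in the order the commands were queued.
    replies = pipe.execute()
    return {
        user_id: dict(zip(names, reply))
        for user_id, reply in zip(user_ids, replies)
    }
```

An entity whose key has expired still yields an entry in the result, with every requested feature mapped to None, so the caller can apply the same fallback logic it uses for a single expired field.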

Ecosystem

The following libraries and platforms use Redis as their online feature store:

  • Redis Feature Form is Redis's own feature-engineering platform. It defines features, labels, and feature views in a Python definitions file, materializes them through a registered provider, and serves them from Redis as the low-latency online store. See the quickstart for an end-to-end walkthrough.
  • Python: Feast ships Redis as a first-class online store provider — point a Feast online_store block at a Redis connection string and the RedisOnlineStore backend handles materialization and serving.
  • Compute: Apache Spark batch jobs run the nightly materialization, writing into Redis via the Redis Feature Form / Feast materialize commands or directly with the spark-redis connector.
  • Streaming: Apache Flink or Kafka Streams compute the real-time features and HSET them into Redis with per-field HEXPIRE so each streaming signal carries its own freshness window.
  • Infrastructure: Kubernetes co-locates Redis pods alongside the model-serving containers, with horizontal-pod autoscaling on the read replicas to track inference load; Active-Active geo-distribution on Redis Enterprise / Redis Cloud replicates the online store across regions for low-latency reads close to each inference cluster.

Code examples to build your own Redis feature store

The following guides show how to build a small Redis-backed online feature store for a fraud-scoring model. Each guide includes a runnable interactive demo that lets you bulk-load batch features, run a streaming worker that updates real-time features with per-field TTL, retrieve any subset of features for a single user under 1 ms, and pipeline batch reads across a hundred users.
