How to Use Redis with Predefined Queries

Store data in Redis and retrieve it with predefined queries in C# using StackExchange.Redis

What is Redis

From redis.io

Redis (REmote DIctionary Server) is an open-source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams.

More information about data types can be found here: https://redis.io/topics/data-types

Redis can act as a data store and a cache at the same time. By design, data is always read and modified in memory and is also persisted to disk. The on-disk data is not suitable for random access; it is used only to reconstruct the in-memory dataset when the system restarts. The data model Redis provides is quite unusual compared with traditional relational database management systems (RDBMS): commands are not queries, just operations that store data in Redis and get it back. Because of this design, reads and writes are very fast.

Main Advantages

  • High performance. It can handle more than 120,000 requests per second.
  • Easy to use. Data can be stored with a simple SET command and can be retrieved using a GET command.
  • High availability. It supports non-blocking master/slave replication to guarantee high availability of data.
  • Language support. Client libraries are available for most popular programming languages.

Problem

Since Redis is basically a dictionary, we cannot query it. Imagine that we have an entity and want to retrieve only the instances whose IsActive property is true. We can store all the entities in hashes. Unfortunately, due to the nature of data in Redis, we cannot query these hashes for active entities. To meet this requirement, we need to maintain a separate set that holds the keys of all entities that satisfy the condition. Every time the IsActive value of an entity changes, we have to add the entity's key to the set or remove it. As another example, suppose we want to sort entities by time and retrieve them by time range. Again, we need to maintain a separate sorted set that uses the time in ticks as the score.

To summarize the problem: Redis cannot be queried, so we must maintain separate sets, sorted sets, lists, and so on that store the ids of the entities matching each criterion we want to retrieve data by.
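To make the bookkeeping concrete, here is a minimal sketch of the two manual indexes described above. It is an in-memory stand-in and does not talk to Redis: a HashSet plays the role of the Redis set of active ids, and a dictionary of scores plays the role of the sorted set keyed by creation time in ticks. The class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for the Redis structures; all names are illustrative.
public class ManualIndexes
{
    // Plays the role of a Redis set holding the ids of active entities.
    private readonly HashSet<string> _activeIds = new HashSet<string>();

    // Plays the role of a Redis sorted set: id -> score (creation time in ticks).
    private readonly Dictionary<string, double> _byCreatedTicks =
        new Dictionary<string, double>();

    // Must run on every write, otherwise the indexes drift away from the data.
    public void Save(string id, bool isActive, DateTime created)
    {
        if (isActive) _activeIds.Add(id);
        else _activeIds.Remove(id);

        _byCreatedTicks[id] = created.Ticks;
    }

    // Must run on every delete as well.
    public void Delete(string id)
    {
        _activeIds.Remove(id);
        _byCreatedTicks.Remove(id);
    }

    public IReadOnlyCollection<string> GetActiveIds() => _activeIds;

    // Similar in spirit to a range-by-score read on a sorted set (bounds inclusive).
    public List<string> GetIdsCreatedBetween(DateTime start, DateTime end) =>
        _byCreatedTicks
            .Where(p => p.Value >= start.Ticks && p.Value <= end.Ticks)
            .OrderBy(p => p.Value)
            .Select(p => p.Key)
            .ToList();
}
```

Even in this toy form, every write path has to know about every index, which is the maintenance burden the rest of the article tries to remove.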

Intention

To use Redis more efficiently (not just as a remote dictionary), we need a mechanism to retrieve data according to our criteria (queries). It is possible to get the data that matches our queries by maintaining several hashes, sets, sorted sets, and so on. However, maintaining these structures by hand for each query and entity eventually turns into a mess. We need a simpler, more general, and reusable way to do this.

Solution

The most straightforward solution is to define queries as pipelines that are executed on each write, update, and delete operation against Redis. When we want to retrieve data, we just need to get the related ids that are stored in a set or sorted set and combine them with the hashes that store the actual data. Suppose we have a Stock entity:

using System;

public class Stock {
    // The constructor assigns Id, so the property has to be declared.
    public string Id { get; set; }
    public string Name { get; set; }
    public string Symbol { get; set; }
    public StockSector Sector { get; set; }
    public double Price { get; set; }
    public double PriceChangeRate { get; set; }
    public DateTime CreatedDateTime { get; set; }
    public bool IsActive { get; set; }

    public Stock() {
        Id = Guid.NewGuid().ToString();
    }
}

public enum StockSector {
    None,
    Technology,
    Agriculture,
    Energy,
    Insurance
}

Suppose we want to group stocks by StockSector. We just need to store each whole stock entity in a hash and keep five sets, one per sector, that store the Stock ids.

For example, we can use Stock:{Id} as the key template for our entities and Stock:groupBy:StockSector:{value} as the key template for our sets. When we need the entities in the Technology sector, we first get the ids from the Stock:groupBy:StockSector:Technology set and then fetch the actual data by reading the Stock:{Id} hash for each of those ids. When we write data to Redis, we need to maintain the related set. For example, if we update the sector of an entity, we need to remove the id from the old sector set and add it to the new one.
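The flow with these two key templates can be sketched as follows. This is an in-memory stand-in, with dictionaries in place of Redis hashes and sets, so it runs without a server. StockRecord, Sector, and SectorIndexedStore are trimmed-down, hypothetical types used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum Sector { None, Technology, Energy }

// Trimmed-down stand-in for the Stock entity.
public class StockRecord
{
    public string Id { get; set; } = Guid.NewGuid().ToString();
    public string Symbol { get; set; } = "";
    public Sector Sector { get; set; }
}

// Dictionaries keyed like Redis keys: one for the hashes, one for the sets.
public class SectorIndexedStore
{
    private readonly Dictionary<string, StockRecord> _hashes =
        new Dictionary<string, StockRecord>();
    private readonly Dictionary<string, HashSet<string>> _sets =
        new Dictionary<string, HashSet<string>>();

    private static string EntityKey(string id) => $"Stock:{id}";
    private static string SectorKey(Sector s) => $"Stock:groupBy:StockSector:{s}";

    public void Save(StockRecord stock)
    {
        // If the entity already exists, drop its id from the old sector set first.
        if (_hashes.TryGetValue(EntityKey(stock.Id), out var previous))
            SetFor(previous.Sector).Remove(stock.Id);

        // Store a copy so later changes to the caller's object do not leak in.
        _hashes[EntityKey(stock.Id)] = new StockRecord
        {
            Id = stock.Id, Symbol = stock.Symbol, Sector = stock.Sector
        };
        SetFor(stock.Sector).Add(stock.Id);
    }

    // Read the ids from the sector set, then fetch each entity by its key.
    public List<StockRecord> GetBySector(Sector sector) =>
        SetFor(sector).Select(id => _hashes[EntityKey(id)]).ToList();

    private HashSet<string> SetFor(Sector sector)
    {
        var key = SectorKey(sector);
        if (!_sets.TryGetValue(key, out var set))
            _sets[key] = set = new HashSet<string>();
        return set;
    }
}
```

Against a real Redis instance the same steps would become hash writes, set add/remove calls, and a set read followed by one hash read per id.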

As you see, the flow is pretty straightforward and simple. We just need to generalize the process and make it reusable. To achieve this, we can introduce a new thing: BRANCH! With each grouping, we just divide entities into branches.


A typical branch interface might look like this:

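using System;
using System.Linq.Expressions;

public interface IBranch<T> {
    // Grouping: returns the branch itself so that calls can be chained.
    IBranch<T> GroupBy(string propName);
    IBranch<T> GroupBy(string functionName, Expression<Func<T, string>> groupFunction);

    // Sorting: the property value or the function result is used as the score.
    void SortBy(string propName);
    void SortBy(string functionName, Expression<Func<T, double>> sortFunction);

    // Redis key of the branch, optionally for a given entity or given group values.
    string GetBranchKey();
    string GetBranchKey(T entity);
    string GetBranchKey(params string[] values);

    bool IsSortable();
}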
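To give a feel for how chained GroupBy calls could turn into a Redis key, here is a minimal sketch. It is not the implementation of any package: KeyBranch is a hypothetical class, it takes plain Func delegates instead of expressions, and the key layout simply extends the Stock:groupBy:StockSector:{value} template used earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified branch that only builds keys.
public class KeyBranch<T>
{
    private readonly string _prefix;
    private readonly List<(string Name, Func<T, string> Selector)> _groups =
        new List<(string Name, Func<T, string> Selector)>();

    public KeyBranch(string prefix) { _prefix = prefix; }

    // Returns the branch itself so that calls can be chained.
    public KeyBranch<T> GroupBy(string name, Func<T, string> selector)
    {
        _groups.Add((name, selector));
        return this;
    }

    // Key of the branch an entity belongs to, derived from the entity's values.
    public string GetBranchKey(T entity) =>
        GetBranchKey(_groups.Select(g => g.Selector(entity)).ToArray());

    // Key for explicitly given group values, used when reading.
    public string GetBranchKey(params string[] values)
    {
        if (values.Length != _groups.Count)
            throw new ArgumentException("One value per GroupBy is required.");
        var parts = _groups.Select((g, i) => $"groupBy:{g.Name}:{values[i]}");
        return string.Join(":", new[] { _prefix }.Concat(parts));
    }
}
```

With this sketch, a branch built as new KeyBranch<Stock>("Stock").GroupBy("IsActive", s => s.IsActive.ToString()).GroupBy("Sector", s => s.Sector.ToString()) would map an active Technology stock to the key Stock:groupBy:IsActive:True:groupBy:Sector:Technology, and the same key can be rebuilt at read time from the values "True" and "Technology".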

We can group entities by a property and its values, and we can do a bit more: the GroupBy overload that takes a function handles more complicated groupings. SortBy follows the same idea. We can sort by a property value or by a given function that returns the score. There are also several GetBranchKey overloads that return the Redis key of the branch for a given entity or for given values, such as a StockSector value. The difference between the GroupBy and SortBy functions is the return type. GroupBy returns the branch itself, so we can use a fluent API like branch.GroupBy("IsActive").GroupBy("Sector") and so on. SortBy, however, returns nothing, because a branch may have at most one sort statement. When we use SortBy, the related entity ids are written to a sorted set with the given score.

The next thing we need is a structure that holds and manages all branches for an entity and also manages the connection to Redis. Here it is, IRedisRepository:

using System.Collections.Generic;
using System.Threading.Tasks;

public interface IRedisRepository<T> {
    Task AddAsync(T entity);
    Task UpdateAsync(T entity);
    Task<bool> DeleteAsync(T entity);
    Task<bool> DeleteAsync(string id);
    Task<T> GetByIdAsync(string id);

    // Reads from a branch; from/to are score bounds for sortable branches.
    Task<IEnumerable<T>> GetAsync(string branchId, params string[] groups);
    Task<IEnumerable<T>> GetAsync(string branchId, double from, params string[] groups);
    Task<IEnumerable<T>> GetAsync(string branchId, double from, double to, params string[] groups);
    Task<IEnumerable<T>> GetAsync(string branchId, double from, double to, long skip, long take, params string[] groups);

    Task<long> CountAsync(string branchId, params string[] groups);
    Task<long> CountAsync(string branchId, double from, params string[] groups);
    Task<long> CountAsync(string branchId, double from, double to, params string[] groups);

    void AddBranch(IBranch<T> branch);
    ...
}

RedisRepository just contains the branches for an entity and is responsible for updating the related hash, sets, and sorted sets according to the operation. A typical implementation of the add method could look like this:

public class StockRepository : IRedisRepository<Stock> {
    ...
    public async Task AddAsync(Stock entity) {
        // Store the entity itself in a hash. IDatabase.HashSetAsync takes a key and
        // a HashEntry[]; ToHashEntries is a hypothetical helper that is not shown here.
        await _redisDatabase.HashSetAsync($"Stock:{entity.Id}", ToHashEntries(entity));
        await UpdateBranchesAsync(entity, State.Added);
    }

    // ApplyFilters and GetScore are assumed to be members of the concrete branch type.
    private async Task UpdateBranchesAsync(Stock entity, State state) {
        switch (state) {
            case State.Added:
            case State.Updated:
                // Note: when an update moves the entity to a different branch key,
                // the id must also be removed from the key it was stored under before.
                foreach (var branch in _branches) {
                    string key = branch.GetBranchKey(entity);
                    if (branch.ApplyFilters(entity)) {
                        // The entity belongs to this branch: index its id.
                        if (branch.IsSortable()) {
                            await _redisDatabase.SortedSetAddAsync(key, entity.Id, branch.GetScore(entity));
                        }
                        else {
                            await _redisDatabase.SetAddAsync(key, entity.Id);
                        }
                    }
                    else {
                        // The entity no longer matches: make sure its id is not indexed.
                        if (branch.IsSortable()) {
                            await _redisDatabase.SortedSetRemoveAsync(key, entity.Id);
                        }
                        else {
                            await _redisDatabase.SetRemoveAsync(key, entity.Id);
                        }
                    }
                }
                break;
            case State.Deleted:
                foreach (var branch in _branches) {
                    string key = branch.GetBranchKey(entity);
                    if (branch.IsSortable()) {
                        await _redisDatabase.SortedSetRemoveAsync(key, entity.Id);
                    }
                    else {
                        await _redisDatabase.SetRemoveAsync(key, entity.Id);
                    }
                }
                break;
            default:
                break;
        }
    }
    ...
}

With such an UpdateBranchesAsync method, we can maintain which entities belong to which branches. It is simply an implementation of the flow we described before. Now we need a Redis client, and StackExchange.Redis is a good fit.

Drawbacks

  • It is a simple solution. This means joining entities as in T-SQL is not possible.
  • What happens to old data when you add a new branch after your app has started? Only entities written after the branch was added are indexed by it. You need to take care of the older ones yourself: read them and update them so that the newly added branch takes effect for them.
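The second drawback can be handled with a one-off backfill pass over the existing entities. The sketch below shows the idea in memory only; BranchBackfill and its parameters are hypothetical. In a real system the result would be written to Redis sets or sorted sets instead of being returned as a dictionary.

```csharp
using System;
using System.Collections.Generic;

public static class BranchBackfill
{
    // Rebuilds the id sets of a newly added branch from entities that already exist.
    // keyOf returns the branch key for an entity, or null if the entity is filtered out.
    public static Dictionary<string, HashSet<string>> Rebuild<T>(
        IEnumerable<T> existing, Func<T, string> idOf, Func<T, string> keyOf)
    {
        var sets = new Dictionary<string, HashSet<string>>();
        foreach (var entity in existing)
        {
            var key = keyOf(entity);
            if (key == null) continue; // not part of this branch

            if (!sets.TryGetValue(key, out var ids))
                sets[key] = ids = new HashSet<string>();
            ids.Add(idOf(entity));
        }
        return sets;
    }
}
```

Running such a pass once, right after registering the new branch, brings the old entities in line with the ones written afterwards.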

StackExchange.Redis.Branch

Fortunately, all the heavy lifting has already been done. You can use the StackExchange.Redis.Branch package to query Redis. This package uses StackExchange.Redis as its Redis client. More good news: the package is open source and already has unit and integration tests.

https://www.nuget.org/packages/StackExchange.Redis.Branch/
