Overview of Redis#
Basic Data Structures#
- string
- list
- set
- hash
- zset
string#
- Strings in Redis are mutable and stored in memory as byte arrays, with an internal structure similar to Java's ArrayList.
- Redundant space is pre-allocated to reduce memory reallocation, but no extra space is allocated when a string is first created, since most strings are never modified with append; this avoids reallocation caused by expansion in the common case.
- During expansion, if the string is smaller than 1MB, its capacity is doubled; above 1MB, only 1MB is added at a time. The maximum length of a string is 512MB.
- There are two storage encodings: when a string is very short (44 bytes or less in recent versions) it uses the embstr format; longer strings use the raw format.
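A minimal sketch of this growth rule (illustrative only, not Redis's actual SDS code):

```java
// Illustrative sketch of the SDS-style growth policy described above.
public class SdsGrowth {
    static final int ONE_MB = 1024 * 1024;

    // Returns the capacity to allocate for a string of the given length.
    static int nextCapacity(int neededLen) {
        if (neededLen < ONE_MB) {
            return neededLen * 2;      // below 1MB: double the size
        }
        return neededLen + ONE_MB;     // at or above 1MB: add 1MB at a time
    }
}
```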
list#
- Lists in Redis are a linked-list structure rather than an array, so insertion and deletion are fast, but random access by index is slow, making element lookup relatively expensive.
- In earlier versions, a list with few elements was stored in a single contiguous block of memory, the ziplist implementation, while larger lists used an ordinary doubly linked list (linkedlist). Because the extra pointers of a linked list waste space, the two were later combined into the quicklist structure.
quicklist#
quicklist is a hybrid of ziplist and linkedlist: the linked list is split into segments, each segment uses a ziplist for compact storage, and the multiple ziplists are strung together with bidirectional pointers.
- By default, a single ziplist inside a quicklist is 8KB long, determined by the configuration parameter list-max-ziplist-size; if this length is exceeded, a new ziplist is created.
- To further save memory, Redis can also compress ziplist storage with the LZF algorithm; the compression depth is determined by the configuration parameter list-compress-depth.
Common Operations:
The list structure in Redis is commonly used as an asynchronous queue: tasks to be deferred are serialized into strings and pushed into a Redis list, while another thread polls the list and processes the data. A minimal sketch of this pattern follows the command list.
> rpush
> lpop
> rpop
# The following operations are O(n); use with caution
> lindex   # get the value at the specified index
> lrange   # iterate over the values in the specified index range
> ltrim books 1 -1   # keep only the values of the books list within index range [1, -1]
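A rough sketch of the asynchronous-queue pattern, assuming the Jedis Java client (the connection details, key name, and payload are placeholders):

```java
import java.util.List;
import redis.clients.jedis.Jedis;

public class TaskQueue {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) { // placeholder address
            // Producer: push a serialized task onto the tail of the list.
            jedis.rpush("task-queue", "{\"type\":\"email\",\"to\":\"user@example.com\"}");

            // Consumer: blpop blocks until an element arrives (or the timeout
            // expires), avoiding a busy lpop polling loop.
            List<String> item = jedis.blpop(5, "task-queue");
            if (item != null) {
                // item.get(0) is the key name, item.get(1) is the task payload
                System.out.println("processing: " + item.get(1));
            }
        }
    }
}
```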
hash#
The hash implementation in Redis is similar to the HashMap implementation in Java, based on an array + linked list data structure. The differences are:
- The values in Redis dictionaries can only be strings.
- The conditions for expansion are different.
- The rehash operation in Java's HashMap is done all at once, while Redis uses a progressive rehash strategy for high performance, avoiding blocking the service.
Expansion Implementation#
Expansion Conditions:
Generally, when the number of elements reaches the length of the bucket array, expansion occurs and the new array is twice the size of the original. However, if Redis is performing a bgsave (the background asynchronous snapshot operation for RDB persistence), it tries to avoid expanding, so as not to separate too many memory pages (due to the Copy On Write mechanism; detailed reference: COW! Understanding the Copy On Write Mechanism). But if the hash table is very full, reaching five times the array length, it will still force an expansion.
Expansion Implementation:
Progressive rehash retains both the old and new hash structures during rehashing, querying both structures simultaneously. In subsequent scheduled tasks and hash operation commands, the contents of the old hash are gradually migrated to the new hash structure. Once the migration is complete, the new hash structure replaces the old one.
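To make the idea concrete, a simplified Java sketch of progressive rehash (illustrative only, not Redis's dict implementation, which also migrates buckets in timed background steps):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of progressive rehash: during a resize, old and new
// bucket arrays coexist, and every operation migrates one bucket across.
public class ProgressiveDict {
    record Entry(String key, String value) {}

    private List<List<Entry>> oldTab = null;
    private List<List<Entry>> newTab = newTable(4);
    private int rehashIndex = -1; // -1: no rehash in progress

    public void put(String key, String value) {
        stepRehash(); // piggyback one migration step on every operation
        List<Entry> b = bucket(newTab, key);
        b.removeIf(e -> e.key().equals(key)); // overwrite an existing key
        b.add(new Entry(key, value));
        maybeStartRehash();
    }

    public String get(String key) {
        stepRehash();
        String v = find(newTab, key);                               // new table first...
        if (v == null && rehashIndex != -1) v = find(oldTab, key);  // ...then the old one
        return v;
    }

    private void maybeStartRehash() {
        if (rehashIndex != -1) return;
        int count = newTab.stream().mapToInt(List::size).sum();
        if (count >= newTab.size()) { // simplified trigger: load factor reaches 1
            oldTab = newTab;
            newTab = newTable(oldTab.size() * 2); // new array is twice the size
            rehashIndex = 0;
        }
    }

    private void stepRehash() {
        if (rehashIndex == -1) return;
        for (Entry e : oldTab.get(rehashIndex)) // move one bucket's entries over
            bucket(newTab, e.key()).add(e);
        oldTab.get(rehashIndex).clear();
        if (++rehashIndex == oldTab.size()) { oldTab = null; rehashIndex = -1; } // done
    }

    private static List<List<Entry>> newTable(int size) {
        List<List<Entry>> t = new ArrayList<>();
        for (int i = 0; i < size; i++) t.add(new ArrayList<>());
        return t;
    }

    private List<Entry> bucket(List<List<Entry>> t, String key) {
        return t.get(Math.floorMod(key.hashCode(), t.size()));
    }

    private String find(List<List<Entry>> t, String key) {
        for (Entry e : bucket(t, key)) if (e.key().equals(key)) return e.value();
        return null;
    }
}
```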
Common Operations:
> hset
> hmset # Batch set
> hget
> hgetall # entries(), get all key-value pairs at once
set#
Sets in Redis are similar to Java's HashSet: the internal implementation is a special dictionary in which every element is stored as a key with an empty value. The key-value pairs are unordered and unique.
Common Operations: sadd, spop, smembers, sunion, etc.
The set structure can be used, for example, to store the user IDs of winners in a promotional campaign, since it deduplicates automatically.
zset#
Similar to a combination of Java's SortedSet and HashMap, it is a set that ensures the uniqueness of internal values while allowing each value to have a score representing its sorting weight.
Implementation Methods:
- ziplist: Used when the number of elements is less than 128 and each element's length is less than 64 bytes.
- skiplist: Used when either of the above conditions is not met. Internally it combines a dict with a skip list: the dict stores the member-to-score mapping, so the score of a member can be found in O(1) time, while the skip list keeps elements sorted by score in ascending order, each node holding a [score, value] pair.
Common Operations: zadd, zrange, zrem, zcard, etc.
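A small leaderboard-style usage sketch with the Jedis Java client (key and member names are placeholders):

```java
import redis.clients.jedis.Jedis;

public class Leaderboard {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) { // placeholder address
            jedis.zadd("leaderboard", 95.0, "alice");  // member "alice", score 95
            jedis.zadd("leaderboard", 87.5, "bob");
            // zrange 0 -1 returns all members ordered by ascending score
            System.out.println(jedis.zrange("leaderboard", 0, -1)); // [bob, alice]
        }
    }
}
```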
Skip List#
Linked list + multi-level index
Consideration: Why use skip lists instead of red-black trees (balanced binary trees)?
- Both have the same time complexity, but skip lists are simpler to implement.
- During concurrent insertions or deletions, balanced binary trees may require some rebalancing operations, which can affect other parts of the tree, while skip list operations are more localized, leading to slightly better performance.
- For range queries such as ZRANGE, the doubly linked list at the bottom level of the skip list makes traversal straightforward. There is also reasonable cache locality, so efficiency is no worse than that of balanced trees.
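To make the "linked list + multi-level index" idea concrete, a minimal skip-list sketch in Java (scores only, no Redis specifics; the 1/2 level-promotion probability is the textbook choice, whereas Redis uses 1/4):

```java
import java.util.Random;

// Minimal skip list: each higher level is a sparser "express lane"
// over the level below, so search descends level by level.
public class SkipList {
    static final int MAX_LEVEL = 16;
    static final Random RNG = new Random();

    static class Node {
        final int score;
        final Node[] next = new Node[MAX_LEVEL];
        Node(int score) { this.score = score; }
    }

    private final Node head = new Node(Integer.MIN_VALUE);
    private int level = 1;

    public void insert(int score) {
        Node[] update = new Node[MAX_LEVEL];
        Node x = head;
        for (int i = level - 1; i >= 0; i--) {
            while (x.next[i] != null && x.next[i].score < score) x = x.next[i];
            update[i] = x; // rightmost node before the insert point on level i
        }
        int lvl = randomLevel();
        if (lvl > level) {
            for (int i = level; i < lvl; i++) update[i] = head;
            level = lvl;
        }
        Node n = new Node(score);
        for (int i = 0; i < lvl; i++) {      // splice into each level up to lvl
            n.next[i] = update[i].next[i];
            update[i].next[i] = n;
        }
    }

    public boolean contains(int score) {
        Node x = head;
        for (int i = level - 1; i >= 0; i--) {            // start at the sparsest level
            while (x.next[i] != null && x.next[i].score < score) x = x.next[i];
        }
        x = x.next[0];                                    // candidate on the base level
        return x != null && x.score == score;
    }

    private int randomLevel() { // promote with probability 1/2 per level (Redis uses 1/4)
        int lvl = 1;
        while (lvl < MAX_LEVEL && RNG.nextBoolean()) lvl++;
        return lvl;
    }
}
```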
Redis's Thread IO Model#
Redis is single-threaded. Although multi-threading was introduced in newer versions (6.0 and later), it is used only for network reading/writing and protocol parsing, to improve network IO efficiency; command execution itself remains single-threaded.
Why is Single-threaded Redis So Fast?#
- Most operations are in-memory, which is very fast.
- Being single-threaded avoids context switching and resource contention issues.
- It uses non-blocking IO with multiplexing.
Redis Persistence#
Redis provides two methods for persistence:
- RDB snapshots: Store data snapshots at specified time intervals, which is a full backup.
- AOF log files: Log the commands that modify data in memory, providing continuous incremental backups.
After Redis 4.0, a hybrid persistence feature was introduced.
RDB#
Related Configuration:
# Time policy: save <seconds> <changes>, e.g. "save 900 1" takes a snapshot if at least 1 key changed within 900 seconds
save 900 1
save 300 10
save 60 10000
# File name
dbfilename dump.rdb
# File save path
dir /home/work/app/redis/data/
# If persistence fails, whether the main process should stop writing
stop-writes-on-bgsave-error yes
# Whether to compress
rdbcompression yes
# Whether to verify the checksum when loading the RDB file
rdbchecksum yes
RDB persistence can be triggered in two ways: manually and automatically by Redis.
Manual Trigger#
- save: Blocks the current Redis server until persistence completes; not recommended for online use.
- bgsave: Forks a child process that performs the persistence, so blocking only occurs during the fork itself.
The save command is rarely used; the execution flow of bgsave is as follows:
Principle of bgsave:
- fork: Redis creates a child process via fork to perform the bgsave operation.
- cow: Copy-on-write. After the child process is created, the parent and child share the data segment; the parent continues serving reads and writes, and dirty pages gradually diverge from the pages the child sees.
Automatic Trigger#
Automatic triggering scenarios:
- Automatically, according to the save m n rules configured in the configuration file.
- When a replica node performs a full copy: the master triggers bgsave and sends the resulting RDB file to the replica to complete the copy.
- When executing debug reload.
- When executing shutdown, if AOF is not enabled.
AOF#
Related Configuration:
# Whether to enable AOF
appendonly yes
# File name
appendfilename "appendonly.aof"
# Synchronization method
appendfsync everysec
# Whether to suppress fsync while an AOF rewrite is in progress
no-appendfsync-on-rewrite no
# Rewriting trigger configuration
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# How to handle errors when loading AOF
aof-load-truncated yes
# Whether to fsync incrementally while rewriting the AOF
aof-rewrite-incremental-fsync yes
The entire AOF process can be roughly divided into two steps: first, real-time appending of write commands (with appendfsync everysec, up to 1 second of writes can be lost); second, rewriting of the AOF file.
Redis validates parameters and runs its processing logic after receiving a client modification command; only if everything succeeds does it append the command text to the AOF log. In other words, it executes the command first and logs it afterwards, unlike storage engines such as LevelDB and HBase, which write the log before applying the change.
AOF Rewriting#
As Redis runs for a long time, the AOF log grows longer. If the instance crashes and restarts, replaying the entire AOF log can be very time-consuming, leading to prolonged unavailability of Redis services. Therefore, AOF log slimming, or rewriting, is necessary.
Redis provides the bgrewriteaof command to slim down the AOF log. The principle is to fork a child process that traverses memory and converts it into a series of Redis commands, serializing them into a new AOF log file. After serialization completes, the incremental AOF entries that accumulated during the rewrite are appended to the new file, which then immediately replaces the old AOF log file.
Log Loss#
AOF logs exist as files. When the program writes to the AOF log file, it actually writes the content to a memory cache allocated by the kernel for the file descriptor, and the kernel asynchronously flushes the data back to disk. If the machine suddenly crashes, the log content may not have been fully written to disk, resulting in log loss.
Linux's glibc provides the fsync(int fd) function, which forces the specified file's content from the kernel cache onto disk. On production servers, Redis typically performs an fsync roughly every second (the interval is configurable). This is a compromise between data safety and performance: it keeps performance high while limiting data loss to at most about a second.
Redis's Synchronization Mechanism#
Redis can use master-slave synchronization and slave-slave synchronization. During the first synchronization, the master node performs a bgsave and simultaneously records subsequent modification operations in an in-memory buffer. Once completed, the RDB file is fully synchronized to the replica node, which loads the RDB image into memory. After loading is complete, it notifies the master node to synchronize the modification records to the replica node for replay, completing the synchronization process.
Redis Transactions#
Thanks to Redis's single-threaded model, its transaction model is straightforward to use.
A conventional transaction model provides begin, commit, and rollback: begin starts the transaction, commit commits it, and rollback rolls it back.
In form, Redis looks similar, with multi/exec/discard: multi marks the start of a transaction, exec executes it, and discard abandons it. A minimal client-side sketch follows.
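For illustration, a sketch using the Jedis Java client (connection details and key name are placeholders):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

public class TxExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) { // placeholder address
            Transaction tx = jedis.multi();   // MULTI: start queuing commands
            tx.incr("books");                 // queued, not executed yet
            tx.incr("books");
            // EXEC: run the whole queue; one reply per queued command
            System.out.println(tx.exec());    // e.g. [1, 2]
        }
    }
}
```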
Redis's Cache Expiration Mechanism#
As a caching system, it is necessary to periodically clean up invalid data, requiring a primary key expiration and eviction strategy.
Expiration Deletion Strategy#
Redis places every key that has an expiration time into a separate dictionary, which it periodically traverses to delete expired keys. In addition to this periodic deletion, it uses a lazy strategy: when a client accesses a key, Redis checks the key's expiration time and deletes it on the spot if it has expired. Periodic deletion is centralized, while lazy deletion is scattered across accesses.
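For illustration, setting and inspecting expiration times with the Jedis client (a sketch; method signatures vary slightly between Jedis versions):

```java
import redis.clients.jedis.Jedis;

public class ExpiryExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) { // placeholder address
            jedis.setex("session:42", 30, "user-data");  // SET with a 30-second TTL
            System.out.println(jedis.ttl("session:42")); // remaining lifetime in seconds
            jedis.expire("session:42", 60);              // reset the TTL to 60 seconds
        }
    }
}
```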
Six Eviction Strategies#
When Redis memory usage exceeds the maxmemory limit, data in memory starts to swap with disk frequently, which can severely degrade Redis's performance. Redis therefore provides several eviction strategies, selected via the maxmemory-policy configuration parameter:
- noeviction: Stops serving write requests (DEL requests are still served) while reads continue. No data is lost, but ongoing writes are blocked. This is the default strategy.
- volatile-lru: Evicts only keys that have an expiration time, least recently used first. Keys without an expiration time are never evicted, so persistent data is not suddenly lost.
- volatile-ttl: Evicts only keys that have an expiration time, smallest remaining ttl first.
- volatile-random: Evicts only keys that have an expiration time, chosen at random from among them.
- allkeys-lru: Unlike volatile-lru, evicts from the entire key set, not just keys with an expiration time, so keys that never expire can also be evicted.
- allkeys-random: Randomly evicts keys from the entire key set.
Redis's Hash Slots#
Redis clusters do not use consistent hashing but introduce the concept of hash slots. There are 16384 hash slots in a Redis cluster, and each key is hashed using CRC16 and then modulo 16384 to determine which slot to place it in. Each node in the cluster is responsible for a portion of the hash slots.
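A sketch of the slot computation; the CRC16 variant shown is CRC16-CCITT (XMODEM), which to my understanding is the one Redis Cluster uses (treat the details as an assumption):

```java
public class HashSlot {
    // CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        byte[] key = "user:1000".getBytes();
        System.out.println("slot = " + (crc16(key) % 16384)); // a value in 0..16383
    }
}
```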
Pipelining in Redis#
Multiple commands can be sent to the server without waiting for a reply, and the responses can be read in a single step, improving IO efficiency.
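A rough sketch with the Jedis client's pipeline API (connection details and key names are placeholders):

```java
import java.util.List;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class PipelineExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) { // placeholder address
            Pipeline p = jedis.pipelined();
            for (int i = 0; i < 100; i++) {
                p.set("key:" + i, String.valueOf(i)); // buffered; no reply awaited yet
            }
            // One flush and one batched read instead of 100 round trips.
            List<Object> replies = p.syncAndReturnAll();
            System.out.println(replies.size() + " replies");
        }
    }
}
```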
Redis Clusters#
- Redis Sentinel focuses on high availability: when the master fails, a slave is automatically promoted to master so the service continues.
- Redis Cluster focuses on scalability: when a single Redis instance runs out of memory, the data is sharded and stored across a cluster.
Cache Scenario Issues#
Cache Penetration#
The queried data exists in neither the cache nor the database, so every request falls through to the DB; malicious high-concurrency requests for such keys can bring the backend DB down.
Solution Ideas:
- Cache null values: after the first miss, store a null marker for the non-existent key so subsequent requests for it are answered from the cache; be sure to set a suitably short expiration time (see the sketch below).
- Bloom filter: use a Bloom filter to test whether an element (a given key) can exist at all before querying the DB.
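A minimal sketch of the cache-null-values idea, assuming the Jedis client; queryDb and the sentinel value are hypothetical stand-ins:

```java
import redis.clients.jedis.Jedis;

public class NullCaching {
    static final String NULL_SENTINEL = "<null>"; // hypothetical marker for "no such row"

    static String getUser(Jedis jedis, String id) {
        String cached = jedis.get("user:" + id);
        if (cached != null) {
            return NULL_SENTINEL.equals(cached) ? null : cached; // hit (possibly a cached miss)
        }
        String fromDb = queryDb(id); // hypothetical database lookup
        if (fromDb == null) {
            // Cache the miss with a short TTL so bogus keys cannot hammer the DB.
            jedis.setex("user:" + id, 60, NULL_SENTINEL);
            return null;
        }
        jedis.setex("user:" + id, 3600, fromDb);
        return fromDb;
    }

    static String queryDb(String id) { return null; } // stand-in for a real DB call
}
```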
Cache Avalanche#
When setting caches, if the same expiration time is used, a large-scale cache invalidation may occur at a certain moment, leading to queries to the database, which may further overload the database.
Solution Ideas:
- Stagger expiration time settings: For example, add a random value to the original expiration time to reduce the repetition rate of cache expiration times.
- Rate limiting & degradation to avoid overwhelming the backend DB due to excessive request pressure.
- Use locks or queues to prevent too many requests from hitting the backend DB at the same time.
- Store a logical-expiry marker alongside each cached value, set to expire earlier than the actual cache TTL. When the marker expires, notify another thread to refresh the cache asynchronously while continuing to return the old cached value until the refresh completes.
Cache Breakdown#
At some moment a hot key expires, and many queries for that key arrive simultaneously. Each finds the cache invalid, loads the data from the backend DB, and updates the cache; if the request volume is high enough, the DB can be overloaded and crash.
Note:
Cache breakdown and cache avalanche are different; the former targets a single key, while the latter involves many keys.
Solution Ideas:
Refer to cache avalanche solutions, similar.
Cache Preheating#
After the system goes live, relevant cached data is preloaded into the cache to avoid the first request querying and loading from the DB to update the cache.
Cache Update#
Update the Database First, Then Update the Cache#
Not applicable due to the following issues:
- Thread safety issues may lead to dirty data. For example, if multiple write requests arrive, the last write request data should be updated to the cache, but due to network fluctuations, the actual data updated to the cache may be from one of the earlier writes.
- Not suitable for write-heavy, read-light scenarios, as the cache is frequently updated, wasting performance.
Delete Cache First, Then Update the Database#
Problem Analysis:
If one request A performs an update operation while another request B performs a query operation, the following situation may occur:
- Request A performs a write operation and deletes the cache.
- Request B queries and finds the cache does not exist.
- Request B queries the database and gets the old value.
- Request B writes the old value into the cache.
- Request A writes the new value into the database.
This situation can lead to inconsistencies. Moreover, if no expiration time strategy is set for the cache, the data will remain dirty indefinitely.
Solution: Delayed Double Delete Strategy
- First, delete the cache.
- Then write to the database (these two steps are the same as before).
- Sleep for a short time, such as 1 second, and delete the cache again.
This way, any dirty data generated within 1 second can be cleared, and this sleep time needs to be evaluated based on the actual business scenario.
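A minimal sketch of the strategy, assuming the Jedis client; writeDb is a hypothetical stand-in, and the 1-second sleep is the tunable delay discussed above:

```java
import redis.clients.jedis.Jedis;

public class DelayedDoubleDelete {
    static void updateValue(Jedis jedis, String key, String newValue) throws InterruptedException {
        jedis.del(key);           // 1. delete the cache
        writeDb(key, newValue);   // 2. write the database (hypothetical DB call)
        Thread.sleep(1000);       // 3. wait long enough for in-flight stale reads to finish
        jedis.del(key);           // 4. delete again to clear any dirty value written meanwhile
    }

    static void writeDb(String key, String value) { /* stand-in for a real DB update */ }
}
```

In practice the second delete is often dispatched asynchronously, so the write request does not have to wait out the sleep itself.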
Update the Database First, Then Delete the Cache (Common Strategy)#
- Invalidation: the application first looks in the cache; on a miss it reads the database and, on success, puts the value into the cache.
- Hit: the application reads the data from the cache and returns it.
- Update: the application first writes the data to the database and, on success, invalidates the cache entry by deleting it.
Problem Analysis:
Assuming there are two requests, one request A performs a query operation, and another request B performs an update operation, the following situation may arise:
1. The cache just expires.
2. Request A queries the database and gets the old value.
3. Request B writes the new value into the database.
4. Request B deletes the cache.
5. Request A writes the old value into the cache.
Data inconsistency requires the database write in step 3 to complete faster than the database read in step 2, so that steps 4 and 5 happen in that order. Since database reads are generally much faster than writes, the probability of this is quite low.