
Delete Keys Matching a Pattern with redis-py-cluster

Python Redis

We have a Python web service that stores some key-value pairs in Redis. Occasionally, I want to delete the keys matching a certain pattern. Currently, we are using redis-py-cluster for Redis-related operations.

We can use the scan_iter() method to find such keys and then delete them. The first parameter of scan_iter() is the pattern to match. The code looks roughly like this:

for k in redis_client.scan_iter("prefix:*"):
    redis_client.delete(k)

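The above code works, but it is awfully slow: every matching key costs a separate delete call, and each scan step covers only a handful of keys. We can add the count option to accelerate deletion. The count option is a hint for roughly how many keys Redis should examine per scan call. According to the Redis documentation for SCAN, the default count is 10, which is rather small.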

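batch_size = 500
keys = []

# Scan with a larger count and collect keys into batches
for k in redis_client.scan_iter("prefix:*", count=batch_size):
    keys.append(k)
    if len(keys) >= batch_size:
        # Delete a full batch with one call, then start a new batch
        redis_client.delete(*keys)
        keys = []
# Delete whatever is left in the last, partial batch
if keys:
    redis_client.delete(*keys)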
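
If this cleanup is needed in more than one place, the batching loop can be wrapped in a small helper. The following is only a sketch: the name delete_by_pattern is hypothetical, and it assumes the client exposes scan_iter(match=..., count=...) and delete(*keys) the way redis-py-cluster does. Because a scan may report the same key more than once, the returned number counts keys handed to delete(), not necessarily keys that existed.

```python
def delete_by_pattern(client, pattern, batch_size=500):
    """Delete keys matching `pattern` in batches of at most `batch_size`.

    `client` is assumed to provide scan_iter(match=..., count=...) and
    delete(*keys). Returns the number of keys passed to delete().
    """
    total = 0
    batch = []
    for key in client.scan_iter(match=pattern, count=batch_size):
        batch.append(key)
        if len(batch) >= batch_size:
            client.delete(*batch)  # one call per full batch
            total += len(batch)
            batch = []
    if batch:
        client.delete(*batch)  # flush the final, partial batch
        total += len(batch)
    return total
```

With the client from this post, a call might look like delete_by_pattern(redis_client, "prefix:*", batch_size=1000).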

Using a larger count speeds up the deletion process significantly, although the gain levels off beyond a batch size of about 4000. I benchmarked this on about 20000 keys. Here is what I found:

batch size  Time taken (seconds)
100         929
500         260
1000        175
2000        133
4000        107
5000        106
10000       93
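
To repeat this kind of measurement, a small timing wrapper is enough. This is a sketch of one possible setup, not the harness used for the numbers above; time_call is a hypothetical helper name, and it only relies on time.perf_counter() from the standard library.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed
```

For example, if the batched deletion loop is wrapped in a function run(batch_size) (a hypothetical name), then time_call(run, 1000) returns its result together with the elapsed time. The test keys have to be recreated before each run, otherwise later runs have nothing left to delete.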
