Nine Django Celery Performance Tips

I have been working with Celery tasks a lot this year, and I want to share my experience and thoughts on how to approach performance optimization, improve visibility into your background jobs, and simplify certain operations.

  1. Keep an eye on RabbitMQ memory consumption.
    Messages that are not processed (because no worker is reading from the queue) will pile up until the server stops accepting new tasks.
  2. For recurring scheduled tasks, make sure you use locking (with cache.lock("lock_name")).
    If a recurring task has some ID or argument, you might want to include that identifier in the lock name (with cache.lock(f"lock_name_{object_id}")).
    When using locking, don’t forget the timeout argument. It is needed for cases when a worker stops mid-task, e.g. due to a restart during a new release deployment – the lock must not live forever. At the same time, the timeout must be long enough not to expire before the task finishes its work. Otherwise, there will be a race condition, with two copies of the task running at the same time.
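The per-object lock semantics can be illustrated with a minimal in-process stand-in – a plain dict instead of Redis, purely for illustration; in production this role is played by cache.lock() from the django-redis backend, which has the same timeout behavior:

```python
import time

# In-process stand-in for a distributed cache lock (illustration only;
# a real setup would use django-redis's cache.lock()).
# Maps lock name -> expiry timestamp.
_locks = {}

def acquire(name, timeout):
    """Take the lock unless another holder exists and has not expired yet."""
    now = time.monotonic()
    expiry = _locks.get(name)
    if expiry is not None and expiry > now:
        return False  # still held by another worker
    _locks[name] = now + timeout  # expires on its own if never released
    return True

def release(name):
    _locks.pop(name, None)

# Including the object id in the lock name means tasks for different
# objects don't block each other, while duplicates for the same object
# are skipped.
assert acquire("import_file_1", timeout=600)
assert acquire("import_file_2", timeout=600)      # different object: fine
assert not acquire("import_file_1", timeout=600)  # same object: skipped
```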
  3. Have separate workers and queues for tasks that can clog your queue. You can have a default queue for most of the tasks, but there may be one specific task that takes a lot of time to finish, or a type of task that is spawned in the hundreds. Until they all get processed, no other task can run. To address this problem, you can make a separate queue and a dedicated worker for it. That way this type of task will not affect other tasks, especially time-sensitive ones.
    Example: your web app processes video files, and there are a lot of such tasks in the queue – enough to occupy it for a couple of hours. A user comes to the site and wants to reset a password. If the app uses the same queue for video processing tasks and for sending password reset emails, the user will have to wait. But if you make a dedicated queue and worker for video processing, while emails stay in the default queue and are handled by the default worker, sending emails will not be affected by video processing tasks at all.
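A sketch of how such routing might look in the Celery config – the task and queue names here are made up for illustration:

```python
# celery.py: route the heavy task to its own queue; everything else
# keeps going to the default "celery" queue.
app.conf.task_routes = {
    "videos.tasks.process_video": {"queue": "video"},  # hypothetical task
}

# Then run one worker per queue, for example:
#   celery -A proj worker -Q celery -n default@%h
#   celery -A proj worker -Q video -n video@%h
```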
  4. Concurrency doesn’t always make things faster. If you have a task that imports data from CSV files and stores it in the database, and you have a lot of such tasks, you might want to run them in parallel. But if such a task does very little processing and most of its time is spent reading from and writing to the database, then running more of them at the same time will only make the overall process slower: the database can lock the whole table while inserting/updating records, so other processes have to wait, and it also has to handle the concurrent requests. As a result, each of those tasks works slower, and you gain no performance benefit.
  5. For recurring scheduled tasks, use the “expires” argument. If a task is meant to run every 30 minutes, and due to a busy queue or performance degradation it gets called every 40-50 minutes, tasks will start to pile up. To avoid that, make sure you include the “expires” option, with a value a bit smaller than your schedule interval. Example: the task runs every 30 minutes, which is 30*60 = 1,800 seconds. Make the expiration 30*60-5 = 1,795 seconds, so that when a stale copy of the task arrives it gets discarded as expired, and a fresh one will come shortly.
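In a beat schedule this could look like the following – the task name is a placeholder:

```python
# celery.py: a recurring task whose stale copies expire just before
# the next scheduled run.
app.conf.beat_schedule = {
    "sync-data-every-30-minutes": {
        "task": "app.tasks.sync_data",        # hypothetical task
        "schedule": 30 * 60,                  # every 1,800 seconds
        "options": {"expires": 30 * 60 - 5},  # discard copies older than 1,795 s
    },
}
```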
  6. Pass task progress and results via the cache, not via task results. This approach can simplify your code and make reading these results agnostic of how the data is calculated.
    There is a way to save the task ID when the task is called and make the task update its progress – for example, how many lines of the CSV file were processed.
    But over time, I found that it is much more convenient to use the Redis cache for this and avoid dealing with Celery task metadata.
    The easier way is to store the progress data as a JSON string in the Redis cache, with the object id in the key.
    Imagine you upload a file and want to see the progress of processing it. You can pull this data from the cache with a key like this: cache.get(f"file_process_task_{file_id}"). On the Celery task side, you write to this cache like this: cache.set(f"file_process_task_{file_id}", json.dumps(dict(done=5, total=15)), timeout=3000).

    So workflow here is the following:
    User uploads a file. The task is created. The cache key is empty. In this case, we assume processing has not started yet, and we display the message “Processing is about to start” to the user.
    Then the task starts working and sets the cache value with done=0 and total=15.
    Every 5, 10 or 100 lines, it updates the cache to reflect the number of processed lines.
    If the user refreshes the page, the data is pulled from the cache and the page says “done 5 out of 15”.
    Then, when the file is fully processed, it should be reflected somewhere in the database, and the cache is not checked anymore. The timeout=3000 argument says when the cache entry should expire, in seconds. We don’t need to do anything to clean up this cache entry – after 3000 seconds it will go away on its own.
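The progress payload itself is just a small JSON document; here is a minimal sketch of helpers around it, following the key format and field names from the example above:

```python
import json

def progress_key(file_id):
    """Cache key shared by the task (writer) and the view (reader)."""
    return f"file_process_task_{file_id}"

def encode_progress(done, total):
    """Serialize the progress counters for cache.set()."""
    return json.dumps(dict(done=done, total=total))

def decode_progress(raw):
    """None means the task has not started yet ("about to start")."""
    if raw is None:
        return None
    return json.loads(raw)

# In the task:  cache.set(progress_key(file_id), encode_progress(5, 15), timeout=3000)
# In the view:  decode_progress(cache.get(progress_key(file_id)))
```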

  7. Keep track of how much time it takes to complete each task over time. You can search the logs for lines saying that a task succeeded in N seconds. The problem with background tasks is that when they start to work slowly, it is not immediately obvious the way it is for the web request-response case, where you can see that a page is slow to load. You have to make an additional effort to track what’s going on in the background life of your app.
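Celery itself logs lines like “Task … succeeded in Ns”, which you can grep; if you also want timings under your own control (for metrics or alerting), a small decorator is enough. A minimal sketch:

```python
import functools
import logging
import time

logger = logging.getLogger("tasks.timing")

def timed(func):
    """Log how long each call takes, so slowdowns show up in the logs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.monotonic()
        try:
            return func(*args, **kwargs)
        finally:
            logger.info("%s took %.2fs", func.__name__, time.monotonic() - start)
    return wrapper

@timed
def import_csv(lines):
    # stand-in for real task work
    return len(lines)
```

The same wrapper can sit under the @shared_task decorator; alternatively, Celery’s task_prerun/task_postrun signals let you time every task in one place.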
  8. Keep an eye on how many messages are in the queue, to catch the moment when the queue is overloaded. In that case, you might want to increase the concurrency of a worker, or have separate queues for different types of tasks: one queue for super time-sensitive tasks, another for regular tasks.
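To check queue depth with RabbitMQ, rabbitmqctl list_queues prints the message count per queue; programmatically, a passive queue declare asks the broker for the same number. A hypothetical helper, assuming an AMQP channel in the py-amqp/kombu style (e.g. obtained via app.connection_or_acquire() with a RabbitMQ broker):

```python
def queue_depth(channel, queue_name):
    """Return the number of messages waiting in a queue.

    A passive declare does not create the queue; it only asks the broker
    for the queue's current state, which includes the message count.
    """
    ok = channel.queue_declare(queue=queue_name, passive=True)
    return ok.message_count
```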
  9. Keep RAM-hungry tasks in a separate queue. This is somewhat similar to what I already wrote, but from a different angle. Imagine that most of your tasks can run on a simple machine, but some tasks require processing big files or datasets. In this case, you can put the worker for the queue with such tasks on a server with more RAM. Regular tasks don’t need that much RAM, so you don’t have to scale up the servers (dynos, if you use Heroku) for the regular worker, which as a result will save you some money.
