Celery: Distributed Task Queue
Celery is an asynchronous task queue/job queue based on distributed message passing. It's normally focused on real-time operations, but it can also run tasks on a schedule. You can use Celery to process code asynchronously so your server/user can keep going while Celery takes care of its tasks. Some examples of great times to use Celery are:
- Generating assets after upload
- Notifying a set of users when an event happens
- Keeping a search index up to date
- Replacing cronjobs (backups, cleanup)
It does this by using a distributed system for processing messages on a task queue, with a focus on real-time processing and support for task scheduling. These tasks can be triggered by web requests, by other tasks, or on a schedule. When a task is triggered, Celery (which lives in your application) sends it to a message broker (e.g. RabbitMQ), and a worker picks it up from there to actually do the work. If this sounds confusing, don't worry; here is a visual:
Setup Choices
- Uses a Message Broker
- Uses a Result Backend (or no result backend)
- Uses a concurrency solution (Multiprocessing or green threads)
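To make these three choices a little more concrete, here is a minimal sketch of how they might appear as settings in a config module. The uppercase names follow the older Celery 3.x convention (newer releases spell them `broker_url`, `result_backend` and `worker_concurrency`), and every value below is an assumption for illustration, not a recommendation.

```python
# Hypothetical config-module settings, one per setup choice.
# All values here are illustrative assumptions.

# 1. Message broker: where task messages are sent (a local RabbitMQ here).
BROKER_URL = "amqp://guest:guest@localhost:5672//"

# 2. Result backend: where return values would be stored.
#    None means task results are not kept anywhere.
CELERY_RESULT_BACKEND = None

# 3. Concurrency: how many worker processes to run at once.
CELERYD_CONCURRENCY = 4
```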
So before you start, you'll need to decide which message broker you'll use. The two most popular are Redis and RabbitMQ. There are other message brokers, but these two are the best supported. I decided to use RabbitMQ because I don't need to store the results somewhere else, but I also thought that since I'm using Celery, RabbitMQ is a great name for its pair.
Another major feature of Celery is that it's easy to test the code. Since it's in Python, your task tests can live in tests.py along with the rest of your app.
To start a Celery project, you'll first need to pip install Celery and add the boilerplate code:
command line:
[code language="text"]
$ pip install celery
[/code]
python script:
[code language="python"]
from celery import Celery

# 'tasks' is the app name; the broker URL points at a local RabbitMQ (AMQP).
app = Celery('tasks', broker='amqp://')

@app.task
def add(x, y):
    return x + y
[/code]
Example with Heartbeat:
I decided to make a silly app that announces which Doctor Who doctor is the favorite. I started by creating the config file, where I tell Celery what to do: the broker host information, which task to run, and the "heartbeat", which is the scheduler.
In my client.py I instantiate a Celery object and pass the celeryconfig.py module to the instance's config_from_object method.
Now the fun part happens! In my task.py, I import task so I can use it as a decorator (@task), which tells Celery that this is the task it should run. I define my function directly under the decorator, and it does not take any parameters. The function opens a text file that starts out containing only the number 1. It reads the file, strips the whitespace on the right, and splits the string into a list of strings. Then it prints that nums[0] is the computer's favorite doctor and loops over the remaining elements in the list, printing that each next doctor is the computer's favorite doctor.
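The file handling in this task can be sketched without Celery at all. The version below is only an approximation of what the post describes: the function name, the message wording, the `path` parameter (the real task takes no parameters) and the choice to append the new number after a space are all assumptions made so the sketch can run on its own. A plain function stands in for the suffix helper.

```python
import os
import tempfile


def ordinal(n):
    """Return n with an English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the other teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def favorite_doctor(path):
    """Print one message per number in the file, then append the next number."""
    with open(path) as f:
        nums = f.read().rstrip().split()

    # Hypothetical wording; the original strings are not shown in the post.
    messages = [f"The {ordinal(int(nums[0]))} doctor is the computer's favorite doctor."]
    for num in nums[1:]:
        messages.append(f"Now the {ordinal(int(num))} doctor is the computer's favorite doctor.")
    for message in messages:
        print(message)

    # Take the last number, add one, and write it back (assumed: appended).
    next_num = int(nums[-1]) + 1
    with open(path, "a") as f:
        f.write(f" {next_num}")
    return messages


# Usage: start from a file containing only "1", then run the function twice.
path = os.path.join(tempfile.mkdtemp(), "doctors.txt")
with open(path, "w") as f:
    f.write("1")
first_run = favorite_doctor(path)
second_run = favorite_doctor(path)
```

Each run prints one more line than the previous one, which is what makes the scheduled output visibly grow over time.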
I used a lambda expression that takes in a number and returns it as a string with the suffix 'st', 'nd', 'rd' or 'th'. That way the sentence sounds more natural. The function prints its string(s) to the console. At the end it grabs the last element, converts it into an integer, and increases it by one. Then it writes the increased number to the file.
To run this, you'll need to download RabbitMQ so that Celery has a message broker to use. Then you'll need two terminals, one to run RabbitMQ:
[code language="text"]
$ cd rabbitmq_server-3.5.6/
$ sbin/rabbitmq-server
[/code]
and the other (at the level of your client.py file) for the worker:
[code language="text"]
$ celery worker -l info --beat
[/code]
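The `--beat` flag starts the scheduler alongside the worker, and the scheduler reads its timetable from the config file. As a rough sketch, a celeryconfig.py for this app might look something like the following. It uses the older uppercase setting names from the Celery 3.x era (newer releases use `broker_url`, `imports` and `beat_schedule`), and the task name `favorite_doctor`, the schedule entry name, the 10-second interval and the broker credentials are all assumptions rather than the values from my project.

```python
from datetime import timedelta

# Hypothetical celeryconfig.py; names and values below are assumptions.

# Broker host information: a local RabbitMQ with its default guest account.
BROKER_URL = "amqp://guest:guest@localhost:5672//"

# Modules the worker should import so it can find the task (task.py).
CELERY_IMPORTS = ("task",)

# The "heartbeat": run the task on a fixed interval.
CELERYBEAT_SCHEDULE = {
    "favorite-doctor-heartbeat": {
        "task": "task.favorite_doctor",     # hypothetical task name
        "schedule": timedelta(seconds=10),  # assumed interval
    },
}
```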
Check out the code here
For a more robust example of Celery with a heartbeat that interacts with a database, check out Rideminder.
You can also see a video of this code:
https://youtu.be/waD4MEj8WGw