- Headline
Django, Celerybeat and Celery with MongoDB as the Broker
- Date
- November 22nd, 2011
- Category
- Developer
- Story
Enhance your user experience by preventing your users from having to wait long periods of time for certain actions to occur. There are times when you want to send an email to many people or do other processor-intensive work that your user shouldn’t have to wait on. In cases like these, it’s smart to set up background tasks that you can kick off so your user can continue browsing your site and performing actions without long wait times.
Below I’ll show you how to set up Celery and Celerybeat using MongoDB as the broker. I’m running Django 1.3 with a MongoDB backend. This guide assumes you already have Django running with MongoDB properly configured.
Celery
Celery is a distributed task queue that will assist with such background tasks. Celery executes tasks synchronously or asynchronously with a set of worker nodes, or processes, that run in the background listening and waiting until they are called upon to perform tasks.
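To make the idea concrete, here is a toy, in-process sketch of the pattern (plain Python, not Celery): a background worker pulls jobs off a queue and runs them while the caller continues on.

```python
# Toy illustration of the worker-queue pattern (not Celery itself):
# a background thread consumes tasks while the caller keeps going.
import threading
import queue

tasks = queue.Queue()
results = []

def worker():
    while True:
        func, args = tasks.get()
        if func is None:      # sentinel tells the worker to stop
            break
        results.append(func(*args))

t = threading.Thread(target=worker)
t.start()

tasks.put((sum, ([1, 2, 3],)))  # enqueue work; this call returns immediately
tasks.put((None, None))         # shut the worker down
t.join()                        # wait for the worker to finish

print(results)  # [6]
```

Celery does the same thing at a much larger scale: the queue lives in your broker (MongoDB here), and the workers are separate processes, possibly on separate machines.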
Install
Use pip to install Celery and django-celery to get started:
pip install celery
pip install django-celery
This installs Celery, django-celery, and their dependencies:
celery 2.4.1
anyjson 0.3.1              # Dependency for celery
kombu 1.4.3                # Dependency for celery
amqplib 1.0.2              # Dependency for celery
django-celery 2.4.1
django-picklefield 0.1.9   # Dependency for django-celery
Verify everything was installed correctly by opening the Python shell and importing celery:
$ python
>>> import celery
>>>
If you don’t see any errors then Celery was successfully installed.
Settings.py
Once you have Celery installed, update your Django settings.py file with the djcelery app:
INSTALLED_APPS += ('djcelery',)
and include the Celery configuration using MongoDB as the broker:
CELERY_RESULT_BACKEND = "mongodb"
CELERY_MONGODB_BACKEND_SETTINGS = {
    "host": "127.0.0.1",
    "port": 27017,
    "database": "celery",
    "taskmeta_collection": "my_taskmeta",  # Collection name to use for task output
}

BROKER_BACKEND = "mongodb"
BROKER_HOST = "localhost"
BROKER_PORT = 27017
BROKER_USER = ""
BROKER_PASSWORD = ""
BROKER_VHOST = "celery"

# Find and register all celery tasks. Your tasks need to be in a
# tasks.py file to be picked up.
CELERY_IMPORTS = ('my_tasks.tasks', )
You should now be able to start Celery through your project's manage.py file:
$ python manage.py celeryd
Note: If you see Task Not Registered error messages, you likely need to add the tasks.py file to the CELERY_IMPORTS setting.
Writing Your First Task
Now let’s create a task and test that everything is working properly. Create the following my_tasks app in your project:
- my_project
  |- my_tasks
     |- __init__.py
     |- tasks.py
  - __init__.py
  - settings.py
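If you'd rather create the skeleton programmatically, a small sketch (run from inside the my_project directory; the app and file names match the tree above):

```python
# Create the my_tasks app skeleton shown above (run from my_project/).
import os

os.makedirs("my_tasks", exist_ok=True)
for name in ("__init__.py", "tasks.py"):
    # "touch" the file without clobbering it if it already exists
    open(os.path.join("my_tasks", name), "a").close()
```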
Make sure my_tasks is in your INSTALLED_APPS and that your tasks.py module is registered in CELERY_IMPORTS in settings.py:
INSTALLED_APPS += ('my_tasks',)
CELERY_IMPORTS = ('my_tasks.tasks',)
Inside your tasks.py file, create your first task:
from celery.decorators import task

@task()
def add(x, y):
    return x + y
Celery registers tasks at startup, so restart Celery to pick up the new task. Once it has restarted, open a new terminal, launch the Python shell from the project root, and run the task we just created:
$ python manage.py shell
>>> from my_tasks import tasks
>>> result = tasks.add.delay(5, 5)
>>> result.ready()
True
>>> result.successful()
True
If the output of result.ready() is True then your task setup was successful!
Celery init.d
Once your project is configured, you'll want an easy way to start, stop and restart the Celery processes running in the background. Celery provides a very useful generic init.d script to assist you with this, which can be found at:
https://github.com/ask/celery/blob/master/contrib/generic-init.d/celeryd
Copy that file as is, place it in /etc/init.d/celeryd, and make it executable (chmod +x /etc/init.d/celeryd).
Celery Defaults
Next, create a file with your Celery settings at /etc/default/celeryd. This is where the Celery init.d script will look for your default settings.

# Name of nodes to start, here we have a single node
#CELERYD_NODES="w1"
# Or use as many nodes as you like. Here we have 4 worker nodes.
# Use 1 worker per server core.
CELERYD_NODES="w1 w2 w3 w4"

# Where to chdir at start.
CELERYD_CHDIR="/path/to/your/project"

# Log level. This will be deprecated in Celery 3.0. Can be one of
# DEBUG, INFO, WARNING, ERROR or CRITICAL.
CELERYD_LOG_LEVEL="INFO"

# How to call celeryd (point this at your virtualenv's python if you use one).
CELERYD="python $CELERYD_CHDIR/manage.py celeryd --loglevel=$CELERYD_LOG_LEVEL"

# How to call "manage.py celeryd_multi"
CELERYD_MULTI="$CELERYD_CHDIR/manage.py celeryd_multi"

# Task hard time limit in seconds. The worker processing the task
# will be killed and replaced with a new one when this is exceeded.
# 86400 = 24 hours
CELERYD_TASK_TIME_LIMIT=86400

# Extra arguments to celeryd
CELERYD_OPTS="--concurrency=8"

# Name of the celery config module.
CELERY_CONFIG_MODULE="celeryconfig"

# %n will be replaced with the nodename.
CELERYD_LOG_FILE="/var/log/celeryd_%n.log"
CELERYD_PID_FILE="/var/run/celeryd_%n.pid"

# Workers should run as an unprivileged user. Don't run celery as root!
CELERYD_USER="celery"
CELERYD_GROUP=""

# Name of the project's settings module.
export DJANGO_SETTINGS_MODULE="settings"
Once you have those two files in place you should be able to start Celery using the init.d script:
$ /etc/init.d/celeryd start
celeryd-multi v2.2.6
> Starting nodes...
    > w1.some-comp-name: OK
Celerybeat
Celerybeat is a background scheduler process that lets you run periodic, cron-like tasks.
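For example, periodic tasks can be declared in settings.py via the CELERYBEAT_SCHEDULE setting. A minimal sketch, assuming a hypothetical my_tasks.tasks.send_report task exists:

```python
# settings.py -- hypothetical schedule: run send_report every 30 minutes.
# The task name and interval here are assumptions for illustration.
from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    "send-report-every-30-minutes": {
        "task": "my_tasks.tasks.send_report",
        "schedule": timedelta(minutes=30),
    },
}
```

Celerybeat reads this schedule when it starts and dispatches each entry to the workers on its interval.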
Celerybeat init.d
Celery also provides a generic init.d script for Celerybeat that makes starting, stopping and restarting it very easy:
https://github.com/ask/celery/blob/master/contrib/generic-init.d/celerybeat
Copy that file as is, place it in /etc/init.d/celerybeat, and make it executable (chmod +x /etc/init.d/celerybeat).
Celerybeat Settings
Celery and Celerybeat default settings can live in the same file if you like. Open the Celery settings file we created earlier:
$ vi /etc/default/celeryd
and append the following Celerybeat settings to the bottom of the file:
# Where to chdir at start.
CELERYBEAT_CHDIR="/path/to/your/project"

# Path to celerybeat
CELERYBEAT="$CELERYBEAT_CHDIR/manage.py celerybeat"

# Extra arguments to celerybeat. The schedule file is generated
# automatically when Celerybeat starts.
CELERYBEAT_OPTS="--schedule=/var/run/celerybeat-schedule"

# Log level. Can be one of DEBUG, INFO, WARNING, ERROR or CRITICAL.
CELERYBEAT_LOG_LEVEL="INFO"

# Log file locations
CELERYBEAT_LOGFILE="/var/log/celerybeat.log"
CELERYBEAT_PIDFILE="/var/run/celerybeat.pid"

# Celerybeat should run as an unprivileged user. Don't run as root!
CELERYBEAT_USER="celery"
CELERYBEAT_GROUP=""

Once saved, start Celerybeat the same way you started the worker:

$ /etc/init.d/celerybeat start
Don’t Run Celery as the Root User
Celery recommends not running Celery or Celerybeat as the root user. One way to check which user is running Celery is to issue the following command in a terminal:
$ ps aux | grep celery
This will show all running processes related to Celery. The first column of the output shows the user running each process. If you see something like:
root ... /usr/bin/python /path/to/your/project/manage.py celeryd --loglevel=INFO --concurrency=8 -n w1.some_comp_ip --logfile=/var/log/celeryd_w1.log --pidfile=/var/run/celeryd_w1.pid
then root is the user running the process. Create a new unprivileged user to run these processes instead:
$ adduser --system --no-create-home --disabled-login --disabled-password --group celery
That’s it! You should now have Celery worker(s) running in the background ready to run tasks from your application!
- Comments
- 2 Comments »
Nice article, but I think there is a typo in the last sentence:
did you mean “not” or “now”, when you said:
“You should not have Celery worker(s) running in the background ready to run tasks from your application!”

Got it. Thanks Nick!