This document explains how to install and configure Scrapyd so that you can deploy and run your Scrapy spiders.
Scrapyd depends on the following libraries, but the installation process takes care of installing the missing ones:
- Python 2.6 or above
- Twisted 8.0 or above
- Scrapy 0.17 or above
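If you want to check a set of installed versions against the minimums listed above, the comparison can be sketched as below. This is only an illustration: `parse_version`, `meets_minimum` and `unmet` are hypothetical helper names, not part of Scrapyd, and the parsing handles plain dotted version strings only.

```python
import re

# Minimum versions copied from the dependency list above.
MINIMUMS = {"Python": "2.6", "Twisted": "8.0", "Scrapy": "0.17"}


def parse_version(text):
    """Turn '0.17.1' into (0, 17, 1); stop at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two dotted versions numerically, padding the shorter with zeros."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def unmet(installed_versions):
    """Return the sorted names whose installed version is below the minimum."""
    return sorted(
        name
        for name, minimum in MINIMUMS.items()
        if not meets_minimum(installed_versions.get(name, "0"), minimum)
    )
```

Comparing numeric tuples rather than raw strings avoids mistakes such as treating "0.9" as newer than "0.17".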
Installing Scrapyd (generic way)
How to install Scrapyd depends on the platform you’re using. The generic way is to install it from PyPI:
pip install scrapyd
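After the install finishes, one quick sanity check is to see whether a `scrapyd` command is reachable on your PATH. The sketch below assumes that the package installs a command with that name and that you are running Python 3.3 or later (for `shutil.which`). `locate_scrapyd` is a hypothetical helper name.

```python
import shutil


def locate_scrapyd(command="scrapyd"):
    """Return the full path of the command if it is on PATH, otherwise None."""
    return shutil.which(command)


if __name__ == "__main__":
    path = locate_scrapyd()
    print(path if path else "scrapyd not found on PATH")
```

A result of `None` does not necessarily mean the install failed. The command may have been placed in a directory (for example a virtualenv's script directory) that is not on the current PATH.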
If you plan to deploy Scrapyd on Ubuntu, it comes with official Ubuntu packages (see below) that install it as a system service, which eases administration.
Other distributions and operating systems (Windows, Mac OS X) don’t yet have specific packages. On those platforms you need to use the generic installation mechanism, configure paths yourself, and set Scrapyd up to run as a system service. You are very welcome to contribute Scrapyd packages for your platform of choice: just send a pull request on GitHub.
Installing Scrapyd in Ubuntu
Scrapyd comes with official Ubuntu packages ready to use on your Ubuntu
servers. They are shipped in the same APT repos as Scrapy, which can be added
as described in Scrapy Ubuntu packages. Once you have added the Scrapy APT
repos, you can install Scrapyd with:
apt-get install scrapyd
This will install Scrapyd on your Ubuntu server, creating a user
which Scrapyd will run as. It will also create the directories and files described below:
Scrapyd main log file.
The standard output captured from the Scrapyd process and any sub-process spawned from it.
The standard error captured from Scrapyd and any sub-process spawned
from it. Remember to check this file if you’re having problems, as the errors
may not get logged to the
Besides the main service log file, Scrapyd stores one log file per crawling process in:
ID is a unique id for the run.
Directory used to store data files (uploaded eggs and spider queues).