Description
Juan Riaza - Dive into Scrapy [EuroPython 2015] [21 July 2015] [Bilbao, Euskadi, Spain]
Scrapy is a fast, high-level screen scraping and web crawling framework used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.
This talk presents some advanced techniques based on how Scrapy is used at Scrapinghub.
Goals:
- Understand why it's necessary to Scrapy-ify early on.
- Anatomy of a Scrapy Spider.
- Using the interactive shell.
- What items are and how to use item loaders.
- Examples of pipelines and middlewares.
- Techniques to avoid getting banned.
- How to deploy Scrapy projects.