Add static spider (mostly for testing)
wvengen committed Feb 15, 2024
1 parent 423aea2 commit df90e1a
Showing 2 changed files with 24 additions and 1 deletion.
12 changes: 11 additions & 1 deletion README.md
@@ -12,7 +12,15 @@ so that one can get started easily.

## Scraped site

This spider returns quotes from [quotes.toscrape.com](https://quotes.toscrape.com).
This project contains two spiders. The `quotes` spider returns quotes from
[quotes.toscrape.com](https://quotes.toscrape.com).

The `static` spider returns a single dummy quote without accessing the network.
This can be used for testing. There are several settings and environment variables
that modify its behaviour:
- spider setting `STATIC_TEXT` - quote text (default _To be, or not to be_)
- spider setting `STATIC_AUTHOR` - quote author (default _Shakespeare_)
- environment variable `STATIC_TAGS` - quote tags (default _static_)
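
As a rough sketch of how these overrides might be supplied from a test harness: Scrapy's `-s NAME=VALUE` command-line option sets a setting for a single run, while `STATIC_TAGS` is read from the process environment. The helper names `static_crawl_invocation` and `run_static_crawl` are hypothetical and not part of this commit.

```python
import os
import subprocess


def static_crawl_invocation(text=None, author=None, tags=None):
    """Build argv and environment for `scrapy crawl static` (hypothetical helper).

    Arguments left as None are not passed, so the spider's defaults apply.
    """
    argv = ["scrapy", "crawl", "static"]
    if text is not None:
        # Spider settings are overridden with Scrapy's -s NAME=VALUE option.
        argv += ["-s", f"STATIC_TEXT={text}"]
    if author is not None:
        argv += ["-s", f"STATIC_AUTHOR={author}"]

    # The tags come from an environment variable, not from a setting.
    env = dict(os.environ)
    if tags is not None:
        env["STATIC_TAGS"] = tags
    return argv, env


def run_static_crawl(**overrides):
    """Run the crawl in a subprocess; assumes `scrapy` is on PATH in the project directory."""
    argv, env = static_crawl_invocation(**overrides)
    return subprocess.run(argv, env=env, check=True)
```

Calling `run_static_crawl(text="Hello", tags="demo")` from the project directory would then be roughly equivalent to running `STATIC_TAGS=demo scrapy crawl static -s STATIC_TEXT=Hello` in a shell.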

## Running locally

@@ -33,6 +41,7 @@ $ scrapy list
```
> ```
> quotes
> static
> ```
```sh
$ scrapy crawl quotes
@@ -67,6 +76,7 @@ docker run --rm example scrapy list
```
> ```
> quotes
> static
> ```
```sh
13 changes: 13 additions & 0 deletions example/spiders/static_spider.py
@@ -0,0 +1,13 @@
import os
from scrapy import Spider

class StaticSpider(Spider):
    name = "static"
    start_urls = ["file:///dev/null"]

    def parse(self, response):
        yield {
            'text': self.settings.get('STATIC_TEXT', 'To be or not to be'),
            'author': self.settings.get('STATIC_AUTHOR', 'Shakespeare'),
            'tags': os.getenv('STATIC_TAGS', 'static')
        }
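
Since the spider is mostly meant for testing, a test may want to compute the item it expects without importing Scrapy. The following is a minimal sketch that mirrors the lookups in `parse` above, assuming a plain `dict` as a stand-in for the Scrapy settings object. The function name `expected_static_item` is hypothetical and not part of this commit.

```python
import os


def expected_static_item(settings=None, environ=None):
    """Mirror StaticSpider.parse using plain mappings (hypothetical test helper).

    `settings` stands in for the Scrapy settings; `environ` defaults to os.environ.
    """
    settings = {} if settings is None else settings
    environ = os.environ if environ is None else environ
    return {
        # Text and author come from spider settings, with the defaults from the code.
        "text": settings.get("STATIC_TEXT", "To be or not to be"),
        "author": settings.get("STATIC_AUTHOR", "Shakespeare"),
        # Tags come from the environment and stay a single string, not a list.
        "tags": environ.get("STATIC_TAGS", "static"),
    }
```

Note that the default text in the code has no comma, unlike the default shown in the README, so a test comparing against the README wording would need to allow for that.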
