base

onion_peeler.spiders.base

ConfigDrivenSpider(site_id=None, site=None, start_url=None, item_type=None, *args, **kwargs)

Bases: Spider

Base spider that drives extraction from site TOML configuration.

Source code in src/onion_peeler/spiders/base.py
def __init__(
    self,
    site_id: str | None = None,
    site: str | None = None,
    start_url: str | None = None,
    item_type: str | None = None,
    *args,
    **kwargs,
):
    super().__init__(*args, **kwargs)
    # Prefer site_id; fall back to site when site_id is None or empty.
    self.site_id = site_id or site
    self.site_config = load_site_config(self.site_id)
    # Optional overrides; both stay None when not supplied.
    self._override_start_url = start_url
    self._override_item_type = item_type
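
The argument handling above can be tried in isolation. The following is a minimal sketch, not the library's code: `SketchSpider` is a hypothetical stand-in that copies only the assignments from `__init__`, and `load_site_config` is replaced by a stub that returns a dict (the real function presumably reads the site's TOML file, which is not shown here). It illustrates that `site_id` takes precedence over `site`, and that an empty `site_id` falls through to `site` because of the `or`.

```python
# Hypothetical stub: the real load_site_config is not shown in this page.
def load_site_config(site_id):
    return {"site_id": site_id}


class SketchSpider:
    """Stand-in that mirrors only the argument handling of ConfigDrivenSpider."""

    def __init__(self, site_id=None, site=None, start_url=None, item_type=None):
        # Same resolution rule as the source: site_id wins, site is the fallback.
        self.site_id = site_id or site
        self.site_config = load_site_config(self.site_id)
        self._override_start_url = start_url
        self._override_item_type = item_type


by_id = SketchSpider(site_id="example")
by_site = SketchSpider(site="example")
both = SketchSpider(site_id="primary", site="secondary")
empty_id = SketchSpider(site_id="", site="fallback")
neither = SketchSpider()

print(by_id.site_id, by_site.site_id, both.site_id, empty_id.site_id, neither.site_id)
```

Note the last case: when neither argument is given, `site_id` resolves to `None` and is passed to `load_site_config` unchanged, so how that case behaves depends on the real loader. When the spider is run through Scrapy, arguments given with `-a name=value` arrive as keyword arguments to `__init__`, which is how `site_id`, `site`, `start_url`, and `item_type` would normally be supplied.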