Docker Infrastructure Guide
Onion Peeler leverages Docker to provide a secure, isolated environment for scraping. This guide explains how the multi-container architecture works.
Prerequisites
- Docker Desktop, Docker Engine, or OrbStack.
- Docker Compose V2 (see the official Docker documentation for installation instructions).
Container Architecture
The deploy/docker-compose.yml defines three specialized services. This system uses the Sidecar Pattern, where containers share the same network stack.
| Service | Container Name | Role |
|---|---|---|
| `dw_vpn` | `dw_vpn` | The "gateway". Runs Gluetun with Mullvad VPN. |
| `tor` | `tor` | Provides a Tor SOCKS/HTTP proxy. Routes all its traffic through `dw_vpn`. |
| `scraper` | `scraper` | The application container. Inherits the network of `dw_vpn`. |
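1. dw_vpn (Gluetun)
The central security hub. It establishes a WireGuard connection to Mullvad. The cap_add: [NET_ADMIN] setting grants it the NET_ADMIN capability, which it needs to create the VPN tunnel interface and manage routing and firewall rules in its network namespace. Because the other containers join that namespace, their traffic follows the same routes.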
2. tor
This container runs a Tor client that exposes the SOCKS/HTTP proxy. Because it uses network_mode: "service:dw_vpn", its connections to the Tor entry nodes go through the VPN tunnel: the entry nodes see the VPN's IP address rather than yours.
3. scraper
This container houses the Python code. It is also in the VPN's network namespace, meaning any "clearweb" requests (like to check-ip services) will report the VPN's IP, not yours.
How it works: Proxy Middleware
Even though the scraper container is inside the VPN network, it still needs to know when to use the tor container's proxy.
- Clearweb Requests: Scrapy sends these directly. They exit through the `dw_vpn` gateway automatically.
- Onion Requests (`.onion`): The `ProxyMiddleware` inside Onion Peeler detects the `.onion` suffix and routes the request to `localhost:9050` (the Tor SOCKS proxy).
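Because the containers share a network namespace, they also share one loopback interface. localhost for the scraper refers to the namespace owned by the dw_vpn container, where the tor container's ports are also bound, so localhost:9050 reaches the Tor SOCKS proxy.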
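The routing decision described above can be sketched as a small pure function. This is a minimal illustration, not Onion Peeler's actual `ProxyMiddleware`: the function name `select_proxy`, the constant `TOR_SOCKS_PROXY`, and the `socks5h://` scheme (used by some HTTP clients to request DNS resolution through the proxy) are assumptions made for the example. Only the `.onion` suffix check and the `localhost:9050` address come from this guide.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumption: the Tor SOCKS port is reachable on the shared loopback interface.
# The "socks5h" scheme is illustrative; the real middleware may use another form.
TOR_SOCKS_PROXY = "socks5h://localhost:9050"


def select_proxy(url: str) -> Optional[str]:
    """Return the Tor proxy for .onion hosts, or None for clearweb hosts."""
    # Inspect only the hostname, so ".onion" in a path or query is ignored.
    host = urlsplit(url).hostname or ""
    if host.endswith(".onion"):
        return TOR_SOCKS_PROXY
    # Clearweb requests need no proxy: they already exit through the VPN.
    return None
```

A middleware built on this idea would call the function once per outgoing request and attach the returned proxy only when it is not `None`.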
Running CLI Commands with Docker
The most effective way to use the Scrapy CLI in the containerized environment is through docker compose run --rm scraper. This ensures the entire network stack (VPN + Tor) is active before the command executes.
One-Off Tasks (Recommended)
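As a rough sketch of what a one-off invocation looks like, the helper below assembles a `docker compose run --rm scraper` command line and hands it to `subprocess`. The helper names (`compose_run_command`, `run_in_scraper`) are hypothetical, and the sketch assumes it is run from the repository root so that `deploy/docker-compose.yml` resolves, and that the image does not define an entrypoint that already invokes `scrapy`.

```python
import subprocess
from typing import List, Sequence

COMPOSE_FILE = "deploy/docker-compose.yml"  # path named in this guide
SERVICE = "scraper"                         # service name from the table above


def compose_run_command(args: Sequence[str]) -> List[str]:
    """Build the argv for a one-off command in a throwaway scraper container."""
    # --rm removes the container once the command exits.
    return ["docker", "compose", "-f", COMPOSE_FILE, "run", "--rm", SERVICE, *args]


def run_in_scraper(args: Sequence[str]) -> int:
    """Run the command and return its exit code (requires Docker on the host)."""
    return subprocess.run(compose_run_command(args)).returncode
```

For example, `run_in_scraper(["scrapy", "list"])` would be equivalent to typing `docker compose -f deploy/docker-compose.yml run --rm scraper scrapy list` in a shell.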
Networking Visualization
graph LR
    subgraph Docker_Network
        VPN[dw_vpn / Mullvad]
        Tor[tor container]
        Scraper[scraper container]
    end
    Scraper -- Clearweb --> VPN
    Scraper -- .onion --> Tor
    Tor --> VPN
    VPN -- Internet --> WWW((World Wide Web))
    VPN -- Tor Entry Nodes --> DarkNet((Tor Network))
To monitor the health of this chain: