
How to Monitor Amazon Product Prices with Python Using ISP Proxies

In this tutorial, we build a system that monitors Amazon product prices. Behind the scenes, it uses ISP proxies.

Gunnar


Overview

In our small-sample testing, curl_cffi with impersonate="chrome" produced 10 successful fetches in 10 attempts against live Amazon product pages; the standard requests library returned CAPTCHA pages on 2 of 3 attempts in the same setup.

  • TLS impersonation. curl_cffi matches the Chrome TLS handshake, so requests are less likely to produce a fingerprint Amazon recognizes as automated.

  • Monthly cost. Around $70 ($65 for an entry-tier ISP proxy plan, $5 for a VPS), or around $65 on a free hosting tier such as Oracle Cloud's Always Free resources. The pricing here reflects HypeProxies' Pro plan; other providers will be in a similar range.

  • ISP versus datacenter proxies. Amazon classifies IPs by Autonomous System Number, or ASN (the network operator that owns each IP block). Datacenter ASNs such as AWS or OVH are challenged on most requests; consumer-ISP ASNs are challenged far less often.

  • Price extraction. The parser reads a JSON pricing blob that Amazon embeds in each product page, with CSS selectors as a fallback. This is more stable across DOM refactors than CSS-only parsing.

  • Sequential limit. Around 300 products at hourly cadence with sequential fetches; the ceiling scales further with async and a larger IP pool. To exceed it, replace the loop with curl_cffi.AsyncSession and asyncio.gather().
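
To make that upgrade concrete, here is a minimal sketch of the async variant, assuming the same proxy-URL format as config.py below; the proxy address and ASINs are placeholders, and error handling is omitted.

# async_fetch.py: minimal sketch of the AsyncSession + asyncio.gather() upgrade
import asyncio
from curl_cffi.requests import AsyncSession

PROXY = "http://your_username:your_password@proxy1.example.com:8000"  # placeholder
ASINS = ["B07MHJFRBJ", "B00NTCH52W", "B0CP9YB3Q4"]

async def fetch(session, asin):
    r = await session.get(
        f"https://www.amazon.com/dp/{asin}",
        proxy=PROXY, timeout=15, impersonate="chrome",
    )
    return asin, r.status_code

async def main():
    async with AsyncSession() as session:
        results = await asyncio.gather(*(fetch(session, a) for a in ASINS))
    for asin, status in results:
        print(asin, status)

asyncio.run(main())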

Source repository: github.com/triposat/amazon-price-monitor, a GitHub Actions cron variant of the same implementation.

Who this is for

This guide targets amazon.com. For other Amazon storefronts (amazon.de, amazon.co.uk, etc.), use proxies from the matching region, update the URL in fetch_product_page, and adjust the currency-symbol stripping in extract_price_text (the parser currently handles USD).
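
As a hedged sketch of that adjustment, a variant of extract_price_text (defined in Step 4) for European formats such as "24,42 €" might look like the following; the separator heuristic is an assumption, not something tested against every storefront.

def extract_price_text_eu(tag):
    """Hypothetical variant of extract_price_text for EUR/GBP price strings."""
    if tag is None:
        return None
    text = tag.get_text(strip=True).replace("\xa0", " ")
    for symbol in ("€", "£", "EUR", "GBP"):
        text = text.replace(symbol, "")
    text = text.strip()
    if not text:
        return None
    # "1.234,56" (EU style) -> "1234.56"; "1,234.56" (UK style) -> "1234.56"
    if "," in text and text.rfind(",") > text.rfind("."):
        text = text.replace(".", "").replace(",", ".")
    else:
        text = text.replace(",", "")
    try:
        return float(text)
    except ValueError:
        return None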

Suitable for: monitoring up to a few hundred products at hourly cadence (or several thousand at slower cadences; see the cycle limits in the Scaling and adapting section). The implementation relies on the static HTML that Amazon serves on product pages.

Not suitable for: products whose prices render only after a JavaScript variant selection (such as some Echo or Kindle pages); for these, use a headless browser with anti-detection features. Nodriver (Chrome-based), Camoufox (a stealth Firefox build), and SeleniumBase (with UC Mode) are common choices. Also not suitable for one-time bulk scrapes of 100,000 or more products in a single run, where rotating residential proxies are typically cheaper. ISP proxies are built for recurring monitoring, rotating residential proxies for bulk extraction; they are different products for different jobs.

This guide does not use the Amazon Product Advertising API. The API requires signing up for the affiliate program, has rate limits that make multi-product monitoring difficult, and returns a smaller set of pricing data than the public product page. For ongoing price monitoring at the scale described here, scraping the public HTML is more practical.

Prerequisites

End-to-end setup takes 30 to 45 minutes once ISP proxies are available.

  • Python 3.10 or newer and a machine that runs continuously. A small rented Linux server (a Virtual Private Server, or VPS) at around $5 per month is enough; alternatives are covered in the Deploy continuously section.

  • An ISP proxy pool. An ISP proxy (sometimes called a static residential proxy) is a fixed IP address, owned by a consumer ISP such as AT&T or Frontier. This rules out two common alternatives: datacenter IPs and rotating residential pools that change addresses every few minutes. We use HypeProxies ISP proxies (around $65 per month for 50 IPs, unlimited bandwidth), but any reputable provider with consumer-ISP IPs works. ISP proxies are strongly recommended because Amazon's anti-bot system classifies IP ranges by ASN. Datacenter ASNs such as AWS, OVH, Hetzner, or DigitalOcean are challenged or blocked on most requests; consumer-ISP ASNs are challenged far less often.

A note on per-IP pricing. Amazon product pages are 200 to 500 KB each, so a 100-product hourly monitor pulls around 25 GB per month (100 products × 24 checks per day × 30 days is 72,000 pages; at roughly 350 KB each, that is about 25 GB). At residential per-GB proxy rates ($7 to $10 per GB), that is $175 to $250 per month in bandwidth alone, before the base subscription. ISP proxies are billed per IP, which keeps the cost flat no matter the traffic volume; any provider with per-IP pricing behaves the same way.

Step 1 – Install dependencies

pip install curl_cffi beautifulsoup4 lxml tenacity loguru pydantic tinydb apprise schedule

Each library handles a specific concern: curl_cffi for TLS impersonation, beautifulsoup4 and lxml for HTML parsing, tenacity for retry logic, loguru for structured logging, pydantic for configuration validation, tinydb for the price-history file, apprise for multi-channel alerts, and schedule for the cron loop.

On the choice of curl_cffi. Amazon checks the TLS handshake fingerprint of incoming requests. The Python requests library produces a recognizable urllib3 fingerprint, and in our small-sample testing it returned CAPTCHA pages on roughly 2 of 3 attempts, even on high-quality proxies. curl_cffi impersonates the full Chrome request signature: TLS handshake, HTTP/2 settings, and default headers. It exposes the same API as requests, so only the import statement changes. Avoid manually setting User-Agent or other headers, since the impersonation handles them and overrides can break the match.
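
A minimal before-and-after of that drop-in swap (the ASIN is one of the examples from products.json below):

# import requests                 # before: urllib3 TLS fingerprint, often challenged
from curl_cffi import requests    # after: Chrome-impersonating client, same API

r = requests.get("https://www.amazon.com/dp/B07MHJFRBJ", impersonate="chrome")
print(r.status_code)  # note: no User-Agent header set manually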

Step 2 – Configure proxies and products

If you do not already have ISP proxies, sign up with a provider and get your credentials. HypeProxies and most other reputable ISP-proxy providers offer them in IP:PORT:USERNAME:PASSWORD format, ready to paste into your configuration.

The HypeProxies dashboard. The "Copy My Proxies" button gives the full list in IP:PORT:USERNAME:PASSWORD format, ready for the PROXIES list in config.py.
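
If you would rather keep that list in a file than paste it into Python, a small helper can turn it into the ProxyConfig objects defined in config.py below. A minimal sketch; proxies.txt is a hypothetical file name.

# load_proxies.py: parse "IP:PORT:USERNAME:PASSWORD" lines into ProxyConfig objects
from config import ProxyConfig  # defined in the config.py shown below

def load_proxies(path="proxies.txt"):
    proxies = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # maxsplit=3 keeps any ":" inside the password intact
            host, port, user, password = line.split(":", 3)
            proxies.append(ProxyConfig(host=host, port=port, user=user, password=password))
    return proxies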

What to look for when choosing a provider:

  • Static IPs, not "rotating residential" sessions that return a different IP every few minutes. Static IPs build reputation with Amazon over time and trigger fewer challenges than rotating IPs.

  • US-based, owned by a consumer ISP. Check the ASN with whois <ip> or a service such as ipinfo.io. Frontier, AT&T, and RCN are consumer-ISP networks; AWS or OVH are datacenter networks, which Amazon will flag.

  • Unlimited bandwidth pricing, billed per IP rather than per GB. Per-GB pricing scales rapidly with traffic, which gets expensive for hourly monitoring.

Before purchasing, run a candidate provider's IPs through a fraud-score checker to confirm they are real consumer-ISP proxies, not datacenter IPs labeled as residential. The HypeProxies proxy checker returns fraud score, ASN, hostname, geolocation, and VPN/Tor detection in one result. IPQualityScore and Whoer are also free options.
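
For a quick manual spot check of a single IP (203.0.113.10 is a placeholder address), the owning organization is visible from the command line; ipinfo.io's /org endpoint returns the ASN and operator name in one line:

whois 203.0.113.10 | grep -iE 'orgname|netname'
curl -s https://ipinfo.io/203.0.113.10/org
# "AS7018 AT&T Services, Inc." is a consumer ISP; "AS16509 Amazon.com, Inc." is a datacenter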

Create config.py. Pydantic validates your configuration at load time, so a mistyped ASIN fails before any proxy requests are made.

# config.py

from urllib.parse import quote
from pydantic import BaseModel, field_validator

class ProxyConfig(BaseModel):
    host: str
    port: str
    user: str
    password: str

    @property
    def url(self):
        # quote() escapes special characters in credentials such as @, :, /, #
        return f"http://{quote(self.user, safe='')}:{quote(self.password, safe='')}@{self.host}:{self.port}"

class ProductConfig(BaseModel):
    asin: str
    name: str
    target_price: float

    @field_validator("asin")
    @classmethod
    def validate_asin(cls, v):
        if len(v) != 10:
            raise ValueError("ASIN must be exactly 10 characters")
        return v

# Replace with your own proxy credentials
PROXIES = [
    ProxyConfig(host="proxy1.example.com", port="8000", user="your_username", password="your_password"),
    ProxyConfig(host="proxy2.example.com", port="8001", user="your_username", password="your_password"),
    ProxyConfig(host="proxy3.example.com", port="8002", user="your_username", password="your_password"),
    ProxyConfig(host="proxy4.example.com", port="8003", user="your_username", password="your_password"),
    ProxyConfig(host="proxy5.example.com", port="8004", user="your_username", password="your_password"),
]

REQUEST_TIMEOUT = 15
MAX_RETRIES = 3
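
To see the load-time validation in action, construct a ProductConfig with a malformed ASIN; pydantic v2 raises immediately (the error text below is approximate):

from config import ProductConfig

ProductConfig(asin="SHORT", name="Test", target_price=9.99)
# pydantic_core.ValidationError: 1 validation error for ProductConfig
# asin
#   Value error, ASIN must be exactly 10 characters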

Create products.json to list the products to monitor:

{
  "products": [
    {"asin": "B07MHJFRBJ", "name": "Bounty Paper Towels", "target_price": 22.00},
    {"asin": "B00NTCH52W", "name": "Amazon Basics AA Batteries", "target_price": 8.99},
    {"asin": "B0CP9YB3Q4", "name": "Stanley Quencher Tumbler", "target_price": 20.00},
    {"asin": "B0BZYCJK89", "name": "Owala FreeSip Water Bottle", "target_price": 25.00}
  ]
}

Step 3 – Verify your proxies

Before writing any scraping logic, make sure the proxies are reachable and that your pool returns different IPs. Most "Amazon is blocking me" reports are caused by a single failing proxy in the pool, not by Amazon itself.

# test_proxy.py

from curl_cffi import requests as curl_requests
from config import PROXIES

for i, proxy in enumerate(PROXIES):
    try:
        r = curl_requests.get(
            "https://httpbin.org/ip",
            proxy=proxy.url,
            timeout=10,
            impersonate="chrome",
        )
        data = r.json()
        print(f"Proxy {i+1}: {data['origin']} – OK ({r.elapsed.total_seconds():.2f}s)")
    except Exception as e:
        print(f"Proxy {i+1}: FAILED {e}")

Run it:

python test_proxy.py

Expected output:

Proxy 1: 203.0.113.10 – OK (1.33s)
Proxy 2: 203.0.113.11 – OK (1.35s)
Proxy 3: 203.0.113.12 – OK (1.30s)
Proxy 4: 203.0.113.13 – OK (1.61s)
Proxy 5: 203.0.113.14 – OK (2.06s)

One line per proxy, all returning different IPs (your IPs will be different from the example). Sub-2-second responses are typical. If any line says FAILED, fix that line before continuing. The usual cause is a typo in the credentials, or a port that the provider has not yet activated.

Step 4 – Build the scraper

The scraper handles TLS impersonation, proxy rotation, selective retry, and price extraction. The implementation is organized in three sections within scraper.py: imports and selectors, the fetcher, and the parsing entry point.

Most Amazon web scraping tutorials parse prices using CSS selectors only (span.a-price .a-offscreen and similar), which may break when Amazon refactors the DOM. The approach used here is more stable: Amazon embeds a JSON pricing blob in a hidden <div> on most product pages. The scraper reads that first and falls back to CSS selectors only when the JSON is missing.

The $24.42 price visible on the product page (left) is the same priceAmount value Amazon embeds in a hidden JSON div (right).

Imports, data model, and selectors

# scraper.py

import json
import re
import random
import time
from datetime import datetime
from itertools import cycle

from curl_cffi import requests as curl_requests
from bs4 import BeautifulSoup
from tenacity import retry, stop_after_attempt, wait_random, retry_if_exception_type
from loguru import logger
from pydantic import BaseModel, Field

from config import PROXIES, REQUEST_TIMEOUT, MAX_RETRIES


class PriceResult(BaseModel):
    asin: str
    title: str
    price: float | None = None
    availability: str = "Unknown"
    rating: str | None = None
    timestamp: datetime = Field(default_factory=datetime.now)


class RetryableError(Exception):
    """Temporary server-side error or anti-bot challenge.

    Raising this signals the retry decorator to try again with the next
    proxy in the pool. Permanent errors raise a plain Exception instead.
    """


# Amazon embeds price data in a hidden div on most product pages, in this format:
#   {"desktop_buybox_group_1": [{"priceAmount": 24.42, "buyingOptionType": "NEW", ...}]}
# This is more reliable than CSS selectors because Amazon's variant-picker UI
# (called "twister" internally, hence the class name) depends on it.
PRICE_JSON_SELECTOR = ".twister-plus-buying-options-price-data"

# Used as fallback when the JSON data above is missing on a given page.
PRICE_SELECTORS = [
    "span.a-price .a-offscreen",
    ".priceToPay .a-offscreen",
    "#corePriceDisplay_desktop_feature_div .a-offscreen",
]


def extract_price_text(tag):
    if tag is None:
        return None
    text = tag.get_text(strip=True)
    if not text:
        return None
    try:
        return float(text.replace("$", "").replace(",", ""))
    except ValueError:
        return None

PriceResult is the validated data model the scraper returns. Pydantic catches malformed data at the parsing boundary, rather than letting it spread three function calls deeper. RetryableError identifies the exceptions tenacity should retry; other exceptions stop the retry loop. PRICE_JSON_SELECTOR targets the embedded JSON blob; PRICE_SELECTORS is the CSS fallback chain.

The fetcher: TLS impersonation, proxy rotation, selective retry

class AmazonPriceScraper:
    def __init__(self):
        self._proxy_pool = cycle(PROXIES)

    def _get_next_proxy(self):
        return next(self._proxy_pool).url

    @retry(
        stop=stop_after_attempt(MAX_RETRIES),
        wait=wait_random(min=3, max=10),
        retry=retry_if_exception_type(RetryableError),
    )
    def fetch_product_page(self, asin):
        url = f"<https://www.amazon.com/dp/{asin}>"
        proxy = self._get_next_proxy()

        response = curl_requests.get(
            url, proxy=proxy, timeout=REQUEST_TIMEOUT, impersonate="chrome",
        )

        # 404 means the product page is gone. Skip it without retrying.
        if response.status_code == 404:
            logger.warning(f"Product {asin} not found (404)")
            return None

        # 429 (rate limited) and 5xx (server errors) are temporary. Retry.
        if response.status_code == 429:
            raise RetryableError(f"Rate limited (429) for {asin}")
        if 500 <= response.status_code < 600:
            raise RetryableError(f"Server error {response.status_code} for {asin}")

        # Other 4xx codes (403 Forbidden, 410 Gone, and so on) are permanent.
        # Raise a plain Exception so the retry loop stops at the first attempt.
        if response.status_code != 200:
            raise Exception(f"Permanent HTTP error {response.status_code} for {asin}")

        # Amazon serves several block-page variants when it detects automation.
        # The "dog page" carries an API support email; the soft challenge shows
        # a captcha validation URL or asks the user to type characters.
        body_lower = response.text.lower()
        if "api-services-support@amazon.com" in response.text:
            raise RetryableError(f"Amazon dog-page CAPTCHA for {asin}")
        if "/errors/validatecaptcha" in body_lower or "type the characters you see" in body_lower:
            raise RetryableError(f"Soft CAPTCHA challenge for {asin}")

        return response.text

What matters here:

ISP proxies reduce but do not eliminate Amazon's anti-bot challenges. Expect occasional CAPTCHAs and 429s even with high-quality consumer-IP proxies; the retry layer below is what makes the monitor reliable across them.

  • impersonate="chrome" is the parameter that matches the TLS fingerprint to actual Chrome traffic. Without this parameter (or using the standard requests library), the failure rate returns to the 2-of-3 CAPTCHA rate measured in Step 1.

  • The retry policy distinguishes failure types. Only RetryableError (5xx responses, 429 rate limits, and CAPTCHA pages) is retried. A 404 returns None immediately. A 4xx response other than 404 or 429 raises a plain Exception that stops the retry loop. Each retry advances the proxy cycle, so a temporary failure on one IP does not keep retrying through the same one. Retrying a 403 Forbidden across three proxies has no benefit.

  • CAPTCHA detection. Amazon serves at least two block-page variants. The "dog page" (Amazon's well-known block page with a photo of one of their office dogs) contains the Amazon API support email; the soft challenge includes either a captcha validation URL or the phrase "type the characters you see". Production code routes each variant to different metrics; for this tutorial, both trigger a retry.
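
As a hedged sketch of that per-variant routing, the detection strings from fetch_product_page can feed a counter instead of a single exception type; classify_block_page is a hypothetical helper, and Counter stands in for a real metrics backend.

from collections import Counter

block_counts = Counter()

def classify_block_page(body: str):
    """Label known Amazon block-page variants, or return None for a normal page."""
    lower = body.lower()
    if "api-services-support@amazon.com" in body:
        return "dog_page"
    if "/errors/validatecaptcha" in lower or "type the characters you see" in lower:
        return "soft_captcha"
    return None

# Inside fetch_product_page, before `return response.text`:
#     variant = classify_block_page(response.text)
#     if variant:
#         block_counts[variant] += 1
#         raise RetryableError(f"{variant} for {asin}")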

Parsing and the entry point

    def parse_price_from_json(self, soup):
        """Read the price from the embedded purchase-options JSON data."""
        wrapper = soup.select_one(PRICE_JSON_SELECTOR)
        if wrapper is None:
            return None
        try:
            data = json.loads(wrapper.get_text())
        except (json.JSONDecodeError, ValueError):
            return None

        offers = data.get("desktop_buybox_group_1", [])
        if not offers:
            return None

        # Prefer a NEW offer when one is listed; otherwise use the first offer.
        for offer in offers:
            if offer.get("buyingOptionType") == "NEW" and "priceAmount" in offer:
                return float(offer["priceAmount"])
        if "priceAmount" in offers[0]:
            return float(offers[0]["priceAmount"])
        return None

    def parse_price(self, soup):
        # Try the JSON data first; it is the most reliable source on modern pages.
        price = self.parse_price_from_json(soup)
        if price is not None:
            return price

        # Fall back to CSS selectors used in the current Amazon DOM.
        for selector in PRICE_SELECTORS:
            price = extract_price_text(soup.select_one(selector))
            if price is not None:
                return price

        # Older Amazon pages split the price into a whole-number span and a
        # fraction span. Combine them as a fallback.
        price_whole = soup.select_one("span.a-price-whole")
        if price_whole:
            whole = price_whole.get_text(strip=True).replace(",", "").rstrip(".")
            frac_tag = soup.select_one("span.a-price-fraction")
            fraction = frac_tag.get_text(strip=True) if frac_tag else "00"
            try:
                return float(f"{whole}.{fraction}")
            except ValueError:
                pass

        # Last resort: scan offscreen text for any visible dollar amount.
        for tag in soup.select(".a-offscreen"):
            text = tag.get_text(strip=True)
            if re.match(r"^\\$[\\d,]+\\.\\d{2}$", text):
                price = extract_price_text(tag)
                if price is not None:
                    return price

        return None

    def parse_product_info(self, html, asin):
        soup = BeautifulSoup(html, "lxml")

        title_tag = soup.select_one("#productTitle")
        title = title_tag.get_text(strip=True) if title_tag else "Unknown"

        price = self.parse_price(soup)

        avail_tag = soup.select_one("#availability span")
        if avail_tag is None:
            oos_tag = soup.select_one("#outOfStockBuyBox_feature_div")
            availability = "Out of Stock" if oos_tag else "Unknown"
        else:
            availability = avail_tag.get_text(strip=True)

        rating = None
        rating_tag = soup.select_one("#acrPopover")
        if rating_tag:
            title_attr = rating_tag.get("title", "")
            if isinstance(title_attr, str):
                rating = title_attr.split(" out")[0]

        return PriceResult(
            asin=asin, title=title, price=price,
            availability=availability, rating=rating,
        )

    def get_price(self, asin):
        # A random delay between requests avoids unnaturally uniform timing,
        # which can look like a script and trigger challenges.
        time.sleep(random.uniform(3, 7))

        try:
            html = self.fetch_product_page(asin)
        except Exception as e:
            logger.error(f"Fetch failed for {asin}: {e}")
            return None

        if html is None:
            return None
        return self.parse_product_info(html, asin)


if __name__ == "__main__":
    import sys
    logger.remove()
    logger.add(sys.stderr, format="{time:YYYY-MM-DD HH:mm:ss} | {level: <8} | {message}")
    logger.add("scraper.log", rotation="10 MB")

    scraper = AmazonPriceScraper()
    result = scraper.get_price("B07MHJFRBJ")

    if result:
        logger.success(f"{result.title[:60]}")
        logger.success(f"Price: ${result.price} | {result.availability} | Rating: {result.rating}")
    else:
        logger.error("Failed to get product info")

parse_price applies strategies in order: embedded JSON first (the stable path), then three CSS-based fallbacks. Standard product pages are typically handled correctly by this chain.

An edge case neither parser strategy handles. Some product pages require the user to select a variant before a price renders. Color-picker electronics and certain Echo or Kindle hardware are common examples. For those products, both the JSON blob and the CSS price are empty, and the scraper returns None. Those products require a headless browser such as Nodriver.
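
For reference, a sketch of what the Nodriver path might look like; the calls follow nodriver's documented start/get/select pattern, but treat the details, and the reuse of this tutorial's CSS selector, as assumptions to verify against the library's docs.

# nodriver_price.py: sketch only, not part of the monitor
import nodriver as uc

async def get_rendered_price(asin: str):
    browser = await uc.start()
    page = await browser.get(f"https://www.amazon.com/dp/{asin}")
    try:
        # Wait for the price element that only appears after client-side rendering.
        price_el = await page.select("span.a-price .a-offscreen", timeout=15)
        text = price_el.text
    except Exception:
        text = None
    browser.stop()
    return text

if __name__ == "__main__":
    print(uc.loop().run_until_complete(get_rendered_price("B07MHJFRBJ")))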

get_price adds a 3 to 7 second human-pacing delay before each fetch. The randomization is intentional: a uniform interval is itself a fingerprint, so a random window within a range stops the request pattern from being predictable.

Run it to confirm the scraper works:

python scraper.py

Expected output:

2026-04-25 10:32:14 | SUCCESS  | Bounty Quick Size Paper Towels, White, 8 Family Rolls
2026-04-25 10:32:14 | SUCCESS  | Price: $24.42 | In Stock | Rating: 4.8

If you see a price and a title, the scraper is working correctly.

Step 5 – Build the monitor

The monitor loops over your products on a schedule, stores price history in TinyDB, and triggers an alert when a price hits your target.

# monitor.py

import json
import time
import schedule
from loguru import logger
from pydantic import TypeAdapter
from tinydb import TinyDB, Query
from scraper import AmazonPriceScraper
from config import ProductConfig
from alerts import send_alert


ProductList = TypeAdapter(list[ProductConfig])

def load_products(path="products.json"):
    with open(path) as f:
        data = json.load(f)
    return ProductList.validate_python(data["products"])


class PriceMonitor:
    def __init__(self, products_file="products.json"):
        self.products = load_products(products_file)
        self.scraper = AmazonPriceScraper()
        self.db = TinyDB("price_history.json")
        logger.info(f"Monitoring {len(self.products)} products")

    def check_prices(self):
        logger.info("Starting price check cycle...")
        P = Query()

        for product in self.products:
            result = self.scraper.get_price(product.asin)

            if result and result.price is not None:
                self.db.insert(result.model_dump(mode="json"))

                history = self.db.search(P.asin == product.asin)
                prices = [r["price"] for r in history if r["price"] is not None]

                current = result.price
                lowest = min(prices)
                highest = max(prices)

                if current <= product.target_price:
                    logger.success(f"TARGET HIT! {product.name} – ${current:.2f} (target: ${product.target_price:.2f})")
                    send_alert(result, product)
                else:
                    logger.info(f"{product.name} – ${current:.2f} (low: ${lowest:.2f}, high: ${highest:.2f})")
            else:
                logger.warning(f"Failed to get price for {product.name} ({product.asin})")

    def run(self, interval_minutes=60):
        logger.info(f"Check interval: every {interval_minutes} minutes")

        schedule.every(interval_minutes).minutes.do(self.check_prices)
        self.check_prices()  # Run first check immediately

        try:
            while True:
                schedule.run_pending()
                time.sleep(1)
        except KeyboardInterrupt:
            logger.info("Monitor stopped by user (Ctrl+C)")


if __name__ == "__main__":
    import sys
    logger.remove()
    logger.add(sys.stderr, format="{time:YYYY-MM-DD HH:mm:ss} | {level: <8} | {message}")
    logger.add("monitor.log", rotation="10 MB", retention="30 days")

    monitor = PriceMonitor()
    monitor.run(interval_minutes=30)

monitor.py imports send_alert from alerts.py, which is created in Step 6. Even though the code ends with a __main__ block that calls monitor.run(...), do not run python monitor.py until Step 6 is complete; without alerts.py, the import fails with ModuleNotFoundError.

Step 6 – Add price-drop alerts

apprise is a Python library that sends notifications to over 100 services through a single URL-based interface. Each notification channel (Slack, Discord, email, Telegram, and others) is configured by adding one URL per channel.

Each channel requires a one-time setup to get the webhook URL:

  • Slack. Go to api.slack.com/apps → Create New App → From Scratch → pick your workspace → Features → Incoming Webhooks → toggle on → Add New Webhook to Workspace → choose a channel and authorize. Convert the resulting https://hooks.slack.com/services/T.../B.../... URL to Apprise format slack://T.../B.../....

  • Discord. Server Settings → Integrations → Webhooks → New Webhook, then format as discord://webhook_id/webhook_token.

  • Gmail. Enable 2FA, generate an app password, and use that (not your standard account password) in the mailto:// URL.

Full list of services and URL formats: the Apprise wiki (github.com/caronc/apprise/wiki).

# alerts.py

import apprise
from loguru import logger
from scraper import PriceResult
from config import ProductConfig

notifier = apprise.Apprise()

# Configure at least one channel. This example uses Slack.
# Replace TokenA/TokenB/TokenC with the segments from your real webhook URL
# before running the verification step. Otherwise alerts will silently fail.
notifier.add("slack://TokenA/TokenB/TokenC")

# Other channels you can add (uncomment as needed):
# notifier.add("discord://webhook_id/webhook_token")
# notifier.add("tgram://bot_token/chat_id")
# notifier.add("mailto://your_user:app_password@gmail.com?to=you@gmail.com")
# See https://github.com/caronc/apprise/wiki for the full list of services.


def send_alert(result: PriceResult, product: ProductConfig):
    title = f"Price Drop: {result.title[:50]}"
    body = (
        f"Product: {result.title}\n"
        f"Current Price: ${result.price:.2f}\n"
        f"Target Price: ${product.target_price:.2f}\n"
        f"You Save: ${product.target_price - result.price:.2f}\n"
        f"\nhttps://www.amazon.com/dp/{result.asin}\n"
    )

    if len(notifier) > 0:
        if notifier.notify(title=title, body=body):
            logger.success(f"Alert sent for {result.asin} – ${result.price:.2f}")
        else:
            logger.error(f"Alert delivery failed for {result.asin} – ${result.price:.2f}")
    else:
        logger.warning(f"No notification services configured! Price alert: {title}")

When the monitor sends an alert, it appears in the configured channel:

A live alert captured during testing: the Stanley Quencher Tumbler dropped from $34.99 to $25.00, a savings of $9.99 (28.55%).

Verify the alert channel before relying on it

Here is how this can fail: a price drop happens, the monitor logs Alert sent, but nothing arrives in Slack because the URL was malformed. The problem is only discovered days later. This is a common silent-failure pattern: logs that report "success" while data is dropped without alarms. A 60-second verification step prevents this:

python -c "from alerts import notifier; notifier.notify(title='Amazon monitor – wiring test', body='If you see this, alerts are wired up correctly.')"
python -c "from alerts import notifier; notifier.notify(title='Amazon monitor – wiring test', body='If you see this, alerts are wired up correctly.')"
python -c "from alerts import notifier; notifier.notify(title='Amazon monitor – wiring test', body='If you see this, alerts are wired up correctly.')"

If the test message appears in your channel, the configuration is correct. If not, run apprise -vv -t 'test' -b 'body' 'your-url-here'; the verbose flag shows which part of the URL the library is rejecting.

Step 7 – Run the monitor

python monitor.py

Expected output on the first cycle (no targets reached yet):

2026-04-25 10:45:01 | INFO     | Monitoring 4 products
2026-04-25 10:45:01 | INFO     | Check interval: every 30 minutes
2026-04-25 10:45:01 | INFO     | Starting price check cycle...
2026-04-25 10:45:08 | INFO     | Bounty Paper Towels $24.42 (low: $24.42, high: $24.42)
2026-04-25 10:45:14 | INFO     | Amazon Basics AA Batteries $9.99 (low: $9.99, high: $9.99)
2026-04-25 10:45:21 | INFO     | Stanley Quencher Tumbler $34.99 (low: $34.99, high: $34.99)
2026-04-25 10:45:28 | INFO     | Owala FreeSip Water Bottle $29.99 (low: $29.99, high: $29.99)

The low and high values are the same on this first cycle because no earlier readings exist; later cycles will show the price history as it builds up.

When a price drops to or below your target, two extra log entries appear and a notification is sent to your configured channel:

2026-04-25 14:15:21 | SUCCESS  | TARGET HIT! Stanley Quencher Tumbler $19.99 (target: $20.00)
2026-04-25 14:15:21 | SUCCESS  | Alert sent for B0CP9YB3Q4 $19.99

The monitor runs an immediate first check, then every 30 minutes. Price history builds up in price_history.json; logs are written to monitor.log. Press Ctrl+C to stop.

Troubleshooting

Getting CAPTCHAs? Increase the request delay. The 3 to 7 second delay is the minimum; reducing it uses up proxy reputation faster. If CAPTCHAs appear on the first request, your proxies might be datacenter IPs labeled as residential. Diagnostic checks:

| Test | Real residential | Datacenter or fake |
| --- | --- | --- |
| Fraud-score API | Low score | High score |
| Reverse DNS | ISP hostname (Comcast, Verizon, AT&T) | Hosting provider (AWS, OVH, DigitalOcean) |
| Geolocation consistency | Stays in one region | Jumps between regions |

Several free tools do these checks. The HypeProxies proxy checker returns fraud score, ASN, hostname, geolocation, and VPN/Tor detection in one result. IPQualityScore and Whoer are also common options.

Prices not parsing? Variant products (iPads, phones with size/color pickers) sometimes load the price through JavaScript rather than embedding it in the static HTML, which means both the JSON blob and the CSS fallback chain return None. Either skip those ASINs or use a headless browser such as Nodriver for that subset. The edge case is explained in detail in Step 4 – Build the scraper.
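
If you do need those variant pages, a minimal Nodriver sketch along these lines can fetch the rendered HTML and hand it to the existing parser. The calls follow nodriver's documented basics; the 3-second render wait is an assumption to tune for your pages:

import asyncio
import nodriver as uc

async def fetch_rendered_html(asin: str) -> str:
    # Launches a real Chrome; nodriver patches common automation fingerprints
    browser = await uc.start()
    page = await browser.get(f"https://www.amazon.com/dp/{asin}")
    await asyncio.sleep(3)  # allow client-side price rendering to finish
    html = await page.get_content()
    browser.stop()
    return html

# Run with: uc.loop().run_until_complete(fetch_rendered_html("B0CP9YB3Q4"))

The returned HTML can then go through the same JSON-blob and CSS-fallback extraction as the static pages.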

All proxies timing out? First, make sure the proxy account is active and not suspended for billing or usage limits. If the account is active, check https://www.amazon.com/robots.txt directly through one proxy with curl_cffi. If that times out, the proxy provider has a routing issue. If the request works, your firewall is likely blocking outbound traffic on the proxy port.
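
A quick way to run that check from Python (a sketch; the proxy URL is a placeholder for one entry from your pool):

from curl_cffi import requests

proxy = "http://user:pass@proxy-host:port"  # placeholder, use a real pool entry
resp = requests.get(
    "https://www.amazon.com/robots.txt",
    impersonate="chrome",
    proxies={"http": proxy, "https": proxy},
    timeout=15,
)
print(resp.status_code)  # 200 means the proxy is routing traffic correctly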

Monitor stops finding new prices over time? Amazon updates the page DOM from time to time. If both the JSON parser and the CSS fallback chain return None, look at a fresh page in DevTools. Either the JSON wrapper class has been renamed (update PRICE_JSON_SELECTOR) or the CSS classes have changed (add a new entry to PRICE_SELECTORS).

Still stuck after working through these checks? Post the question on r/webscraping for general scraping and anti-bot help, or check the curl_cffi issue tracker for TLS-fingerprinting bugs. If you are using HypeProxies and the issue looks proxy-related, the HypeProxies Discord is the fastest channel. Share what you are seeing and a snippet of your scraper output.

Deploy continuously

The monitor needs to run continuously, so a personal machine that sleeps or reboots is a poor fit. For most users, a $5 per month VPS (Hetzner, DigitalOcean, or Linode) with systemd is the simplest setup: around 10 minutes of configuration, and the service survives reboots. Free alternatives exist (Oracle Cloud's Always Free ARM instance, or restructuring as a GitHub Actions cron job), but they require more setup time.

systemd unit file

Copy the project to /opt/amazon-monitor on the VPS and create a Python virtual environment at /opt/amazon-monitor/.venv. Then create /etc/systemd/system/amazon-monitor.service:

[Unit]
Description=Amazon Price Monitor
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=monitor
WorkingDirectory=/opt/amazon-monitor
ExecStart=/opt/amazon-monitor/.venv/bin/python monitor.py
Restart=on-failure
RestartSec=30

[Install]
WantedBy=multi-user.target

Enable and start it:

sudo systemctl daemon-reload
sudo systemctl enable --now amazon-monitor
sudo journalctl -u amazon-monitor -f   # follow the live log

Docker

For a Docker deployment, write a minimal Dockerfile that installs the dependencies and runs python monitor.py, along the lines of the sketch below.
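
A sketch (the requirements.txt name is an assumption based on this guide's setup; pin versions there for reproducible builds):

# Dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "monitor.py"]

Mount price_history.json as a volume so the data stays across container restarts: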

docker run -d --name amazon-monitor \
  -v $(pwd)/price_history.json:/app/price_history.json \
  amazon-monitor:latest

GitHub Actions (source repository)

A GitHub Actions cron variant of this implementation is at github.com/triposat/amazon-price-monitor. Fork the repository and add your PROXIES and APPRISE_URLS as repository secrets. The workflow then runs on a 30-minute schedule.

A note on design differences and alert behavior: this guide uses a target-price alert model (one alert per cycle while the price stays at or below the configured target_price), while the source repository uses a drop-based model with a 6-hour cooldown (one alert per product per new 24-hour low). The simpler model here means a product that drops to its target will trigger a notification on every following cycle until the price recovers. If you do not want repeated alerts on your channel, either add cooldown logic to monitor.py (a sketch follows) or use the source repository implementation; choose whichever model matches your use case.
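
A minimal cooldown guard could look like this (should_alert and the in-memory dict are illustrative names, not from either codebase):

from datetime import datetime, timedelta

ALERT_COOLDOWN = timedelta(hours=6)
_last_alert: dict[str, datetime] = {}  # ASIN -> time of last alert

def should_alert(asin: str) -> bool:
    """Return True at most once per ASIN per cooldown window."""
    now = datetime.now()
    last = _last_alert.get(asin)
    if last is not None and now - last < ALERT_COOLDOWN:
        return False
    _last_alert[asin] = now
    return True

Call it before send_alert in the monitoring loop. Note the dict resets on restart; persist it alongside the price history if restarts are frequent.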

Scaling and adapting

You have a working monitor. The sequential ceiling is set by per-fetch latency:

Sequential cycle limits. Each product fetch takes around 10 seconds: 3 to 7 seconds of pacing delay, plus 3 to 4 seconds of network round-trip. The resulting throughput limits:

| Check cadence | Maximum products (sequential) |
| --- | --- |
| Hourly | around 300 |
| Every 2 hours | around 600 |
| Every 6 hours | around 2,000 |

To exceed 300 products at hourly cadence, refactor the fetch loop with curl_cffi.AsyncSession and asyncio.gather().
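
A sketch of that refactor, reusing the ASIN list and proxy pool from earlier steps (fetch_one and check_all are illustrative names):

import asyncio
import random
from curl_cffi.requests import AsyncSession

CONCURRENCY = 10  # keep well below the size of the proxy pool

async def fetch_one(session, semaphore, asin, proxy):
    async with semaphore:
        await asyncio.sleep(random.uniform(3, 7))  # preserve pacing under concurrency
        resp = await session.get(
            f"https://www.amazon.com/dp/{asin}",
            impersonate="chrome",
            proxies={"http": proxy, "https": proxy},
            timeout=30,
        )
        return asin, resp

async def check_all(asins, proxies):
    semaphore = asyncio.Semaphore(CONCURRENCY)
    async with AsyncSession() as session:
        tasks = [
            fetch_one(session, semaphore, asin, proxies[i % len(proxies)])
            for i, asin in enumerate(asins)
        ]
        # return_exceptions=True keeps one failed fetch from aborting the cycle
        return await asyncio.gather(*tasks, return_exceptions=True)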

Common next steps:

  • Scale up. With the async upgrade, around 50 ISP proxies running in parallel handle around 1,200 products at hourly cadence (the Pro entry-tier plan size at HypeProxies). Add IPs as the catalog grows. As cycles get closer to the interval ceiling, add a file lock (such as flock on Linux) to prevent overlapping cycles from writing to price_history.json concurrently; TinyDB is not designed for concurrent writes.

  • Tier your cadence. schedule.every(15).minutes.do(check_high_priority) for flash deals, schedule.every(6).hours.do(check_low_priority) for wishlist items.

  • Adapt to other eCommerce targets. The same stack works against most sites that serve static HTML pricing, such as Walmart, Target, Best Buy, eBay, and Costco. The site-specific changes are the item ID scheme (each site's equivalent of Amazon's ASIN) and the parser selectors. The TLS, proxy, and retry layers stay the same.

  • Adapt the stack for JavaScript-rendered prices. For variant pickers, coupons, or other prices that load through JavaScript, replace curl_cffi with a headless browser. The rest of the stack stays the same.

  • Adapt the database for larger catalogs. For 5,000 or more products at sub-hourly cadence, replace TinyDB with PostgreSQL and a queue.

  • Add a retention policy. Without pruning, price_history.json grows without limit. The source repository implementation prunes readings older than 30 days; add similar logic to monitor.py for long-running deployments (a pruning sketch follows this list).
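
A pruning sketch with TinyDB, assuming each reading carries an ISO-8601 timestamp field (the field name is an assumption based on this guide's history file):

from datetime import datetime, timedelta
from tinydb import TinyDB, Query

def prune_history(path: str = "price_history.json", days: int = 30) -> int:
    """Remove readings older than the retention window; returns the count removed."""
    cutoff = (datetime.now() - timedelta(days=days)).isoformat()
    Reading = Query()
    # ISO-8601 strings sort lexicographically, so string comparison works here
    removed = TinyDB(path).remove(Reading.timestamp < cutoff)
    return len(removed)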

If you do not have an ISP proxy pool yet, HypeProxies plans start at $65 per month for 50 IPs with unlimited bandwidth. This is enough for around 300 products sequentially at hourly cadence, or around 1,200 with the async upgrade.

Bottom line

Reliable Amazon price monitoring rests on two foundations: matching a browser's TLS fingerprint, and using IPs with consumer-ISP reputation. With both in place, the codebase stays small. The work that compounds over time is in the alert thresholds: tune them so notifications stay signal, not noise.

