- Add helper to detect savings/combo/bundle containers
- Prioritize JSON-LD data as primary price source (most reliable)
- Skip price elements inside savings containers
- Add minimum price threshold to filter out discount amounts
Fixes issue where $189.99 bundle savings was extracted instead of
actual $675.59 product price for items like AMD Ryzen 9 9950X3D.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add AI extraction service supporting Anthropic (Claude) and OpenAI
- Add AI settings UI in Settings page with provider selection
- Add database migration for AI settings columns
- Integrate AI fallback into scraper when standard methods fail
- Add API endpoints for AI settings and test extraction
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Now extracts availability/stock status from JSON-LD structured data
(e.g., "https://schema.org/OutOfStock"). This fixes stock detection
for sites like Ubiquiti Store that provide availability in JSON-LD.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Some sites (like Ubiquiti Store) use the priceSpecification nested
format in their JSON-LD structured data instead of direct price
property. Now checks offer.priceSpecification.price as fallback.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
B&H Photo Video uses aggressive Cloudflare protection that blocks
headless browsers even with stealth plugins. Removing the site-specific
scraper for now. The Puppeteer fallback remains in place for other
sites with less aggressive protection.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The headless browser was being detected by Cloudflare and stuck on the
"Just a moment..." challenge page. Added puppeteer-extra with the stealth
plugin which patches browser fingerprinting to avoid bot detection. Also
added logic to wait for Cloudflare challenges to complete.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When HTTP requests are blocked with 403 (e.g., B&H Photo's Cloudflare
protection), the scraper now automatically retries using a headless
Chrome browser via Puppeteer. Also updated Dockerfile to include
Chromium dependencies for container deployment.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Parse JSON-LD structured data for price, name, image, availability
- Add fallback HTML selectors using data-selenium attributes
- Detect stock status from add-to-cart and notify buttons
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Parse Walmart's __NEXT_DATA__ JSON for accurate product data
- Extract price, name, image, and availability from embedded JSON
- Add fallback HTML selectors if JSON parsing fails
- Make stock status detection more conservative
- Avoid false "out of stock" from unrelated page text
- Only mark out of stock when explicitly indicated
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add multiple price selectors for robustness
- Combine dollar/cents from price-current element
- Add JSON-LD fallback for price extraction
- Add explicit stock status detection for Newegg
- Prevents false out-of-stock detection from generic detector
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add stock_status column to products table (in_stock/out_of_stock/unknown)
- Detect out-of-stock status on Amazon by checking:
- #availability text for "currently unavailable"
- #outOfStock element presence
- Missing "Add to Cart" button
- Add generic stock status detection for other sites
- Allow adding out-of-stock products (they just won't have a price)
- Update background scheduler to track stock status changes
- Display stock status badge in product list and detail pages
- Dim out-of-stock products in the dashboard
- Show "Currently Unavailable" badge instead of price when out of stock
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add detection for coupon/savings containers and skip prices within them
- Check parent elements for coupon-related IDs, classes, and text
- Add minimum price threshold of $2 (coupons are typically $1-5)
- Add fallback to parse Amazon's whole/fraction price format directly
- Increase findMostLikelyPrice threshold from $0.99 to $5
This fixes the issue where $1 coupon savings were being scraped
instead of the actual $25.99 product price.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix cheerio import to use named exports
- Add proper interfaces for JSON-LD data
- Fix type annotations for CheerioAPI
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>