The most powerful Firefox extension for web scraping. Regex-based data extraction, interactive graph visualization, batch queuing, and one-click HuggingFace upload.
From simple text extraction to AI-powered structured data mining, WebScraper Pro handles it all.
Multi-strategy extraction from SPAs, shadow DOM, web components, JSON-LD, microdata, and dynamically-loaded content.
Extract video sources, embeds, posters, tracks, and subtitles. YouTube filtering toggle, audio capture, and image download.
NuExtract-2.0-2B for structured data. 9 built-in templates, custom JSON schemas, local regex fallback, batch processing.
Take screenshots, run OCR with Tesseract, and advance to the next page automatically. Works on KDE, Hyprland, GNOME, and Cinnamon desktops.
Interactive force-directed graph of all scraped domains with live physics, unique composite edge patterns, favicons, and SSDg diagrams.
One-click upload with automatic JSONL sharding, MLA/APA citations, version-aware README, and community dataset support.
Queue multiple URLs for background scraping. Auto-scan crawls pages with configurable rate limiting and domain filtering.
XSS sanitization, robots.txt compliance, PII/API key/slur filtering with configurable redaction modes.
Full CLI for data management, AI model serving, Parquet export, screenshot OCR, HuggingFace ops, and system diagnostics.
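To make "configurable redaction modes" concrete, here is a minimal Python sketch of regex-based redaction. It is not the extension's actual code: the two patterns, the mode names (mask, remove, partial), and the redact function are all assumptions chosen for illustration.

```python
import re

# Hypothetical patterns for illustration; the extension's real rules are not documented here.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "api_key": re.compile(r"\b(?:sk|hf)_[A-Za-z0-9]{16,}\b"),
}

# Hypothetical mode names, assumed for this sketch.
MODES = ("mask", "remove", "partial")


def redact(text: str, mode: str = "mask") -> str:
    """Redact every pattern match in text according to mode.

    mask    -> replace the match with a label such as [EMAIL]
    remove  -> delete the match entirely
    partial -> keep the first two characters, star out the rest
    """
    if mode not in MODES:
        raise ValueError(f"unknown redaction mode: {mode!r}")

    def make_replacer(label: str):
        def replace(match: re.Match) -> str:
            value = match.group(0)
            if mode == "remove":
                return ""
            if mode == "partial":
                return value[:2] + "*" * (len(value) - 2)
            return f"[{label.upper()}]"
        return replace

    for label, pattern in PATTERNS.items():
        text = pattern.sub(make_replacer(label), text)
    return text
```

Calling redact("mail bob@example.com now") returns "mail [EMAIL] now"; passing mode="partial" keeps the leading "bo" and stars out the remainder of the address.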
Install the extension and start scraping in four easy steps.
Open about:addons in Firefox, click the gear icon, and choose "Install Add-on From File..."
Set up your HuggingFace token, data format, and scraping preferences in Settings.
Click the popup, select a mode (Full Page, Auto-Scan, Queue), and start collecting data.
Your data, your format. Export to whatever works best for your workflow.
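For readers curious what the JSONL sharding mentioned under HuggingFace upload could look like, here is a minimal Python sketch. It is an assumption about the general technique, not the extension's implementation: the shard_jsonl function and its max_bytes parameter are hypothetical names.

```python
import json
from typing import Iterable, Iterator


def shard_jsonl(records: Iterable[dict], max_bytes: int) -> Iterator[str]:
    """Yield JSONL shard bodies of at most max_bytes (UTF-8 encoded).

    A single record larger than max_bytes still gets a shard of its own,
    so no record is ever split or dropped.
    """
    lines: list[str] = []
    size = 0
    for record in records:
        line = json.dumps(record, ensure_ascii=False) + "\n"
        line_size = len(line.encode("utf-8"))
        # Close the current shard before it would exceed the limit.
        if lines and size + line_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(line)
        size += line_size
    if lines:
        yield "".join(lines)
```

Each yielded string can be written to its own file; concatenating the shards in order reproduces the original record stream.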
58+ commands for data management, AI serving, exporting, and system diagnostics.
# Install the CLI
python install.py

# Export data in various formats
scrape export jsonl
scrape export parquet --compression zstd

# AI extraction server
scrape ai.serve --gpu
scrape ai.setup

# Screenshot OCR (Linux)
scrape ai.screenshot --pages 5

# HuggingFace operations
scrape hf.push
scrape hf.status

# System info
scrape env
scrape doctor
Fast access to all scraping modes without opening the popup.