Overview
This guide covers advanced configuration options for ScrapeGraphAI, including:
- Proxy rotation and authentication
- Custom headers and user agents
- Timeout and retry settings
- Browser configuration
- Authentication and sessions
- Performance optimization
Proxy Configuration
Basic Proxy
Use a proxy server for scraping:
This example is from: examples/extras/proxy_rotation.py
Authenticated Proxy
Use a proxy with a username and password:
Rotating Proxies
Rotate between multiple proxies:
Proxy Services Integration
- Bright Data
- Oxylabs
- SmartProxy
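As a minimal sketch of the proxy options in this section, the snippet below builds a basic proxy entry, an authenticated one, and a round-robin rotation over several proxies. It assumes the proxy is passed under loader_kwargs → proxy with server, username, and password keys (the shape Playwright uses for its proxy setting). The hostnames, credentials, and the proxy_config and rotating_configs helpers are hypothetical; check examples/extras/proxy_rotation.py for the exact keys.

```python
from itertools import cycle
from typing import Iterator, Optional


def proxy_config(server: str, username: Optional[str] = None,
                 password: Optional[str] = None) -> dict:
    """Build a config fragment for one proxy (key names are assumptions)."""
    proxy = {"server": server}
    # Credentials are only added when both parts are supplied.
    if username is not None and password is not None:
        proxy["username"] = username
        proxy["password"] = password
    return {"loader_kwargs": {"proxy": proxy}}


def rotating_configs(servers: list[str]) -> Iterator[dict]:
    """Yield one config fragment per request, cycling through the proxies."""
    for server in cycle(servers):
        yield proxy_config(server)


# Placeholder hosts and credentials, for illustration only.
basic = proxy_config("http://proxy.example.com:8080")
authenticated = proxy_config("http://proxy.example.com:8080", "user", "secret")

rotation = rotating_configs([
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
])
first, second, third = next(rotation), next(rotation), next(rotation)
```

Each fragment would be merged into the full graph configuration alongside the LLM settings before a scrape is started. With two proxies in the rotation, the third request reuses the first proxy.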
Browser Configuration
Headless Mode
Run the browser in the background (faster):
Headless mode is 20-30% faster but you can’t see the browser. Use headless: False for debugging.
Browser Type
Choose the browser backend:
- playwright: Default, best compatibility
- undetected_chromedriver: Bypass bot detection
- selenium: Legacy support
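The headless and backend choices above could be captured in a small helper like the one below. This is a sketch under assumptions: headless is taken to be a top-level config key (as in the headless: False hint above), backend is assumed to live under loader_kwargs, and browser_config is a hypothetical helper, not part of the library.

```python
SUPPORTED_BACKENDS = {"playwright", "undetected_chromedriver", "selenium"}


def browser_config(backend: str = "playwright", debug: bool = False) -> dict:
    """Return a config fragment; a debug run shows the browser window."""
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    return {
        "headless": not debug,                  # headless: False for debugging
        "loader_kwargs": {"backend": backend},  # assumed key location
    }


production = browser_config()
debugging = browser_config("undetected_chromedriver", debug=True)
```

Validating the backend name up front turns a typo into an immediate error instead of a failure partway through a scrape.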
Undetected ChromeDriver
Bypass bot detection systems:
From: examples/extras/undected_playwright.py
Slow Motion
Add delays between actions:
From: examples/extras/slow_mo.py
Use slow motion to avoid triggering rate limits or to debug scraping issues.
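A slow-motion setting might be expressed as follows. Playwright's own launch option slow_mo is given in milliseconds; this sketch assumes the value is forwarded to the browser through loader_kwargs, and slow_motion_config is a hypothetical helper. See examples/extras/slow_mo.py for the exact form.

```python
def slow_motion_config(delay_seconds: float) -> dict:
    """Build a config fragment that delays each browser action."""
    if delay_seconds < 0:
        raise ValueError("delay must be non-negative")
    # Playwright expects slow_mo in milliseconds, so convert from seconds.
    delay_ms = round(delay_seconds * 1000)
    return {"loader_kwargs": {"slow_mo": delay_ms}}


gentle = slow_motion_config(0.5)
```

Taking the delay in seconds and converting once keeps the unit mistake (seconds passed where milliseconds are expected) out of the calling code.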
Authentication
Session Storage (Cookies)
Use saved browser sessions for authenticated scraping:
From: examples/extras/authenticated_playwright.py
Custom Headers
Add custom HTTP headers:
Timeout Configuration
Page Load Timeout
Set the maximum time to wait for a page to load:
Navigation Timeout
Set the timeout for navigation actions:
External Services Integration
BrowserBase
Use BrowserBase for managed browser automation:
From: examples/extras/browser_base_integration.py
ScrapeDo
Integrate with the ScrapeDo proxy service:
From: examples/extras/scrape_do.py
Performance Optimization
Enable Verbose Mode
See detailed execution logs:
Force Mode
Force scraping even with errors:
Reattempt on Failure
Automatically retry failed scrapes:
HTML Mode
Work with pre-downloaded HTML:
Custom Prompts
Additional Context
Provide extra context to the LLM:
From: examples/extras/custom_prompt.py
Screenshot Configuration
Take screenshots during scraping:
Complete Advanced Example
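One way to assemble an advanced configuration is to keep each concern (LLM, logging, proxy, slow motion) as a small fragment and deep-merge them. The sketch below is illustrative only: the key names (llm, verbose, headless, loader_kwargs, proxy, slow_mo) are assumptions about the config layout, and the model name, API key, and proxy host are placeholders.

```python
from copy import deepcopy


def deep_merge(base: dict, extra: dict) -> dict:
    """Return a new dict with nested dicts in `extra` merged into `base`."""
    merged = deepcopy(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested level
        else:
            merged[key] = deepcopy(value)  # later fragments win on conflicts
    return merged


# Hypothetical fragments; every value here is a placeholder.
fragments = [
    {"llm": {"model": "openai/gpt-4o-mini", "api_key": "YOUR_API_KEY"}},
    {"verbose": True, "headless": True},
    {"loader_kwargs": {"proxy": {"server": "http://proxy.example.com:8080"}}},
    {"loader_kwargs": {"slow_mo": 500}},
]

graph_config: dict = {}
for fragment in fragments:
    graph_config = deep_merge(graph_config, fragment)
```

A plain dict.update would let the second loader_kwargs fragment overwrite the first; the recursive merge keeps both the proxy and the slow-motion settings.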
Configuration Reference
Complete list of available options:
Best Practices
Rotate Proxies
Use proxy rotation to avoid IP bans and rate limits.
Implement Retries
Always use retry logic with exponential backoff for production.
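Retry logic with exponential backoff can be sketched in plain Python as below. The retry_with_backoff helper and its parameters are hypothetical and independent of any built-in retry option; the task argument stands in for whatever call starts a scrape.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(task: Callable[[], T], max_attempts: int = 4,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `task` until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter is injectable so the backoff schedule can be tested without real waiting. In production it is also common to add random jitter to each delay so that many workers do not retry at the same instant.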
Use Slow Motion
Add delays to avoid triggering anti-bot systems.
Monitor Logs
Enable verbose mode during development to debug issues.
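To keep verbose logging on during development and off elsewhere, the flag could be derived from an environment variable. This is a sketch: verbose is assumed to be a top-level config key, and both the SCRAPER_ENV variable and the logging_flags helper are hypothetical names.

```python
import os
from typing import Mapping, Optional


def logging_flags(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a config fragment with verbose enabled only in development."""
    env = os.environ if env is None else env
    is_dev = env.get("SCRAPER_ENV", "production") == "development"
    return {"verbose": is_dev}


dev = logging_flags({"SCRAPER_ENV": "development"})
prod = logging_flags({})
```

Defaulting to production when the variable is unset means detailed logs are never enabled by accident.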
Next Steps
LLM Providers
Learn about all supported LLM providers
OpenAI Setup
Configure OpenAI models
