Installation
Get ScrapeGraphAI up and running in your Python environment. This guide covers installation with pip, virtual environment setup, and post-installation configuration.
Requirements
Before installing ScrapeGraphAI, ensure you have:
- Python 3.10 or higher (up to Python 3.12)
- pip package manager
- Internet connection for downloading dependencies
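The version requirement above can be checked before installing anything. The following is a minimal sketch using only the standard library; the helper name python_version_supported is hypothetical, and the bounds are copied from the requirement list rather than read from the package metadata.

```python
import sys

# Supported range as stated in the requirements above.
MIN_VERSION = (3, 10)
MAX_VERSION = (3, 12)


def python_version_supported(version_info=None):
    """Return True if the major.minor version lies within the supported range."""
    info = sys.version_info if version_info is None else version_info
    major_minor = (info[0], info[1])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_version_supported() else "outside the documented range"
    print(f"Python {current} is {status}")
```

Only the major and minor numbers are compared, so any 3.12.x patch release counts as supported.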
It is strongly recommended to install ScrapeGraphAI in a virtual environment to avoid conflicts with other libraries.
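To confirm that the interpreter you are about to install into is actually isolated, a small check like the one below may help. It is a sketch with hypothetical helper names; it relies on sys.prefix differing from sys.base_prefix inside a venv, and on the CONDA_PREFIX variable that conda sets when an environment (including base) is activated.

```python
import os
import sys


def in_venv():
    """Return True when running inside a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def in_conda_env():
    """Return True when a conda environment has been activated."""
    # CONDA_PREFIX is exported by "conda activate"; it is also set for
    # the base environment, so this does not prove isolation by itself.
    return bool(os.environ.get("CONDA_PREFIX"))


if __name__ == "__main__":
    print(f"venv active:  {in_venv()}")
    print(f"conda active: {in_conda_env()}")
```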
Installation Steps
Create a Virtual Environment (Recommended)
Create an isolated Python environment for your project, for example with Python's built-in venv module. Alternatively, use conda if you prefer.
Install ScrapeGraphAI
Install the library using pip by running pip install scrapegraphai. This installs ScrapeGraphAI along with its core dependencies, including:
- langchain and related packages
- beautifulsoup4 for HTML parsing
- playwright for browser automation
- pydantic for data validation
- Other required dependencies
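After installation, you can check whether the packages listed above resolve in the current environment. This is a rough sketch, not part of ScrapeGraphAI: missing_packages is a hypothetical helper, and the mapping assumes the usual import names (beautifulsoup4 is imported as bs4).

```python
from importlib.util import find_spec

# Package name -> top-level import name. beautifulsoup4 is imported
# as "bs4"; the other import names are assumed to match the package names.
CORE_MODULES = {
    "scrapegraphai": "scrapegraphai",
    "langchain": "langchain",
    "beautifulsoup4": "bs4",
    "playwright": "playwright",
    "pydantic": "pydantic",
}


def missing_packages(modules=CORE_MODULES):
    """Return the package names whose top-level module cannot be found."""
    return [pkg for pkg, module in modules.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All core packages were found")
```

find_spec only locates the modules without importing them, so the check stays fast and has no side effects.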
Install Playwright Browsers
This step is critical for fetching website content. Install the Playwright browser binaries by running playwright install. This downloads the Chromium, Firefox, and WebKit browsers needed for scraping dynamic websites.
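A quick way to confirm that the browser binaries are usable is to launch one headlessly. The sketch below uses Playwright's synchronous API and renders an inline page, so it needs no network access. The function name playwright_smoke_test is hypothetical, and only Chromium is tested here.

```python
def playwright_smoke_test():
    """Try to launch headless Chromium; return (ok, message)."""
    try:
        from playwright.sync_api import sync_playwright
    except ImportError:
        return False, "playwright is not installed"

    try:
        with sync_playwright() as p:
            browser = p.chromium.launch(headless=True)
            page = browser.new_page()
            # Render a tiny inline document so no network access is needed.
            page.set_content("<title>ok</title>")
            title = page.title()
            browser.close()
    except Exception as exc:  # e.g. browser binaries not downloaded yet
        return False, f"could not launch Chromium: {exc}"
    return True, f"Chromium launched, page title: {title}"


if __name__ == "__main__":
    ok, message = playwright_smoke_test()
    print("OK" if ok else "FAILED", "-", message)
```

If the check fails with a message about a missing executable, the browser download step has most likely not been run in this environment.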
Optional Dependencies
ScrapeGraphAI offers optional features that require additional packages:
Burr Integration
For advanced workflow visualization and debugging:
NVIDIA AI Integration
For using NVIDIA AI endpoints:
OCR Support
For extracting text from images and PDFs:
LLM Provider Setup
ScrapeGraphAI works with various LLM providers. You’ll need to set up at least one:
OpenAI
- Get an API key from OpenAI Platform
- Set it as an environment variable named OPENAI_API_KEY, either exported in your shell or stored in a .env file
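Once the key is in the environment, your code can read it instead of hard-coding it. The sketch below shows one way to do that. The helper build_openai_config is hypothetical, and both the dictionary layout ("llm", "api_key", "model") and the model name are assumptions about the configuration shape, not confirmed by this page.

```python
import os


def build_openai_config(model="openai/gpt-4o-mini"):
    """Build an LLM config dict from the OPENAI_API_KEY environment variable."""
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    # Assumed layout and model name; check the quick start guide for
    # the exact keys your ScrapeGraphAI version expects.
    return {"llm": {"api_key": api_key, "model": model}}


if __name__ == "__main__":
    try:
        config = build_openai_config()
        print("Config ready for model:", config["llm"]["model"])
    except RuntimeError as exc:
        print("Configuration problem:", exc)
```

Failing early with a clear error is usually easier to debug than an authentication failure in the middle of a scraping run.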
Ollama (Local Models)
- Install Ollama from ollama.com
- Download a model with ollama pull followed by the model name
- Ensure Ollama is running (ollama serve starts the local server)
Ollama runs locally and doesn’t require an API key, making it great for development and privacy-sensitive applications.
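To check from Python that the local Ollama server is reachable and which models it has downloaded, something like the following may be useful. It is a sketch that assumes Ollama's default address (http://localhost:11434) and its /api/tags endpoint for listing local models. The helper name list_ollama_models is hypothetical.

```python
import json
import urllib.error
import urllib.request


def list_ollama_models(base_url="http://localhost:11434", timeout=2.0):
    """Return names of locally available models, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not running, connection refused, timeout, or invalid JSON.
        return None
    if not isinstance(payload, dict):
        return None
    return [model.get("name") for model in payload.get("models", [])]


if __name__ == "__main__":
    models = list_ollama_models()
    if models is None:
        print("Ollama does not appear to be running")
    else:
        print("Available models:", ", ".join(models) or "(none downloaded)")
```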
Other Providers
Environment Configuration
Using python-dotenv
Install python-dotenv (pip install python-dotenv) to manage environment variables, then create a .env file in your project root that holds settings such as your API keys.
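Loading the .env file at program start might look like the sketch below. It calls load_dotenv from python-dotenv when the package is available. The wrapper load_env_file and its tiny fallback parser are hypothetical additions that keep the example runnable without the package, and the fallback handles only plain KEY=VALUE lines.

```python
import os


def load_env_file(path=".env"):
    """Load KEY=VALUE pairs from a .env file into os.environ."""
    try:
        from dotenv import load_dotenv
    except ImportError:
        load_dotenv = None

    if load_dotenv is not None:
        # override=False keeps variables that are already set in the shell.
        return load_dotenv(dotenv_path=path, override=False)

    # Fallback: a deliberately tiny parser, not a full replacement
    # (no quoting rules, interpolation, or "export" prefixes).
    if not os.path.isfile(path):
        return False
    loaded = False
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip().strip("'\""))
            loaded = True
    return loaded


if __name__ == "__main__":
    print("Loaded .env:", load_env_file())
```

Call this once, before any code reads the API keys, and keep the .env file out of version control.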
Troubleshooting
Import Errors
If you encounter import errors:
Playwright Issues
If Playwright browsers are not found:
Version Conflicts
If you have dependency conflicts:
Telemetry
ScrapeGraphAI collects anonymous usage metrics to improve the library. To opt out, disable telemetry in your environment or in your .env file.
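One way to opt out from within a script is to set the telemetry variable before the library is imported. This sketch assumes the variable is named SCRAPEGRAPHAI_TELEMETRY_ENABLED and that the library reads it at import time. Treat both as assumptions and confirm them against the project's telemetry documentation. The helper name is hypothetical.

```python
import os


def disable_telemetry_via_env():
    """Set the (assumed) opt-out variable for the current process."""
    # Assumed variable name; it should be set before scrapegraphai is
    # imported so the library sees it when it reads its configuration.
    os.environ["SCRAPEGRAPHAI_TELEMETRY_ENABLED"] = "false"


if __name__ == "__main__":
    disable_telemetry_via_env()
    print("SCRAPEGRAPHAI_TELEMETRY_ENABLED =",
          os.environ["SCRAPEGRAPHAI_TELEMETRY_ENABLED"])
```

Setting the same key in a .env file has the same effect, provided the file is loaded before the import.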
Verification Script
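A minimal verification sketch is shown below. It reports which interpreter is in use and whether the scrapegraphai distribution is installed there, using only the standard library. It is an illustrative stand-in, not the project's official script. The function names are hypothetical.

```python
import sys
from importlib import metadata


def installed_version(distribution="scrapegraphai"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def main():
    """Print a short report; return 0 on success and 1 otherwise."""
    print(f"Python executable: {sys.executable}")
    version = installed_version()
    if version is None:
        print("scrapegraphai is NOT installed in this environment")
        return 1
    print(f"scrapegraphai {version} is installed")
    return 0


if __name__ == "__main__":
    sys.exit(main())
```

Printing sys.executable helps catch the common case where the package was installed into a different environment than the one running the script.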
Running a verification script after setup confirms that your installation is complete.
Next Steps
Quick Start
Now that you have ScrapeGraphAI installed, build your first scraper!
