Overview

Ollama allows you to run large language models locally on your machine. This means:
  • Free: No API costs
  • Private: Your data never leaves your computer
  • Fast: No network latency
  • Offline: Works without internet connection
Perfect for development, testing, or privacy-sensitive projects.

Prerequisites

Step 1: Install Ollama

Download and install Ollama from ollama.ai:
# Download from ollama.ai or use Homebrew
brew install ollama
Step 2: Start the Ollama Server

ollama serve
The server will start on http://localhost:11434
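Before running a scrape, it can save time to confirm the server is actually reachable. A minimal health check using only the standard library (the function name is ours; the base URL matches the default above):

```python
import urllib.request
import urllib.error

def ollama_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers at base_url."""
    try:
        # Ollama's root endpoint replies "Ollama is running" with status 200.
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

If this returns False, start the server with ollama serve and try again.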
Step 3: Pull a Model

Download a model (e.g., Llama 3.2):
ollama pull llama3.2
First-time download may take a few minutes depending on model size.
Step 4: Install ScrapeGraphAI

pip install scrapegraphai
playwright install

Basic Configuration

from scrapegraphai.graphs import SmartScraperGraph

graph_config = {
    "llm": {
        "model": "ollama/llama3.2",
        "temperature": 0,
        "model_tokens": 4096,
    },
    "verbose": True,
    "headless": False,
}

smart_scraper_graph = SmartScraperGraph(
    prompt="Find some information about the founders.",
    source="https://scrapegraphai.com/",
    config=graph_config,
)

result = smart_scraper_graph.run()
print(result)
This example is from: examples/smart_scraper_graph/ollama/smart_scraper_ollama.py

Configuration Options

Custom Base URL

If Ollama is running on a different host or port:
graph_config = {
    "llm": {
        "model": "ollama/llama3.2",
        "base_url": "http://192.168.1.100:11434",  # Remote Ollama server
        "temperature": 0,
    },
}

JSON Format Mode

Force JSON output (required for some models):
graph_config = {
    "llm": {
        "model": "ollama/mistral",
        "temperature": 0,
        "format": "json",  # Ollama needs format specified
    },
}
Some Ollama models require "format": "json" for structured output. If you get parsing errors, add this option.
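Even with "format": "json", a local model occasionally wraps its answer in extra prose. A small defensive parser can often salvage the object (a sketch, not part of the ScrapeGraphAI API; the function name is made up here):

```python
import json
import re

def extract_json(text: str):
    """Best-effort extraction of the first JSON object embedded in model output."""
    try:
        return json.loads(text)  # clean case: the whole string is JSON
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span, e.g. when the model adds prose.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        return json.loads(match.group(0))
    raise ValueError("no JSON object found in model output")
```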

Embeddings Configuration

Use local embeddings for better RAG performance:
graph_config = {
    "llm": {
        "model": "ollama/llama3.2",
        "temperature": 0,
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "temperature": 0,
    },
}
# Pull the embedding model
ollama pull nomic-embed-text
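The embedding model turns page chunks and your prompt into vectors, so the most relevant chunks can be retrieved by similarity. A pure-Python cosine-similarity sketch of that idea (the toy 3-dimensional vectors are made up; real nomic-embed-text vectors have hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" for illustration only.
query = [0.1, 0.9, 0.2]
chunk_about_founders = [0.1, 0.8, 0.3]
chunk_about_pricing = [0.9, 0.1, 0.0]

# The founders chunk points in nearly the same direction as the query.
print(cosine_similarity(query, chunk_about_founders) >
      cosine_similarity(query, chunk_about_pricing))  # True
```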

Complete Example

from scrapegraphai.graphs import SmartScraperGraph
from scrapegraphai.utils import prettify_exec_info

graph_config = {
    "llm": {
        "model": "ollama/llama3.2",
        "temperature": 0,
        "model_tokens": 4096,
    },
    "verbose": True,
    "headless": False,
}

smart_scraper_graph = SmartScraperGraph(
    prompt="Find some information about the founders.",
    source="https://scrapegraphai.com/",
    config=graph_config,
)

result = smart_scraper_graph.run()
print(result)

# Get execution info
graph_exec_info = smart_scraper_graph.get_execution_info()
print(prettify_exec_info(graph_exec_info))

Available Models

View all available models:
ollama list
Popular models for scraping:
| Model            | Size | Context | RAM Needed |
|------------------|------|---------|------------|
| llama3.2:1b      | 1B   | 128K    | ~2GB       |
| llama3.2         | 7B   | 128K    | ~8GB       |
| llama3.3:70b    | 70B  | 128K    | ~40GB      |
| mistral          | 7B   | 128K    | ~8GB       |
| gemma2           | 9B   | 128K    | ~6GB       |
| qwen:14b         | 14B  | 32K     | ~10GB      |
| codellama        | 7B   | 16K     | ~8GB       |
| nomic-embed-text | -    | 8K      | ~1GB       |
Pull any model:
ollama pull <model-name>
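The RAM column can be sanity-checked with a back-of-the-envelope estimate: Ollama typically serves models quantized to roughly 4-5 bits per parameter, so the weights alone occupy about params × bits / 8 gigabytes; the figures for smaller models include extra headroom for context and runtime. A sketch (the 4.5-bit constant is our assumption, not an Ollama guarantee):

```python
def weight_size_gb(params_billion: float, bits_per_param: float = 4.5) -> float:
    """Approximate in-memory size of quantized model weights, in GB."""
    return params_billion * bits_per_param / 8

# 70B parameters at ~4.5 bits each -> roughly 39 GB of weights,
# which lines up with the ~40GB figure in the table above.
print(f"70B: ~{weight_size_gb(70):.0f} GB")
```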

Performance Tips

Ollama automatically uses GPU if available. Verify with:
ollama ps
For NVIDIA GPUs, ensure CUDA is installed. For Apple Silicon, Metal is used automatically.
For long documents, increase model_tokens:
"llm": {
    "model": "ollama/llama3.2",
    "model_tokens": 128000,  # Maximum context
}
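ScrapeGraphAI splits long pages internally, but the idea behind model_tokens is easy to see with a naive splitter (a sketch; ~4 characters per token is a rough heuristic, and a real tokenizer would be more accurate):

```python
def chunk_text(text: str, max_tokens: int = 4096, chars_per_token: int = 4) -> list[str]:
    """Split text into pieces that should each fit within max_tokens."""
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

# 50,000 characters at 1024 tokens (~4096 chars) per chunk -> 13 chunks.
pieces = chunk_text("word " * 10000, max_tokens=1024)
print(len(pieces))  # 13
```

A larger model_tokens means fewer chunks per page, so fewer LLM calls per scrape.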
Ollama keeps models in memory for 5 minutes by default. Increase this:
# Set to 1 hour
export OLLAMA_KEEP_ALIVE=1h
ollama serve
For basic scraping, use smaller models:
"model": "ollama/llama3.2:1b"  # Fast and efficient

Troubleshooting

Error: Connection refused to http://localhost:11434
Solution: Ensure Ollama is running:
ollama serve

Error: model 'llama3.2' not found
Solution: Pull the model first:
ollama pull llama3.2

Error: System runs out of RAM
Solution: Use a smaller model:
"model": "ollama/llama3.2:1b"  # Only 2GB RAM

Error: Failed to parse JSON response
Solution: Add the format parameter:
"llm": {
    "model": "ollama/mistral",
    "format": "json",  # Force JSON output
}

Advantages of Ollama

  • Free: No API costs - run unlimited scraping jobs
  • Private: Your data never leaves your machine
  • Fast: No network latency, especially with GPU acceleration
  • Offline: Works without an internet connection

Next Steps

  • OpenAI: Compare with cloud-based OpenAI models
  • Advanced Config: Learn about proxy rotation and browser settings