Overview

The XAI class provides integration with xAI’s Grok language models using an OpenAI-compatible API. It wraps LangChain’s ChatOpenAI class with xAI-specific configuration.
Grok is xAI’s conversational AI model, known for its real-time knowledge and unique personality. It’s designed to be helpful, truthful, and maximally curious.

Class Definition

from scrapegraphai.models import XAI

class XAI(ChatOpenAI):
    """
    A wrapper for the ChatOpenAI class (xAI uses an OpenAI-compatible API) that
    provides default configuration and could be extended with additional methods.
    
    Args:
        llm_config (dict): Configuration parameters for the language model.
    """
Source: scrapegraphai/models/xai.py:8

Constructor

XAI(**llm_config)

Parameters

model
string
required
xAI model identifier. Available options:
  • grok-beta: The main Grok model
  • grok-vision-beta: Grok with vision capabilities
Check xAI documentation for the latest model versions.
api_key
string
required
Your xAI API key. Sign up at x.ai to get access.
The api_key parameter is automatically converted to openai_api_key internally for compatibility with the ChatOpenAI interface.
temperature
float
default: 0.7
Controls randomness in responses. Range: 0.0 to 2.0.
  • Lower values (0.0-0.3): More focused and deterministic
  • Medium values (0.4-0.9): Balanced creativity and coherence
  • Higher values (1.0-2.0): More creative and varied
max_tokens
int
Maximum number of tokens to generate in the response.
streaming
bool
default: false
Enable streaming responses for real-time output.
**kwargs
any
Additional parameters supported by LangChain’s ChatOpenAI class, including:
  • top_p: Nucleus sampling parameter
  • frequency_penalty: Reduce repetition
  • presence_penalty: Encourage topic diversity
  • timeout: Request timeout in seconds
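Because XAI forwards its keyword arguments to ChatOpenAI unchanged, extra parameters like these can be placed directly in the llm config. A minimal sketch (the values here are illustrative, not recommendations):

```python
# Extra ChatOpenAI keyword arguments sit alongside the standard keys
# in the same llm config dict; XAI passes them all through.
graph_config = {
    "llm": {
        "model": "grok-beta",
        "api_key": "your-xai-api-key",
        "temperature": 0.7,
        "top_p": 0.9,              # nucleus sampling
        "frequency_penalty": 0.2,  # reduce repetition
        "timeout": 60,             # request timeout in seconds
    }
}

# All keys, standard and extra, travel together to the model constructor.
print(sorted(graph_config["llm"]))
```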

Implementation Details

The XAI class automatically configures the OpenAI base URL to point to xAI’s API:
def __init__(self, **llm_config):
    if "api_key" in llm_config:
        llm_config["openai_api_key"] = llm_config.pop("api_key")
    llm_config["openai_api_base"] = "https://api.x.ai/v1"
    
    super().__init__(**llm_config)
Source: scrapegraphai/models/xai.py:18

This design:
  1. Maps api_key to openai_api_key for consistency
  2. Sets the base URL to https://api.x.ai/v1
  3. Inherits all LangChain ChatOpenAI functionality
  4. Maintains OpenAI-compatible interface
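The config normalization above can be reproduced in isolation. A minimal sketch of the same logic as a standalone function (a hypothetical helper for illustration, not part of the library):

```python
def normalize_xai_config(llm_config: dict) -> dict:
    """Mimic XAI.__init__'s handling: rename api_key and pin the base URL."""
    config = dict(llm_config)  # copy, so the caller's dict is not mutated
    if "api_key" in config:
        config["openai_api_key"] = config.pop("api_key")
    config["openai_api_base"] = "https://api.x.ai/v1"
    return config

normalized = normalize_xai_config({"model": "grok-beta", "api_key": "sk-test"})
print(normalized)
# {'model': 'grok-beta', 'openai_api_key': 'sk-test', 'openai_api_base': 'https://api.x.ai/v1'}
```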

Usage Examples

Basic Usage with SmartScraperGraph

from scrapegraphai.graphs import SmartScraperGraph
from scrapegraphai.models import XAI

graph_config = {
    "llm": {
        "model": "grok-beta",
        "api_key": "your-xai-api-key",
        "temperature": 0.5
    },
    "verbose": True
}

scraper = SmartScraperGraph(
    prompt="Extract all news headlines and their categories",
    source="https://example.com/news",
    config=graph_config
)

result = scraper.run()
print(result)

Direct Model Usage

from scrapegraphai.models import XAI
from langchain_core.messages import HumanMessage

# Initialize the model
llm = XAI(
    model="grok-beta",
    api_key="your-xai-api-key",
    temperature=0.7,
    max_tokens=2000
)

# Use with LangChain
messages = [
    HumanMessage(content="Explain the key principles of web scraping ethics")
]

response = llm.invoke(messages)
print(response.content)

Streaming Responses

from scrapegraphai.models import XAI
from langchain_core.messages import HumanMessage

llm = XAI(
    model="grok-beta",
    api_key="your-xai-api-key",
    streaming=True
)

messages = [HumanMessage(content="Describe modern web scraping techniques")]

print("Grok's response: ", end="")
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
print()

Real-Time Data Extraction

from scrapegraphai.graphs import SmartScraperGraph

# Grok has access to real-time information
graph_config = {
    "llm": {
        "model": "grok-beta",
        "api_key": "your-xai-api-key",
        "temperature": 0.3
    }
}

scraper = SmartScraperGraph(
    prompt="Extract trending topics and provide context about current events",
    source="https://news.example.com",
    config=graph_config
)

result = scraper.run()
print(result)

With Structured Output

from scrapegraphai.graphs import SmartScraperGraph
from pydantic import BaseModel, Field
from typing import List

class NewsArticle(BaseModel):
    headline: str = Field(description="Article headline")
    category: str = Field(description="News category")
    timestamp: str = Field(description="Publication time")
    summary: str = Field(description="Brief summary")

class NewsList(BaseModel):
    articles: List[NewsArticle]

graph_config = {
    "llm": {
        "model": "grok-beta",
        "api_key": "your-xai-api-key",
        "temperature": 0.0  # Deterministic for structured output
    }
}

scraper = SmartScraperGraph(
    prompt="Extract all news articles with metadata",
    source="https://example.com/news",
    config=graph_config,
    schema=NewsList
)

result = scraper.run()
for article in result.articles:
    print(f"Headline: {article.headline}")
    print(f"Category: {article.category}")
    print(f"Time: {article.timestamp}")
    print(f"Summary: {article.summary}")
    print("---")

Multi-Source Aggregation

from scrapegraphai.graphs import SmartScraperGraph
from typing import List, Dict

def aggregate_news(sources: List[str]) -> List[Dict]:
    """Aggregate news from multiple sources using Grok."""
    graph_config = {
        "llm": {
            "model": "grok-beta",
            "api_key": "your-xai-api-key",
            "temperature": 0.4
        }
    }
    
    results = []
    for source in sources:
        scraper = SmartScraperGraph(
            prompt="Extract top stories with context and relevance",
            source=source,
            config=graph_config
        )
        results.append({
            "source": source,
            "data": scraper.run()
        })
    
    return results

sources = [
    "https://techcrunch.com",
    "https://theverge.com",
    "https://arstechnica.com"
]

aggregated = aggregate_news(sources)
for item in aggregated:
    print(f"\nFrom {item['source']}:")
    print(item['data'])

Configuration Best Practices

Temperature Settings by Use Case

# For factual data extraction
config = {
    "llm": {
        "model": "grok-beta",
        "api_key": "your-key",
        "temperature": 0.0  # Maximum precision
    }
}

# For content analysis with insights
config = {
    "llm": {
        "model": "grok-beta",
        "api_key": "your-key",
        "temperature": 0.5  # Balanced
    }
}

# For creative content generation
config = {
    "llm": {
        "model": "grok-beta",
        "api_key": "your-key",
        "temperature": 0.9  # More creative
    }
}

Performance Optimization

from scrapegraphai.models import XAI

# Optimize for speed
llm = XAI(
    model="grok-beta",
    api_key="your-key",
    max_tokens=500,  # Limit response length
    timeout=30  # Fast timeout
)

# Optimize for quality
llm = XAI(
    model="grok-beta",
    api_key="your-key",
    temperature=0.1,  # Low variance
    max_tokens=3000  # Detailed responses
)

Advanced Features

Custom System Prompts

from scrapegraphai.models import XAI
from langchain_core.messages import SystemMessage, HumanMessage

llm = XAI(
    model="grok-beta",
    api_key="your-xai-api-key"
)

messages = [
    SystemMessage(
        content="You are a data extraction specialist. Always return valid JSON with proper field types."
    ),
    HumanMessage(
        content="Extract product information from this HTML: <html>...</html>"
    )
]

response = llm.invoke(messages)
print(response.content)

Conversation Memory

from scrapegraphai.models import XAI
from langchain_core.messages import SystemMessage, HumanMessage, AIMessage

llm = XAI(
    model="grok-beta",
    api_key="your-xai-api-key"
)

# Multi-turn conversation
conversation = [
    SystemMessage(content="You are helping with web scraping tasks."),
    HumanMessage(content="I need to scrape product prices from an e-commerce site."),
]

# First response
response1 = llm.invoke(conversation)
conversation.append(AIMessage(content=response1.content))

# Follow-up
conversation.append(
    HumanMessage(content="How do I handle pagination?")
)

response2 = llm.invoke(conversation)
print(response2.content)

Error Handling and Retries

from scrapegraphai.graphs import SmartScraperGraph
import time
from typing import Optional

def scrape_with_fallback(
    url: str,
    prompt: str,
    max_retries: int = 3
) -> Optional[dict]:
    """Scrape with exponential backoff and error handling."""
    
    for attempt in range(max_retries):
        try:
            graph_config = {
                "llm": {
                    "model": "grok-beta",
                    "api_key": "your-xai-api-key",
                    "timeout": 60
                }
            }
            
            scraper = SmartScraperGraph(
                prompt=prompt,
                source=url,
                config=graph_config
            )
            
            return scraper.run()
            
        except Exception as e:
            if attempt < max_retries - 1:
                wait_time = 2 ** attempt
                print(f"Attempt {attempt + 1} failed: {e}")
                print(f"Retrying in {wait_time}s...")
                time.sleep(wait_time)
            else:
                print(f"All attempts failed: {e}")
                return None

result = scrape_with_fallback(
    "https://example.com",
    "Extract main content"
)

Batch Processing

from scrapegraphai.models import XAI
from langchain_core.messages import HumanMessage
import concurrent.futures

def process_prompt(prompt: str) -> str:
    """Process a single prompt."""
    llm = XAI(
        model="grok-beta",
        api_key="your-xai-api-key"
    )
    response = llm.invoke([HumanMessage(content=prompt)])
    return response.content

prompts = [
    "Summarize this article: ...",
    "Extract email addresses from: ...",
    "List product features: ...",
    "Identify key dates: ..."
]

# Process in parallel
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(process_prompt, prompts))

for i, result in enumerate(results):
    print(f"\nResult {i+1}:")
    print(result)

Comparison with Other Models

XAI vs Other Providers

Feature           | XAI Grok        | OpenAI GPT-4    | DeepSeek
Real-time data    | Yes             | Limited         | No
Personality       | Unique, curious | Professional    | Technical
API compatibility | OpenAI-like     | Native          | OpenAI-like
Pricing           | Competitive     | Premium         | Budget
Best for          | Current events  | General purpose | Code/tech

When to Use XAI Grok

Use XAI Grok when:
  • You need real-time or current information
  • You want a conversational, curious AI personality
  • You are scraping news or trending content
  • You require contextual understanding of recent events
  • You want an OpenAI-compatible API with unique features
Consider alternatives when:
  • You need maximum accuracy for technical tasks
  • Budget is the primary concern
  • You require specialized domain knowledge
  • You need vision capabilities (in that case, use grok-vision-beta rather than grok-beta)

Environment Variables

For security best practices, use environment variables:
import os
from scrapegraphai.graphs import SmartScraperGraph

graph_config = {
    "llm": {
        "model": "grok-beta",
        "api_key": os.getenv("XAI_API_KEY"),
        "temperature": 0.5
    }
}

scraper = SmartScraperGraph(
    prompt="Extract content",
    source="https://example.com",
    config=graph_config
)
Set the environment variable:
export XAI_API_KEY="your-xai-api-key-here"
Or use a .env file:
# .env
XAI_API_KEY=your-xai-api-key-here
from dotenv import load_dotenv
import os

load_dotenv()

api_key = os.getenv("XAI_API_KEY")

Common Use Cases

News Aggregation

from scrapegraphai.graphs import SmartScraperGraph

graph_config = {
    "llm": {
        "model": "grok-beta",
        "api_key": "your-key",
        "temperature": 0.4
    }
}

scraper = SmartScraperGraph(
    prompt="Extract headlines with context about why they're significant",
    source="https://news.example.com",
    config=graph_config
)

result = scraper.run()

Social Media Analysis

from scrapegraphai.graphs import SmartScraperGraph

graph_config = {
    "llm": {
        "model": "grok-beta",
        "api_key": "your-key"
    }
}

scraper = SmartScraperGraph(
    prompt="Analyze trending topics and sentiment",
    source="https://social-platform.com/trending",
    config=graph_config
)

trends = scraper.run()

Research Assistant

from scrapegraphai.models import XAI
from langchain_core.messages import HumanMessage

llm = XAI(
    model="grok-beta",
    api_key="your-key",
    temperature=0.6
)

research_query = HumanMessage(
    content="""Analyze this research paper abstract and:
    1. Identify key findings
    2. List methodologies used
    3. Suggest related research areas
    
    Abstract: ...
    """
)

response = llm.invoke([research_query])
print(response.content)

Related Documentation

  • Models Overview: all available custom models
  • DeepSeek: alternative cost-effective LLM
  • SmartScraperGraph: main scraping graph using LLMs
  • Configuration: detailed configuration guide