🕷️ CyberScraper 2077 API

Advanced Web Scraping API with AI-Powered Content Extraction

Version 1.0.0

🚀 Quick Start

  1. Make a simple scrape request to /api/scrape
  2. For multiple requests, create a session first using /api/session
  3. Use the session ID for subsequent requests to /api/session/{session_id}/scrape
  4. Always close sessions when done using DELETE /api/session/{session_id}
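The four steps above can be sketched in Python with the `requests` library. Note that the session-create response body is not documented on this page, so the `session_id` field name used below is an assumption; adjust it to match your deployment.

```python
import requests

BASE_URL = "https://grazieprego-scrapling.hf.space"

def scrape_batch(urls, query, http=requests):
    """Create a session, scrape each URL with it, and always close it."""
    # 1. Create a persistent session (Quick Start step 2)
    resp = http.post(f"{BASE_URL}/api/session", json={"model_name": "alias-fast"})
    session_id = resp.json()["session_id"]  # ASSUMED field name, not documented above
    results = []
    try:
        # 2. Reuse the session for each URL (step 3)
        for url in urls:
            r = http.post(
                f"{BASE_URL}/api/session/{session_id}/scrape",
                json={"url": url, "query": query},
            )
            results.append(r.json())
    finally:
        # 3. Always close the session, even if a scrape fails (step 4)
        http.delete(f"{BASE_URL}/api/session/{session_id}")
    return results
```

Passing `http=requests` keeps the HTTP layer injectable, which makes the function easy to exercise without a live server.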

📡 API Endpoints

GET /health

Check if the API is running

Example:

curl https://grazieprego-scrapling.hf.space/health

Response:

{
  "status": "ok",
  "message": "CyberScraper 2077 API is running"
}

POST /api/scrape

Stateless scrape endpoint - a new extractor is created for each call

Request Body:

  • url (string) - The URL to scrape
  • query (string) - The extraction query/instruction
  • model_name (string, optional) - AI model to use (default: 'alias-fast')

Example (cURL):

curl -X POST https://grazieprego-scrapling.hf.space/api/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "query": "Extract all product prices"
  }'

Example (Python):

import requests

response = requests.post(
    'https://grazieprego-scrapling.hf.space/api/scrape',
    json={
        'url': 'https://example.com',
        'query': 'Extract prices'
    }
)
print(response.json())

POST /api/session

Create a persistent scraping session for multiple requests

Request Body:

  • model_name (string, optional) - AI model to use (default: 'alias-fast')

Example:

curl -X POST https://grazieprego-scrapling.hf.space/api/session \
  -H "Content-Type: application/json" \
  -d '{"model_name": "alias-fast"}'
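Example (Python) - a minimal sketch; the response schema isn't shown above, so the `session_id` field name is an assumption:

```python
import requests

def create_session(model_name="alias-fast", post=requests.post):
    """Create a persistent session and return its ID for later requests."""
    resp = post(
        "https://grazieprego-scrapling.hf.space/api/session",
        json={"model_name": model_name},
    )
    # "session_id" is an ASSUMED field name; check your deployment's response
    return resp.json().get("session_id")
```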

POST /api/session/{session_id}/scrape

Scrape using an existing session context (more efficient for multiple requests)

Path Parameters:

  • session_id (string) - UUID of the session

Request Body:

  • url (string) - The URL to scrape
  • query (string) - The extraction query
  • model_name (string, optional)

Example:

curl -X POST https://grazieprego-scrapling.hf.space/api/session/uuid-here/scrape \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/page1",
    "query": "Extract titles"
  }'

DELETE /api/session/{session_id}

Close a session and release resources

Path Parameters:

  • session_id (string) - UUID of the session to close

Example:

curl -X DELETE https://grazieprego-scrapling.hf.space/api/session/uuid-here

💡 Best Practices

  • Use stateless /api/scrape for one-off requests
  • Use sessions for batch processing multiple URLs
  • Always close sessions when finished to free resources
  • Handle errors gracefully (500 errors may occur on complex sites)
  • Set appropriate timeouts for slow-loading pages
  • Implement retry logic for production use
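The last three practices combine naturally into a small helper. This is a sketch, not part of the API itself; the timeout, retry count, and backoff values are illustrative defaults.

```python
import time
import requests

def scrape_with_retry(url, query, retries=3, timeout=60, backoff=1.0,
                      post=requests.post):
    """POST to /api/scrape with a timeout, retrying 5xx and network errors."""
    for attempt in range(retries):
        try:
            resp = post(
                "https://grazieprego-scrapling.hf.space/api/scrape",
                json={"url": url, "query": query},
                timeout=timeout,  # generous cap for slow-loading pages
            )
            if resp.status_code < 500:
                return resp.json()  # success, or a client error worth surfacing
        except requests.RequestException:
            pass  # timeout or connection error: fall through and retry
        if attempt < retries - 1:
            time.sleep(backoff * 2 ** attempt)  # exponential backoff: 1s, 2s, 4s...
    raise RuntimeError(f"Scrape of {url} failed after {retries} attempts")
```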

⚠️ Error Handling

  • 404 - Session not found (for session endpoints)
  • 500 - Internal server error - check the detail message

Common Issues:

  • URL unreachable or timeout
  • JavaScript-heavy sites may require different approaches
  • Bot protection may block requests
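Putting the codes above together, a client might branch like this. A sketch only: the shape of the error body is an assumption based on the "detail message" mentioned above.

```python
import requests

BASE_URL = "https://grazieprego-scrapling.hf.space"

def safe_session_scrape(session_id, url, query, post=requests.post):
    """Scrape via a session and map the documented error codes to results."""
    resp = post(
        f"{BASE_URL}/api/session/{session_id}/scrape",
        json={"url": url, "query": query},
    )
    if resp.status_code == 404:
        # Session not found: it expired or was closed; create a new one
        return {"error": "session_not_found"}
    if resp.status_code == 500:
        # Internal error: surface the detail message (bot protection,
        # unreachable URL, JS-heavy page, ...); "detail" field is ASSUMED
        return {"error": resp.json().get("detail", "internal server error")}
    return resp.json()
```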