| <!DOCTYPE html> |
| <html lang="en"> |
| <head> |
| <meta charset="UTF-8"> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> |
| <title>CyberScraper 2077 API Documentation</title> |
| <style> |
| * { |
| margin: 0; |
| padding: 0; |
| box-sizing: border-box; |
| } |
| body { |
| font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif; |
| line-height: 1.6; |
| color: #333; |
| background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); |
| min-height: 100vh; |
| padding: 20px; |
| } |
| .container { |
| max-width: 1200px; |
| margin: 0 auto; |
| background: white; |
| border-radius: 12px; |
| box-shadow: 0 10px 40px rgba(0,0,0,0.2); |
| overflow: hidden; |
| } |
| header { |
| background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); |
| color: white; |
| padding: 40px; |
| text-align: center; |
| } |
| header h1 { |
| font-size: 2.5em; |
| margin-bottom: 10px; |
| } |
| header p { |
| font-size: 1.2em; |
| opacity: 0.9; |
| } |
| .version { |
| display: inline-block; |
| background: rgba(255,255,255,0.2); |
| padding: 5px 15px; |
| border-radius: 20px; |
| margin-top: 15px; |
| font-size: 0.9em; |
| } |
| .content { |
| padding: 40px; |
| } |
| .section { |
| margin-bottom: 40px; |
| } |
| .section h2 { |
| color: #667eea; |
| border-bottom: 3px solid #667eea; |
| padding-bottom: 10px; |
| margin-bottom: 20px; |
| font-size: 1.8em; |
| } |
| .endpoint { |
| background: #f8f9fa; |
| border-left: 4px solid #667eea; |
| padding: 20px; |
| margin-bottom: 20px; |
| border-radius: 0 8px 8px 0; |
| } |
| .endpoint-header { |
| display: flex; |
| align-items: center; |
| margin-bottom: 15px; |
| flex-wrap: wrap; |
| gap: 10px; |
| } |
| .method { |
| display: inline-block; |
| padding: 5px 12px; |
| border-radius: 5px; |
| font-weight: bold; |
| font-size: 0.9em; |
| min-width: 80px; |
| text-align: center; |
| } |
| .method.get { background: #61affe; color: white; } |
| .method.post { background: #49cc90; color: white; } |
| .method.delete { background: #f93e3e; color: white; } |
| .path { |
| font-family: 'Courier New', monospace; |
| font-size: 1.1em; |
| color: #333; |
| font-weight: 600; |
| } |
| .description { |
| color: #666; |
| margin-bottom: 15px; |
| } |
| .parameters, .response { |
| background: white; |
| padding: 15px; |
| border-radius: 5px; |
| margin: 10px 0; |
| } |
| .parameters h4, .response h4 { |
| color: #667eea; |
| margin-bottom: 10px; |
| } |
| .code-block { |
| background: #2d2d2d; |
| color: #f8f8f2; |
| padding: 15px; |
| border-radius: 5px; |
| overflow-x: auto; |
| font-family: 'Courier New', monospace; |
| font-size: 0.9em; |
| margin: 10px 0; |
| } |
| .code-block pre { |
| margin: 0; |
| } |
| .quick-start { |
| background: #e8f4f8; |
| padding: 20px; |
| border-radius: 8px; |
| margin-bottom: 20px; |
| } |
| .quick-start ol { |
| margin-left: 20px; |
| } |
| .quick-start li { |
| margin: 8px 0; |
| } |
| .example-box { |
| background: #fff3cd; |
| border: 1px solid #ffc107; |
| padding: 15px; |
| border-radius: 5px; |
| margin: 10px 0; |
| } |
| .example-box h4 { |
| color: #856404; |
| margin-bottom: 10px; |
| } |
| footer { |
| background: #f8f9fa; |
| padding: 20px; |
| text-align: center; |
| color: #666; |
| border-top: 1px solid #ddd; |
| } |
| @media (max-width: 768px) { |
| header h1 { |
| font-size: 1.8em; |
| } |
| .content { |
| padding: 20px; |
| } |
| .endpoint-header { |
| flex-direction: column; |
| align-items: flex-start; |
| } |
| } |
| </style> |
| </head> |
| <body> |
| <div class="container"> |
| <header> |
| <h1>🕷️ CyberScraper 2077 API</h1> |
| <p>Advanced Web Scraping API with AI-Powered Content Extraction</p> |
| <span class="version">Version 1.0.0</span> |
| </header> |
| |
| <div class="content"> |
| <div class="section"> |
| <h2>🚀 Quick Start</h2> |
| <div class="quick-start"> |
| <ol> |
| <li>For a one-off request, send a <code>POST</code> to <code>/api/scrape</code></li> |
| <li>For multiple requests, create a session first with <code>POST /api/session</code></li> |
| <li>Send subsequent requests to <code>POST /api/session/{session_id}/scrape</code>, using the session ID returned in step 2</li> |
| <li>Always close the session when done with <code>DELETE /api/session/{session_id}</code></li> |
| </ol> |
| </div> |
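| <div class="example-box"> |
| <h4>Example (Python, session workflow sketch):</h4> |
| <p>Steps 2 to 4 above can be sketched as a context manager that closes the session even if a scrape fails. This is a minimal sketch, not an official client: the helper names <code>call_api</code> and <code>scraping_session</code> are hypothetical, and it assumes that responses are JSON and that the session-creation response carries a <code>session_id</code> field, which this page does not document.</p> |

```python
import json
import urllib.request
from contextlib import contextmanager

BASE_URL = "https://grazieprego-scrapling.hf.space"


def call_api(method, path, payload=None, timeout=60):
    """Send one JSON request and return the decoded body (None if empty)."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read()
    # ASSUMPTION: non-empty response bodies are JSON.
    return json.loads(body) if body else None


@contextmanager
def scraping_session(call=call_api, model_name="alias-fast"):
    """Create a session, yield a scrape(url, query) function, then close it."""
    created = call("POST", "/api/session", {"model_name": model_name})
    # ASSUMPTION: the response field is named "session_id"; adjust if it differs.
    session_id = created["session_id"]

    def scrape(url, query):
        return call(
            "POST",
            f"/api/session/{session_id}/scrape",
            {"url": url, "query": query},
        )

    try:
        yield scrape
    finally:
        # Runs on success and on error, so the session is always released.
        call("DELETE", f"/api/session/{session_id}")
```

| <p>A caller would write <code>with scraping_session() as scrape:</code> and then call <code>scrape(url, query)</code> for each page. The <code>call</code> argument exists so the transport can be swapped out, for example with a fake in tests.</p> |
| </div> |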
| </div> |
|
|
| <div class="section"> |
| <h2>📡 API Endpoints</h2> |
| |
| <div class="endpoint"> |
| <div class="endpoint-header"> |
| <span class="method get">GET</span> |
| <span class="path">/health</span> |
| </div> |
| <p class="description">Check if the API is running</p> |
| <div class="example-box"> |
| <h4>Example:</h4> |
| <div class="code-block"><pre>curl https://grazieprego-scrapling.hf.space/health</pre></div> |
| </div> |
| <div class="response"> |
| <h4>Response:</h4> |
| <div class="code-block"><pre>{ |
| "status": "ok", |
| "message": "CyberScraper 2077 API is running" |
| }</pre></div> |
| </div> |
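| <div class="example-box"> |
| <h4>Example (Python sketch):</h4> |
| <p>The same check can be scripted, for instance before starting a batch job. This is a minimal sketch using only the standard library; the function names <code>is_healthy</code> and <code>check_health</code> are hypothetical, and treating any network or parsing failure as "not running" is a design assumption, not documented API behavior.</p> |

```python
import json
import urllib.request

HEALTH_URL = "https://grazieprego-scrapling.hf.space/health"


def is_healthy(payload):
    """Return True when the payload matches the documented healthy response."""
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(url=HEALTH_URL, timeout=10):
    """Fetch /health and report whether the API says it is running."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_healthy(json.loads(response.read()))
    except (OSError, ValueError):
        # OSError covers unreachable hosts, HTTP errors, and timeouts;
        # ValueError covers a body that is not valid JSON.
        return False
```

| </div> |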
| </div> |
|
|
| <div class="endpoint"> |
| <div class="endpoint-header"> |
| <span class="method post">POST</span> |
| <span class="path">/api/scrape</span> |
| </div> |
| <p class="description">Stateless scrape request - creates a new extractor for each request</p> |
| <div class="parameters"> |
| <h4>Request Body:</h4> |
| <ul> |
| <li><strong>url</strong> (string) - The URL to scrape</li> |
| <li><strong>query</strong> (string) - The extraction query/instruction</li> |
| <li><strong>model_name</strong> (string, optional) - AI model to use (default: 'alias-fast')</li> |
| </ul> |
| </div> |
| <div class="example-box"> |
| <h4>Example (cURL):</h4> |
| <div class="code-block"><pre>curl -X POST https://grazieprego-scrapling.hf.space/api/scrape \ |
| -H "Content-Type: application/json" \ |
| -d '{ |
| "url": "https://example.com", |
| "query": "Extract all product prices" |
| }'</pre></div> |
| <h4>Example (Python):</h4> |
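| <div class="code-block"><pre>import requests |
|
|
| # Send a stateless scrape request; 60 seconds is only an example timeout |
| response = requests.post( |
| 'https://grazieprego-scrapling.hf.space/api/scrape', |
| json={ |
| 'url': 'https://example.com', |
| 'query': 'Extract prices' |
| }, |
| timeout=60 |
| ) |
| response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body |
| print(response.json())</pre></div> |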
| </div> |
| </div> |
|
|
| <div class="endpoint"> |
| <div class="endpoint-header"> |
| <span class="method post">POST</span> |
| <span class="path">/api/session</span> |
| </div> |
| <p class="description">Create a persistent scraping session for multiple requests</p> |
| <div class="parameters"> |
| <h4>Request Body:</h4> |
| <ul> |
| <li><strong>model_name</strong> (string, optional) - AI model to use (default: 'alias-fast')</li> |
| </ul> |
| </div> |
| <div class="example-box"> |
| <h4>Example:</h4> |
| <div class="code-block"><pre>curl -X POST https://grazieprego-scrapling.hf.space/api/session \ |
| -H "Content-Type: application/json" \ |
| -d '{"model_name": "alias-fast"}'</pre></div> |
| </div> |
| </div> |
|
|
| <div class="endpoint"> |
| <div class="endpoint-header"> |
| <span class="method post">POST</span> |
| <span class="path">/api/session/{session_id}/scrape</span> |
| </div> |
| <p class="description">Scrape using an existing session context (more efficient for multiple requests)</p> |
| <div class="parameters"> |
| <h4>Path Parameters:</h4> |
| <ul> |
| <li><strong>session_id</strong> (string) - UUID of the session</li> |
| </ul> |
| <h4>Request Body:</h4> |
| <ul> |
| <li><strong>url</strong> (string) - The URL to scrape</li> |
| <li><strong>query</strong> (string) - The extraction query</li> |
| <li><strong>model_name</strong> (string, optional) - AI model to use</li> |
| </ul> |
| </div> |
| <div class="example-box"> |
| <h4>Example:</h4> |
| <div class="code-block"><pre>curl -X POST https://grazieprego-scrapling.hf.space/api/session/uuid-here/scrape \ |
| -H "Content-Type: application/json" \ |
| -d '{ |
| "url": "https://example.com/page1", |
| "query": "Extract titles" |
| }'</pre></div> |
| </div> |
| </div> |
|
|
| <div class="endpoint"> |
| <div class="endpoint-header"> |
| <span class="method delete">DELETE</span> |
| <span class="path">/api/session/{session_id}</span> |
| </div> |
| <p class="description">Close a session and release resources</p> |
| <div class="parameters"> |
| <h4>Path Parameters:</h4> |
| <ul> |
| <li><strong>session_id</strong> (string) - UUID of the session to close</li> |
| </ul> |
| </div> |
| <div class="example-box"> |
| <h4>Example:</h4> |
| <div class="code-block"><pre>curl -X DELETE https://grazieprego-scrapling.hf.space/api/session/uuid-here</pre></div> |
| </div> |
| </div> |
| </div> |
|
|
| <div class="section"> |
| <h2>💡 Best Practices</h2> |
| <ul> |
| <li>Use stateless <code>/api/scrape</code> for one-off requests</li> |
| <li>Use sessions for batch processing multiple URLs</li> |
| <li>Always close sessions when finished to free resources</li> |
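| <li>Handle errors gracefully: complex sites can trigger 500 responses, so check the status code before using the result</li> |
| <li>Set a client-side timeout on every request so slow-loading pages cannot stall your code</li> |
| <li>Retry failed requests in production, waiting between attempts</li> |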
| </ul> |
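| <div class="example-box"> |
| <h4>Example (Python, retry sketch):</h4> |
| <p>The retry advice above can be sketched as a small helper with exponential backoff. The name <code>with_retries</code> and its parameters are hypothetical, and retrying on every exception is a simplification: a production client would likely retry only on failures it considers transient.</p> |

```python
import time


def with_retries(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between tries.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
```

| <p>Pass any zero-argument callable that raises on failure, for example a function that posts to <code>/api/scrape</code> with a timeout and raises on a non-2xx status.</p> |
| </div> |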
| </div> |
|
|
| <div class="section"> |
| <h2>⚠️ Error Handling</h2> |
| <div class="parameters"> |
| <ul> |
| <li><strong>404</strong> - Session not found (for session endpoints)</li> |
| <li><strong>500</strong> - Internal server error - check the detail message</li> |
| </ul> |
| <p><strong>Common Issues:</strong></p> |
| <ul> |
| <li>URL unreachable or timeout</li> |
| <li>JavaScript-heavy sites may require different approaches</li> |
| <li>Bot protection may block requests</li> |
| </ul> |
| </div> |
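| <div class="example-box"> |
| <h4>Example (Python, error handling sketch):</h4> |
| <p>The status codes above could be mapped to readable messages along these lines. This is a sketch under assumptions: the function name <code>explain_error</code> is hypothetical, and it assumes error bodies are JSON with a <code>detail</code> field, falling back to the raw body text when that is not the case.</p> |

```python
import json


def explain_error(status_code, body_text):
    """Turn an error response into a short, human-readable message."""
    try:
        parsed = json.loads(body_text)
    except ValueError:
        parsed = None
    # ASSUMPTION: errors arrive as JSON with a "detail" field.
    detail = parsed.get("detail") if isinstance(parsed, dict) else None
    if detail is None:
        detail = body_text.strip() or "no detail provided"
    if status_code == 404:
        return f"Session not found: {detail}"
    if status_code == 500:
        return f"Server error: {detail}"
    return f"Unexpected status {status_code}: {detail}"
```

| </div> |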
| </div> |
| </div> |
|
|
| <footer> |
| <p>CyberScraper 2077 API - Powered by Scrapling &amp; AI</p> |
| <p>Base URL: <a href="https://grazieprego-scrapling.hf.space">https://grazieprego-scrapling.hf.space</a></p> |
| </footer> |
| </div> |
| </body> |
| </html> |
|
|