<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>CyberScraper 2077 API Documentation</title>
<style>
* {
margin: 0;
padding: 0;
box-sizing: border-box;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, sans-serif;
line-height: 1.6;
color: #333;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
min-height: 100vh;
padding: 20px;
}
.container {
max-width: 1200px;
margin: 0 auto;
background: white;
border-radius: 12px;
box-shadow: 0 10px 40px rgba(0,0,0,0.2);
overflow: hidden;
}
header {
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 40px;
text-align: center;
}
header h1 {
font-size: 2.5em;
margin-bottom: 10px;
}
header p {
font-size: 1.2em;
opacity: 0.9;
}
.version {
display: inline-block;
background: rgba(255,255,255,0.2);
padding: 5px 15px;
border-radius: 20px;
margin-top: 15px;
font-size: 0.9em;
}
.content {
padding: 40px;
}
.section {
margin-bottom: 40px;
}
.section h2 {
color: #667eea;
border-bottom: 3px solid #667eea;
padding-bottom: 10px;
margin-bottom: 20px;
font-size: 1.8em;
}
.endpoint {
background: #f8f9fa;
border-left: 4px solid #667eea;
padding: 20px;
margin-bottom: 20px;
border-radius: 0 8px 8px 0;
}
.endpoint-header {
display: flex;
align-items: center;
margin-bottom: 15px;
flex-wrap: wrap;
gap: 10px;
}
.method {
display: inline-block;
padding: 5px 12px;
border-radius: 5px;
font-weight: bold;
font-size: 0.9em;
min-width: 80px;
text-align: center;
}
.method.get { background: #61affe; color: white; }
.method.post { background: #49cc90; color: white; }
.method.delete { background: #f93e3e; color: white; }
.path {
font-family: 'Courier New', monospace;
font-size: 1.1em;
color: #333;
font-weight: 600;
}
.description {
color: #666;
margin-bottom: 15px;
}
.parameters, .response {
background: white;
padding: 15px;
border-radius: 5px;
margin: 10px 0;
}
.parameters h4, .response h4 {
color: #667eea;
margin-bottom: 10px;
}
.code-block {
background: #2d2d2d;
color: #f8f8f2;
padding: 15px;
border-radius: 5px;
overflow-x: auto;
font-family: 'Courier New', monospace;
font-size: 0.9em;
margin: 10px 0;
}
.code-block pre {
margin: 0;
}
.quick-start {
background: #e8f4f8;
padding: 20px;
border-radius: 8px;
margin-bottom: 20px;
}
.quick-start ol {
margin-left: 20px;
}
.quick-start li {
margin: 8px 0;
}
.example-box {
background: #fff3cd;
border: 1px solid #ffc107;
padding: 15px;
border-radius: 5px;
margin: 10px 0;
}
.example-box h4 {
color: #856404;
margin-bottom: 10px;
}
footer {
background: #f8f9fa;
padding: 20px;
text-align: center;
color: #666;
border-top: 1px solid #ddd;
}
@media (max-width: 768px) {
header h1 {
font-size: 1.8em;
}
.content {
padding: 20px;
}
.endpoint-header {
flex-direction: column;
align-items: flex-start;
}
}
</style>
</head>
<body>
<div class="container">
<header>
<h1>🕷️ CyberScraper 2077 API</h1>
<p>Advanced Web Scraping API with AI-Powered Content Extraction</p>
<span class="version">Version 1.0.0</span>
</header>
<div class="content">
<div class="section">
<h2>🚀 Quick Start</h2>
<div class="quick-start">
<ol>
<li>For a one-off job, send a single scrape request with <code>POST /api/scrape</code></li>
<li>For multiple requests, create a session first with <code>POST /api/session</code></li>
<li>Send subsequent requests with the session ID to <code>POST /api/session/{session_id}/scrape</code></li>
<li>Always close sessions when done using <code>DELETE /api/session/{session_id}</code></li>
</ol>
</div>
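<div class="example-box">
<h4>Example (Python sketch of the session workflow):</h4>
<p class="description">The steps above can be sketched with the standard library alone. This is a minimal illustration, not an official client: <code>send_json</code> and <code>scraping_session</code> are hypothetical helper names, and the <code>session_id</code> field read from the create-session reply is an assumption, since the response format of <code>/api/session</code> is not documented on this page.</p>
<div class="code-block"><pre>
```python
import json
import urllib.request
from contextlib import contextmanager

BASE_URL = "https://grazieprego-scrapling.hf.space"


def send_json(method, path, payload=None, timeout=60):
    """Send one JSON request to the API and return the decoded JSON reply."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body) if body else None


@contextmanager
def scraping_session(send=send_json, model_name="alias-fast", id_field="session_id"):
    """Create a session, yield its ID, and always close it afterwards."""
    created = send("POST", "/api/session", {"model_name": model_name})
    session_id = created[id_field]  # ASSUMPTION: the field name is not documented
    try:
        yield session_id
    finally:
        # Step 4 of the quick start: release the session even if a scrape failed.
        send("DELETE", "/api/session/" + session_id)
```
</pre></div>
<p class="description">Wrapping the session in a context manager means the <code>DELETE</code> call still runs if a scrape raises. Inside the <code>with scraping_session() as session_id:</code> block, a scrape would be sent as <code>send_json("POST", "/api/session/" + session_id + "/scrape", {"url": "https://example.com", "query": "Extract titles"})</code>.</p>
</div>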
</div>
<div class="section">
<h2>📡 API Endpoints</h2>
<div class="endpoint">
<div class="endpoint-header">
<span class="method get">GET</span>
<span class="path">/health</span>
</div>
<p class="description">Check if the API is running</p>
<div class="example-box">
<h4>Example:</h4>
<div class="code-block"><pre>curl https://grazieprego-scrapling.hf.space/health</pre></div>
</div>
<div class="response">
<h4>Response:</h4>
<div class="code-block"><pre>{
"status": "ok",
"message": "CyberScraper 2077 API is running"
}</pre></div>
</div>
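<div class="example-box">
<h4>Example (Python sketch of a readiness check):</h4>
<p class="description">A client might poll this endpoint before submitting scrape jobs. The sketch below assumes the reply has the shape shown in the response above; <code>is_healthy</code> and <code>check_health</code> are hypothetical helper names, and the 10-second timeout is an arbitrary choice.</p>
<div class="code-block"><pre>
```python
import json
import urllib.request

HEALTH_URL = "https://grazieprego-scrapling.hf.space/health"


def is_healthy(payload):
    """Return True when a decoded /health reply reports status "ok"."""
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(url=HEALTH_URL, timeout=10):
    """Fetch /health and report readiness; network or parse errors count as not ready."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_healthy(json.loads(response.read().decode("utf-8")))
    except (OSError, ValueError):
        # OSError covers connection failures, timeouts, and HTTP error statuses;
        # ValueError covers a body that is not valid JSON.
        return False
```
</pre></div>
</div>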
</div>
<div class="endpoint">
<div class="endpoint-header">
<span class="method post">POST</span>
<span class="path">/api/scrape</span>
</div>
<p class="description">Stateless scrape request. A new extractor is created for each request, so no state is kept between calls.</p>
<div class="parameters">
<h4>Request Body:</h4>
<ul>
<li><strong>url</strong> (string) - The URL to scrape</li>
<li><strong>query</strong> (string) - The extraction query/instruction</li>
<li><strong>model_name</strong> (string, optional) - AI model to use (default: 'alias-fast')</li>
</ul>
</div>
<div class="example-box">
<h4>Example (cURL):</h4>
<div class="code-block"><pre>curl -X POST https://grazieprego-scrapling.hf.space/api/scrape \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"query": "Extract all product prices"
}'</pre></div>
<h4>Example (Python):</h4>
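<div class="code-block"><pre>import requests

# Send a stateless scrape request; model_name is omitted, so the default is used.
response = requests.post(
    'https://grazieprego-scrapling.hf.space/api/scrape',
    json={
        'url': 'https://example.com',
        'query': 'Extract prices'
    },
    timeout=60,  # seconds; raise this for slow-loading pages
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
print(response.json())</pre></div>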
</div>
</div>
<div class="endpoint">
<div class="endpoint-header">
<span class="method post">POST</span>
<span class="path">/api/session</span>
</div>
<p class="description">Create a persistent scraping session for multiple requests</p>
<div class="parameters">
<h4>Request Body:</h4>
<ul>
<li><strong>model_name</strong> (string, optional) - AI model to use (default: 'alias-fast')</li>
</ul>
</div>
<div class="example-box">
<h4>Example:</h4>
<div class="code-block"><pre>curl -X POST https://grazieprego-scrapling.hf.space/api/session \
-H "Content-Type: application/json" \
-d '{"model_name": "alias-fast"}'</pre></div>
</div>
</div>
<div class="endpoint">
<div class="endpoint-header">
<span class="method post">POST</span>
<span class="path">/api/session/{session_id}/scrape</span>
</div>
<p class="description">Scrape using an existing session context (more efficient for multiple requests)</p>
<div class="parameters">
<h4>Path Parameters:</h4>
<ul>
<li><strong>session_id</strong> (string) - UUID of the session</li>
</ul>
<h4>Request Body:</h4>
<ul>
<li><strong>url</strong> (string) - The URL to scrape</li>
<li><strong>query</strong> (string) - The extraction query</li>
<li><strong>model_name</strong> (string, optional) - AI model to use</li>
</ul>
</div>
<div class="example-box">
<h4>Example:</h4>
<div class="code-block"><pre>curl -X POST https://grazieprego-scrapling.hf.space/api/session/uuid-here/scrape \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com/page1",
"query": "Extract titles"
}'</pre></div>
</div>
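<div class="example-box">
<h4>Example (Python sketch of batch scraping in one session):</h4>
<p class="description">Reusing one session for several URLs could look like the sketch below. <code>scrape_many</code> is a hypothetical helper, and <code>send</code> stands for any callable of the form <code>send(method, path, payload)</code> that performs the HTTP request and returns the decoded reply; neither is part of the API itself. Failures are collected per URL so that one bad page does not abort the batch.</p>
<div class="code-block"><pre>
```python
def scrape_many(send, session_id, urls, query):
    """Scrape each URL through one session; return (results, errors) keyed by URL."""
    results, errors = {}, {}
    path = "/api/session/" + session_id + "/scrape"
    for url in urls:
        try:
            results[url] = send("POST", path, {"url": url, "query": query})
        except OSError as exc:
            # urllib's network and HTTP errors derive from OSError; other
            # HTTP libraries may need a different exception type here.
            errors[url] = str(exc)
    return results, errors
```
</pre></div>
</div>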
</div>
<div class="endpoint">
<div class="endpoint-header">
<span class="method delete">DELETE</span>
<span class="path">/api/session/{session_id}</span>
</div>
<p class="description">Close a session and release resources</p>
<div class="parameters">
<h4>Path Parameters:</h4>
<ul>
<li><strong>session_id</strong> (string) - UUID of the session to close</li>
</ul>
</div>
<div class="example-box">
<h4>Example:</h4>
<div class="code-block"><pre>curl -X DELETE https://grazieprego-scrapling.hf.space/api/session/uuid-here</pre></div>
</div>
</div>
</div>
<div class="section">
<h2>💡 Best Practices</h2>
<ul>
<li>Use stateless <code>/api/scrape</code> for one-off requests</li>
<li>Use sessions for batch processing multiple URLs</li>
<li>Always close sessions when finished to free resources</li>
<li>Handle errors gracefully (500 errors may occur on complex sites)</li>
<li>Set appropriate timeouts for slow-loading pages</li>
<li>Implement retry logic for production use</li>
</ul>
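<div class="example-box">
<h4>Example (Python sketch of retry logic):</h4>
<p class="description">The retry advice above can be illustrated with a small exponential-backoff wrapper. This is a sketch under assumptions rather than a recommended policy: <code>call_with_retries</code> is a hypothetical helper, the three attempts and one-second base delay are arbitrary, and it assumes that repeating a scrape request is harmless.</p>
<div class="code-block"><pre>
```python
import time


def call_with_retries(action, attempts=3, base_delay=1.0,
                      retry_on=(OSError,), sleep=time.sleep):
    """Call action() up to `attempts` times, doubling the wait after each failure."""
    attempts = max(1, attempts)
    for attempt in range(attempts):
        try:
            return action()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... by default
```
</pre></div>
<p class="description">Here <code>action</code> is any zero-argument callable that performs one request with its own timeout set. The default <code>retry_on=(OSError,)</code> matches the errors raised by <code>urllib</code>; with another HTTP library, pass that library's exception types instead.</p>
</div>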
</div>
<div class="section">
<h2>⚠️ Error Handling</h2>
<div class="parameters">
<ul>
<li><strong>404</strong> - Session not found (for session endpoints)</li>
<li><strong>500</strong> - Internal server error - check the detail message</li>
</ul>
<p><strong>Common Issues:</strong></p>
<ul>
<li>URL unreachable or timeout</li>
<li>JavaScript-heavy sites may require different approaches</li>
<li>Bot protection may block requests</li>
</ul>
</div>
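<div class="example-box">
<h4>Example (Python sketch of error reporting):</h4>
<p class="description">One way to turn the status codes above into readable messages is sketched below. It assumes that error responses are JSON objects carrying a <code>detail</code> field, which the 500 entry suggests but this page does not specify; <code>describe_error</code> is a hypothetical helper. If the body is not such an object, the raw text is shown instead.</p>
<div class="code-block"><pre>
```python
import json


def describe_error(status_code, body_text):
    """Turn an HTTP status and raw response body into a short, readable message."""
    try:
        # ASSUMPTION: error bodies look like {"detail": "..."}.
        detail = json.loads(body_text).get("detail", body_text)
    except (ValueError, AttributeError, TypeError):
        detail = body_text  # body was not a JSON object; show it as-is
    if status_code == 404:
        return "Session not found: " + str(detail)
    if status_code == 500:
        return "Server error: " + str(detail)
    return "HTTP " + str(status_code) + ": " + str(detail)
```
</pre></div>
</div>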
</div>
</div>
<footer>
<p>CyberScraper 2077 API - Powered by Scrapling &amp; AI</p>
<p>Base URL: <a href="https://grazieprego-scrapling.hf.space">https://grazieprego-scrapling.hf.space</a></p>
</footer>
</div>
</body>
</html>