metadata
title: Paper Espresso
emoji: ☕️
colorFrom: pink
colorTo: blue
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: Paper Espresso
Paper Espresso
An LLM-powered system that collects, summarizes, and analyzes AI research papers from HuggingFace Daily Papers.
Paper Link: Paper Espresso: From Paper Overload to Research Insight
Features
- Bilingual Summarization — Gemini-generated structured analysis (TL;DR, strengths, limitations, topics, keywords) in English and Chinese
- Multi-Granularity Trending — Daily, weekly, and monthly trend analysis with topic clustering
- Interactive UI — Streamlit web app with topic filtering, language toggle, and paper detail views
- HuggingFace Hub Storage — All data persisted to public datasets for reproducibility
Quick Start
Prerequisites
# Clone and install
git clone https://github.com/Elfsong/Daily_Paper_Reader.git
cd Daily_Paper_Reader
uv sync
Create a .env file in the project root:
GEMINI_API_KEY=your_gemini_api_key
HF_TOKEN=your_huggingface_token
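As a sanity check before running anything, the two keys above can be verified with a few lines of Python. This is a minimal sketch, not the project's own loader: `parse_env`, `missing_keys`, and `REQUIRED_KEYS` are hypothetical names, and the project may well rely on a dotenv library instead.

```python
# Keys this README asks for in .env.
REQUIRED_KEYS = ("GEMINI_API_KEY", "HF_TOKEN")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines (hypothetical helper, not the project's loader)."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(values: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

Reading the file with `pathlib.Path(".env").read_text()` and passing the result through `parse_env` and `missing_keys` should return an empty list when both keys are set.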
Web App
uv run streamlit run src/streamlit_app.py
CLI: Daily Paper Retriever
src/daily_retrieve.py is a standalone CLI tool for batch collecting and summarizing papers.
Basic Usage
# Collect yesterday's papers
uv run python src/daily_retrieve.py
# Collect a specific date
uv run python src/daily_retrieve.py --date 2026-03-25
# Collect a date range
uv run python src/daily_retrieve.py --date 2026-03-01 --end 2026-03-31
# Parallel collection (16 workers)
uv run python src/daily_retrieve.py --date 2026-01-01 --end 2026-03-31 --workers 16
# Skip pushing to HuggingFace
uv run python src/daily_retrieve.py --date 2026-03-25 --no-push
Options
| Flag | Description | Default |
|---|---|---|
| --date DATE | Start date (YYYY-MM-DD) | Yesterday |
| --end DATE | End date, inclusive (for range) | Same as --date |
| --workers N | Parallel workers for date range | 1 |
| --no-push | Skip pushing to HuggingFace | False |
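The flags and defaults in the table can be expressed with `argparse` roughly as follows. This is an illustrative sketch built only from the table: `build_parser` and `resolve_args` are hypothetical names, and the end-before-start check is an added assumption. The actual parsing in src/daily_retrieve.py may differ.

```python
import argparse
from datetime import date, timedelta
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Declare the four flags listed in the table (sketch, not the real parser)."""
    parser = argparse.ArgumentParser(description="Daily Paper Retriever (sketch)")
    parser.add_argument("--date", type=date.fromisoformat, default=None,
                        metavar="DATE", help="Start date (YYYY-MM-DD)")
    parser.add_argument("--end", type=date.fromisoformat, default=None,
                        metavar="DATE", help="End date, inclusive")
    parser.add_argument("--workers", type=int, default=1,
                        metavar="N", help="Parallel workers for a date range")
    parser.add_argument("--no-push", action="store_true",
                        help="Skip pushing to HuggingFace")
    return parser


def resolve_args(argv: list[str], today: Optional[date] = None):
    """Apply the table's defaults: --date is yesterday, --end equals --date."""
    args = build_parser().parse_args(argv)
    today = today or date.today()
    start = args.date or today - timedelta(days=1)
    end = args.end or start
    if end < start:
        # Assumption: an end date before the start date is treated as an error.
        raise ValueError("--end must not be earlier than --date")
    return start, end, args.workers, args.no_push
```

For example, `resolve_args(["--date", "2026-03-01", "--end", "2026-03-31"])` yields the March range with one worker and pushing enabled.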
Pipeline
For each date, the tool runs:
- Check HF — Skip if papers + trending already exist on HuggingFace
- Fetch — Get paper list from HuggingFace Daily Papers API
- Cache Merge — Load existing summaries from local JSON and HF dataset
- Summarize — Call Gemini for papers without summaries (with PDF grounding)
- Trending — Generate daily trend analysis via Gemini (if not on HF)
- Push — Upload papers and trending to HuggingFace Hub
Papers with transient errors (e.g., missing API key) are automatically retried on subsequent runs.
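The six steps above can be sketched as a single per-date function. Everything here is an assumption made for illustration: the `Steps` container and its callables stand in for the real Hub, Daily Papers API, and Gemini calls, the use of a thread pool for `--workers` is a guess, and the sketch always regenerates trending rather than checking for it separately on the Hub.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Steps:
    """Hypothetical stand-ins for the real Hub, API, and Gemini calls."""
    on_hub: Callable[[str], bool]               # papers + trending already on the Hub?
    fetch: Callable[[str], list[str]]           # date -> paper IDs
    cached: Callable[[str], dict[str, str]]     # date -> {paper ID: existing summary}
    summarize: Callable[[str], str]             # paper ID -> new summary
    trending: Callable[[dict[str, str]], str]   # summaries -> daily trend text
    push: Callable[[str, dict[str, str], str], None]


def run_date(day: str, steps: Steps, no_push: bool = False) -> Optional[int]:
    """Run one date; return the number of new summaries, or None if skipped."""
    if steps.on_hub(day):                       # 1. Check HF
        return None
    paper_ids = steps.fetch(day)                # 2. Fetch
    summaries = dict(steps.cached(day))         # 3. Cache merge
    missing = [pid for pid in paper_ids if pid not in summaries]
    for pid in missing:                         # 4. Summarize only uncached papers
        summaries[pid] = steps.summarize(pid)
    trend = steps.trending(summaries)           # 5. Trending
    if not no_push:                             # 6. Push
        steps.push(day, summaries, trend)
    return len(missing)


def run_range(days: list[str], steps: Steps, workers: int = 1,
              no_push: bool = False) -> dict[str, Optional[int]]:
    """Process several dates concurrently (threads are an assumption here)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda d: run_date(d, steps, no_push), days)
        return dict(zip(days, results))
```

Passing the steps in as callables keeps the control flow testable without network access; the cache merge in step 3 is what prevents repeated Gemini calls for papers that already have summaries.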
Progress Display
Multi-date runs show a live progress dashboard with per-date progress bars, elapsed time, and API cost tracking:
📰 Daily Paper Retriever [━━━━━━━━━━━━━────────────] 14/30 days ⏱ 03:21 💰 $0.1842 (42 calls, 98,201 tok)
──────────────────────────────────────────────────────────────────────────────────────────────
2026-03-01 [━━━━━━━━━━━━━━━━━━━━━━━━━] 100% (22/22) ✓ done
2026-03-02 [━━━━━━━━━━━━━━━━━━━━━━━━━] 100% (41/41) ✓ synced
2026-03-03 [━━━━━━━━━━━━━━━━━━━━━━━━━] 100% (43/43) ✓ all cached
2026-03-04 [━━━━━━━━━━━━━━───────────] 56% (12/21) Attention Is All You Need...
2026-03-05 [·························] waiting
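For readers curious how rows like these can be produced, here is a minimal sketch of a fixed-width bar renderer. The function names, the default width of 25, and the floor-based rounding are assumptions; the tool's own rendering and rounding are not documented here and may differ.

```python
def render_bar(done: int, total: int, width: int = 25) -> str:
    """Render a bar of `width` cells; dots mean the date has not started."""
    if total <= 0:
        return "[" + "·" * width + "]"
    # Floor division keeps the bar from showing full before the work is done.
    filled = min(width, width * done // total)
    return "[" + "━" * filled + "─" * (width - filled) + "]"


def render_row(day: str, done: int, total: int, status: str = "") -> str:
    """Render one per-date row: date, bar, percentage, counts, status text."""
    if total <= 0:
        return f"{day} {render_bar(0, 0)} waiting"
    percent = 100 * done // total
    return f"{day} {render_bar(done, total)} {percent}% ({done}/{total}) {status}".rstrip()


if __name__ == "__main__":
    print(render_row("2026-03-01", 22, 22, "✓ done"))
    print(render_row("2026-03-04", 12, 21, "Attention Is All You Need..."))
    print(render_row("2026-03-05", 0, 0))
```

A live dashboard would redraw these rows in place as workers report progress; the sketch covers only the string formatting.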
Data
- Paper summaries: Elfsong/hf_paper_summary
- Trending analyses: Elfsong/hf_paper_trending