ymlin105 Cursor committed on
Commit 78cfff7 · 1 Parent(s): b4bfa19

chore: reorganize documentation structure and clean repository root

Consolidate docs into architecture/development/performance/presentation sections, refresh navigation links, and move root-level logs and legacy data snapshots into dedicated archive locations to keep the repo entrypoints clean.

Co-authored-by: Cursor <cursoragent@cursor.com>

Files changed (25)
  1. .gitignore +4 -0
  2. CHANGELOG.md +4 -2
  3. CONTRIBUTING.md +8 -6
  4. README.md +9 -7
  5. books_cleaned.csv → data/legacy_root_exports/books_cleaned.csv +0 -0
  6. books_descriptions.csv → data/legacy_root_exports/books_descriptions.csv +0 -0
  7. books_descriptions.txt → data/legacy_root_exports/books_descriptions.txt +0 -0
  8. books_with_emotions.csv → data/legacy_root_exports/books_with_emotions.csv +0 -0
  9. docs/README.md +51 -19
  10. docs/{ARCHITECTURE_DIAGRAMS.md → architecture/ARCHITECTURE_DIAGRAMS.md} +1 -1
  11. docs/{ARCHITECTURE_IMPROVEMENTS.md → architecture/ARCHITECTURE_IMPROVEMENTS.md} +0 -0
  12. docs/{ROUTER_OPTIMIZATION.md → architecture/ROUTER_OPTIMIZATION.md} +0 -0
  13. docs/{TECHNICAL_REPORT.md → architecture/TECHNICAL_REPORT.md} +0 -0
  14. docs/archived/README.md +48 -0
  15. 代码评审报告.docx → docs/archived/assets/代码评审报告.docx +0 -0
  16. docs/{DEVELOPMENT.md → development/DEVELOPMENT.md} +0 -0
  17. docs/{build_guide.md → development/build_guide.md} +0 -0
  18. docs/{huggingface_deployment.md → development/huggingface_deployment.md} +0 -0
  19. docs/{LATENCY_OPTIMIZATION.md → performance/LATENCY_OPTIMIZATION.md} +0 -0
  20. docs/{TEST_COVERAGE.md → performance/TEST_COVERAGE.md} +0 -0
  21. docs/{memory_optimization.md → performance/memory_optimization.md} +0 -0
  22. docs/{performance_debugging_report.md → performance/performance_debugging_report.md} +0 -0
  23. docs/{interview_guide.md → presentation/interview_guide.md} +0 -0
  24. docs/{roadmap.md → presentation/roadmap.md} +0 -0
  25. server_final.log +0 -42
.gitignore CHANGED
@@ -47,6 +47,10 @@ Thumbs.db
  *.csv
  *.txt
  !requirements.txt
+ # keep historical root snapshots after cleanup
+ !data/legacy_root_exports/
+ !data/legacy_root_exports/*.csv
+ !data/legacy_root_exports/*.txt
  # data/books_processed.csv and others are now loaded at runtime from HF

  # Model files (downloaded at runtime from HF Hub)
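The negation rules added in this hunk rely on gitignore's "last matching pattern wins" behaviour. A minimal sketch of that behaviour follows, assuming a simplified matcher built on Python's `fnmatch`; `is_ignored` and `PATTERNS` are hypothetical names, not part of the repository. It does not model the real gitignore restriction that a file cannot be re-included when one of its parent directories is itself excluded, and `fnmatch`'s `*` also crosses `/`, which git's does not.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the hunk above, in file order.
PATTERNS = [
    "*.csv",
    "*.txt",
    "!requirements.txt",
    "!data/legacy_root_exports/",
    "!data/legacy_root_exports/*.csv",
    "!data/legacy_root_exports/*.txt",
]


def is_ignored(path: str, patterns: list[str] = PATTERNS) -> bool:
    """Simplified gitignore check: the last matching pattern decides."""
    ignored = False
    name = PurePosixPath(path).name
    for raw in patterns:
        negated = raw.startswith("!")
        pattern = raw[1:] if negated else raw
        if pattern.endswith("/"):
            # Directory-only patterns never match a file path directly.
            continue
        # Patterns without a slash match the basename at any depth;
        # patterns with a slash match the path relative to the repo root.
        target = path if "/" in pattern else name
        if fnmatchcase(target, pattern):
            ignored = not negated
    return ignored


if __name__ == "__main__":
    for candidate in (
        "books_cleaned.csv",
        "data/legacy_root_exports/books_cleaned.csv",
        "requirements.txt",
    ):
        print(candidate, "ignored" if is_ignored(candidate) else "tracked")
```

In a real checkout, `git check-ignore -v <path>` is the authoritative way to see which rule applies to a given file.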
CHANGELOG.md CHANGED
@@ -9,7 +9,9 @@ All notable changes to this project will be documented in this file.
  - **Version alignment**: Updated `src/main.py`, `web/package.json` to 2.6.0. Unified version references across README, docs, technical_report, experiment_archive, interview_guide, roadmap.
  - **Freeze notice**: Added to README, docs/README, experiment_archive, roadmap.

- ## [Unreleased]
+ ## [Post-freeze maintenance] - 2026-02-12
+
+ Post-freeze updates after v2.6.0 baseline lock. Scope: bug fixes, architecture cleanup, and documentation/engineering maintenance.

  ### Added - A/B Testing & RAG Diversity (2026-02-12)
  - **A/B testing framework** (`src/core/ab_experiments.py`): Minimal experiment assignment via `get_variant(user_id, experiment_id)`; `get_experiment_config()` returns control/treatment params. Enable with `AB_EXPERIMENTS_ENABLED=true`.
@@ -28,7 +30,7 @@ All notable changes to this project will be documented in this file.
  - **Fallback rules improved**: Replaced brittle `len(words) <= 2` with NL keyword detection (`ROUTER_NL_KEYWORDS`: like, similar, recommend, want, looking, ...). Short queries (≤6 words) without NL keywords → FAST; queries with NL keywords → DEEP.
  - **Config**: `natural_language_keywords` in router config; `ROUTER_NL_KEYWORDS` in `src/config.py`.

- ### Added - Latency Optimizations (LATENCY_OPTIMIZATION.md)
+ ### Added - Latency Optimizations (`docs/performance/LATENCY_OPTIMIZATION.md`)
  - **1. Trim the candidate set**: `RERANK_CANDIDATES_MAX=20` (env overridable); rerank top 20 instead of 50.
  - **2. ColBERT**: `RERANKER_BACKEND=colbert`; optional `llama-index-postprocessor-colbert-rerank`.
  - **3. Async rerank**: `fast=true` skips rerank (~150ms); `async_rerank=true` returns RRF first, reranks in background, next request gets cached reranked.
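The "Fallback rules improved" entry above describes the router fallback only in prose. The following is a hedged sketch of that rule, not the code in `src/config.py` or the router itself: `fallback_route`, `NL_KEYWORDS`, and `SHORT_QUERY_MAX_WORDS` are hypothetical names, the keyword set contains only the five words the changelog lists (the real list is elided with "..."), and the result for long queries without keywords is an assumption, since the changelog does not state it.

```python
# Keywords named in the changelog entry; the real ROUTER_NL_KEYWORDS
# list is longer and is not reproduced here.
NL_KEYWORDS = {"like", "similar", "recommend", "want", "looking"}
SHORT_QUERY_MAX_WORDS = 6  # "Short queries (<=6 words)" per the changelog


def fallback_route(query: str) -> str:
    """Return "FAST" or "DEEP" using the keyword-based fallback rule."""
    words = [word.strip(".,!?") for word in query.lower().split()]
    if any(word in NL_KEYWORDS for word in words):
        # Natural-language intent detected: use the deep path.
        return "DEEP"
    if len(words) <= SHORT_QUERY_MAX_WORDS:
        # Short query without NL keywords: use the fast path.
        return "FAST"
    # Assumption: long queries without keywords are not covered by the
    # changelog; this sketch sends them to the deep path.
    return "DEEP"


if __name__ == "__main__":
    for q in ("harry potter", "books similar to Dune"):
        print(q, "->", fallback_route(q))
```

Compared with the earlier `len(words) <= 2` check, a keyword test keeps short title or author lookups on the fast path while still routing short natural-language requests such as "recommend something" to the deep path.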
CONTRIBUTING.md CHANGED
@@ -23,7 +23,7 @@ python scripts/init_sqlite_db.py
  make run
  ```

- See [README](README.md) and the [Build Guide](docs/build_guide.md) for details.
+ See [README](README.md) and the [Build Guide](docs/development/build_guide.md) for details.

  ---

@@ -40,8 +40,10 @@ make run
  ├── config/ # routing and other configuration
  ├── data/ # data directory (not committed)
  └── docs/ # documentation
- ├── DEVELOPMENT.md # development guide (recall, routing)
- ├── TECHNICAL_REPORT.md # technical report
+ ├── architecture/ # technical report, architecture diagrams
+ ├── development/ # development and deployment guides
+ ├── performance/ # performance optimization and debugging
+ ├── presentation/ # interview material and evolution roadmap
  └── ...
  ```

@@ -73,8 +75,8 @@ make run

  | Scenario | Reference |
  |------|----------|
- | Add a new recall channel | [docs/DEVELOPMENT.md § 1](docs/DEVELOPMENT.md#一如何添加新的召回通道) |
- | Adjust routing rules / keywords | [docs/DEVELOPMENT.md § 2](docs/DEVELOPMENT.md#二如何调整路由规则) |
+ | Add a new recall channel | [docs/development/DEVELOPMENT.md § 1](docs/development/DEVELOPMENT.md#一如何添加新的召回通道) |
+ | Adjust routing rules / keywords | [docs/development/DEVELOPMENT.md § 2](docs/development/DEVELOPMENT.md#二如何调整路由规则) |
  | Fix a bug | Reproduce first, then fix with a minimal change |
  | Performance optimization | Compare before/after metrics and record them in CHANGELOG |

@@ -83,7 +85,7 @@ make run
  ## 5. Documentation and Changes

  - **CHANGELOG.md**: User-visible changes should be recorded here
- - **docs/DEVELOPMENT.md**: Update in sync when development extension or modification workflows change
+ - **docs/development/DEVELOPMENT.md**: Update in sync when development extension or modification workflows change
  - **README.md**: User-facing documentation; update on major feature changes

  ---
README.md CHANGED
@@ -7,7 +7,7 @@ app_port: 8000

  # Intelligent Book Recommendation System

- *Frozen at v2.6.0 — maintenance mode for portfolio use.*
+ *Frozen at v2.6.0 (model/features baseline) — maintenance mode for portfolio use. Post-freeze updates are limited to bug fixes, refactor, and documentation.*

  ## Problem

@@ -64,14 +64,16 @@ cd web && npm install && npm run dev # UI http://localhost:5173

  ## Documentation

+ Start from the internal documentation hub: [`docs/README.md`](docs/README.md).
+
  | Doc | Purpose |
  |:---|:---|
- | [Technical Report](docs/TECHNICAL_REPORT.md) | Architecture, design decisions |
- | [Development Guide](docs/DEVELOPMENT.md) | Add recall channels, adjust routing rules |
- | [Contributing](CONTRIBUTING.md) | Contributor guide |
- | [Experiment Archive](docs/experiments/experiment_archive.md) | Full experiment log (V1.0 → v2.6.0) |
- | [Interview Guide](docs/interview_guide.md) | Q&A, STAR cases |
- | [Build Guide](docs/build_guide.md) | Deployment instructions |
+ | [Docs Hub](docs/README.md) | Canonical documentation navigation |
+ | [Technical Report](docs/architecture/TECHNICAL_REPORT.md) | Architecture and design decisions |
+ | [Experiment Archive](docs/experiments/experiment_archive.md) | Consolidated experiment log (V1.0 → v2.6.0) |
+ | [Development Guide](docs/development/DEVELOPMENT.md) | Engineering playbook for extension/refactor |
+ | [Build Guide](docs/development/build_guide.md) | Build and deployment instructions |
+ | [Changelog](CHANGELOG.md) | Versioned change history |

  ## License

books_cleaned.csv → data/legacy_root_exports/books_cleaned.csv RENAMED
File without changes
books_descriptions.csv → data/legacy_root_exports/books_descriptions.csv RENAMED
File without changes
books_descriptions.txt → data/legacy_root_exports/books_descriptions.txt RENAMED
File without changes
books_with_emotions.csv → data/legacy_root_exports/books_with_emotions.csv RENAMED
File without changes
docs/README.md CHANGED
@@ -1,33 +1,65 @@
- # Project Documentation
+ # Project Documentation Hub

- ## Layer 1 Main Story (README, 5-min interview)
+ This file is the canonical navigation entry for project documentation.
+
+ ## Current Status
+
+ - **Baseline**: v2.6.0 is frozen for portfolio use.
+ - **Allowed updates**: bug fixes, engineering refactor, documentation cleanup.
+ - **Not in scope**: new model/features or new experiment tracks.
+
+ ## 1) Start Here (Most readers)
+
+ | Document | Audience | Purpose |
+ |:---|:---|:---|
+ | [../README.md](../README.md) | Recruiter / first-time visitor | 5-minute project overview and quick start |
+ | [architecture/TECHNICAL_REPORT.md](architecture/TECHNICAL_REPORT.md) | Engineer / interviewer | Full architecture and design decisions |
+ | [experiments/experiment_archive.md](experiments/experiment_archive.md) | Research / reviewer | Consolidated experiment timeline and conclusions |
+ | [../CHANGELOG.md](../CHANGELOG.md) | Maintainer | Versioned change history |
+
+ ## 2) Engineering Guides (Contributors)

  | Document | Purpose |
  |:---|:---|
- | [Technical Report](TECHNICAL_REPORT.md) | Architecture, design decisions, method line |
- | [Architecture Diagrams](ARCHITECTURE_DIAGRAMS.md) | Flowcharts, sequence diagrams, ER diagrams (Mermaid) |
- | [Experiment Archive](experiments/experiment_archive.md) | Consolidated experiment log (V1.0 → v2.6.0) |
+ | [development/DEVELOPMENT.md](development/DEVELOPMENT.md) | Extend recall/ranking/router and maintain pipeline |
+ | [development/build_guide.md](development/build_guide.md) | Local build and service startup pipeline |
+ | [development/huggingface_deployment.md](development/huggingface_deployment.md) | HF Spaces deployment notes |
+ | [performance/memory_optimization.md](performance/memory_optimization.md) | Zero-RAM SQLite and memory trade-offs |
+ | [performance/LATENCY_OPTIMIZATION.md](performance/LATENCY_OPTIMIZATION.md) | Latency tuning options and trade-offs |
+ | [performance/performance_debugging_report.md](performance/performance_debugging_report.md) | Root-cause analysis and debugging playbook |
+ | [../CONTRIBUTING.md](../CONTRIBUTING.md) | Contribution guidelines |

- ## Layer 2 — Capability Showcase (Resume, technical Q&A)
+ ## 3) Presentation Material

  | Document | Purpose |
  |:---|:---|
- | [Development Guide](DEVELOPMENT.md) | Add recall channels, adjust routing rules |
- | [Contributing](../CONTRIBUTING.md) | Contributor guide |
- | [Interview Guide](interview_guide.md) | Q&A, STAR cases |
- | [Memory Optimization](memory_optimization.md) | Zero-RAM SQLite, engineering decisions |
- | [Performance Debugging](performance_debugging_report.md) | Root cause analysis |
- | [Build Guide](build_guide.md) | Full build pipeline |
- | [Hugging Face Deployment](huggingface_deployment.md) | HF Spaces deployment |
+ | [presentation/interview_guide.md](presentation/interview_guide.md) | Interview Q&A and STAR examples |
+ | [presentation/roadmap.md](presentation/roadmap.md) | Technical evolution and vision gap |

- ## Archives
+ ## 4) Archives and Raw Reports

- | Path | Contents |
+ | Path | Purpose |
  |:---|:---|
- | [archived/](archived/) | Deprecated docs (Phase 2, TAGS, REVIEW_HIGHLIGHTS, etc.) |
- | [archived/graveyard/](archived/graveyard/) | Layer 3 tried but not in main story (future_roadmap, interview_deep_dive, etc.) |
- | [experiments/reports/](experiments/reports/) | Raw experiment reports (baseline, hybrid, rerank, router, temporal) |
+ | [archived/README.md](archived/README.md) | Archived document index with reason tags |
+ | [archived/](archived/) | Deprecated or superseded docs |
+ | [experiments/reports/](experiments/reports/) | Raw experiment reports |
+ | [../data/legacy_root_exports/](../data/legacy_root_exports/) | Historical root-level data snapshots moved out of repo root |
+
+ ## Documentation Rules
+
+ Each active document should include a short metadata block near the top:
+
+ - `Status`: `active` | `frozen` | `deprecated` | `archived`
+ - `Audience`: target reader
+ - `Last Updated`: `YYYY-MM-DD`
+ - `Owner`: maintainer name
+
+ When code behavior changes, update:
+
+ 1. `CHANGELOG.md`
+ 2. One topic document in `docs/`
+ 3. This file (`docs/README.md`) if navigation changed

  ---

- **Frozen v2.6.0** HR@10 = 0.4545, MRR@5 = 0.2893
+ **Frozen baseline metrics (v2.6.0):** HR@10 = 0.4545, MRR@5 = 0.2893
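The "Documentation Rules" section added to `docs/README.md` above lists four metadata fields but does not fix a concrete syntax for them. As an illustration only, the sketch below assumes each field appears as a plain `Key: value` line (optionally a list item) within the first 15 lines of a document; `check_metadata` and that layout are hypothetical, not something this commit defines.

```python
import re

REQUIRED_FIELDS = ("Status", "Audience", "Last Updated", "Owner")
ALLOWED_STATUS = {"active", "frozen", "deprecated", "archived"}
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")
# Assumed layout: "Key: value", optionally prefixed by "- " or "* ".
FIELD_RE = re.compile(r"^\s*(?:[-*]\s+)?([A-Za-z ]+):\s*(.+)$")


def check_metadata(text: str, head_lines: int = 15) -> list[str]:
    """Return a list of problems found in a document's metadata block."""
    found: dict[str, str] = {}
    for line in text.splitlines()[:head_lines]:
        match = FIELD_RE.match(line)
        if match and match.group(1) in REQUIRED_FIELDS:
            found[match.group(1)] = match.group(2).strip()

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in found]
    if "Status" in found and found["Status"] not in ALLOWED_STATUS:
        problems.append(f"invalid Status: {found['Status']}")
    if "Last Updated" in found and not DATE_RE.match(found["Last Updated"]):
        problems.append(f"invalid Last Updated: {found['Last Updated']}")
    return problems


if __name__ == "__main__":
    sample = (
        "# Guide\n\n"
        "- Status: active\n"
        "- Audience: contributors\n"
        "- Last Updated: 2026-02-12\n"
        "- Owner: ymlin105\n"
    )
    print(check_metadata(sample) or "metadata block looks complete")
```

A check like this could run over the active files under `docs/` in CI; archived documents would presumably be skipped, since the rule applies to active documents only.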
docs/{ARCHITECTURE_DIAGRAMS.md → architecture/ARCHITECTURE_DIAGRAMS.md} RENAMED
@@ -347,5 +347,5 @@ flowchart TB
  - **Command line**: use `mmdc` (mermaid-cli) to export PNG/SVG
  ```bash
  npm install -g @mermaid-js/mermaid-cli
- mmdc -i docs/ARCHITECTURE_DIAGRAMS.md -o docs/diagrams/
+ mmdc -i docs/architecture/ARCHITECTURE_DIAGRAMS.md -o docs/diagrams/
  ```
docs/{ARCHITECTURE_IMPROVEMENTS.md → architecture/ARCHITECTURE_IMPROVEMENTS.md} RENAMED
File without changes
docs/{ROUTER_OPTIMIZATION.md → architecture/ROUTER_OPTIMIZATION.md} RENAMED
File without changes
docs/{TECHNICAL_REPORT.md → architecture/TECHNICAL_REPORT.md} RENAMED
File without changes
docs/archived/README.md ADDED
@@ -0,0 +1,48 @@
+ # Archived Documentation Index
+
+ Use this index to find deprecated/superseded materials quickly.
+
+ ## Status Semantics
+
+ - `deprecated`: replaced by newer canonical docs
+ - `archived`: historical snapshot retained for traceability
+ - `reference-only`: useful background, not part of current main story
+
+ ## Archived at `docs/archived/`
+
+ | Document | Status | Reason |
+ |:---|:---|:---|
+ | [DEPLOYMENT.md](DEPLOYMENT.md) | deprecated | Replaced by [`../development/build_guide.md`](../development/build_guide.md) and [`../development/huggingface_deployment.md`](../development/huggingface_deployment.md) |
+ | [PHASE_2_DEVELOPMENT.md](PHASE_2_DEVELOPMENT.md) | archived | Phase-specific implementation log |
+ | [REVIEW_HIGHLIGHTS.md](REVIEW_HIGHLIGHTS.md) | archived | Feature-specific historical note |
+ | [TAGS_AND_EMOTIONS.md](TAGS_AND_EMOTIONS.md) | archived | Earlier tagging/emotion design notes |
+ | [interview_prep_v1.md](interview_prep_v1.md) | deprecated | Superseded by [`../presentation/interview_guide.md`](../presentation/interview_guide.md) |
+ | [story_and_strategy.md](story_and_strategy.md) | reference-only | Narrative framing draft, not canonical technical source |
+
+ ## Archived assets
+
+ | Path | Status | Reason |
+ |:---|:---|:---|
+ | [assets/](assets/) | archived | Non-source artifacts kept for historical record |
+
+ ## Archived at `docs/archived/graveyard/`
+
+ These files are exploratory drafts and intermediate analyses.
+
+ | Document | Status | Reason |
+ |:---|:---|:---|
+ | [DEPLOYMENT.md](graveyard/DEPLOYMENT.md) | archived | Early deployment draft |
+ | [business_logic.md](graveyard/business_logic.md) | archived | Early business-logic decomposition draft |
+ | [future_roadmap.md](graveyard/future_roadmap.md) | reference-only | Long-horizon ideas not in frozen scope |
+ | [interview_deep_dive.md](graveyard/interview_deep_dive.md) | reference-only | Expanded interview draft |
+ | [interview_prep.md](graveyard/interview_prep.md) | deprecated | Replaced by active interview guide |
+ | [phase7_plan.md](graveyard/phase7_plan.md) | archived | Phase plan snapshot after implementation |
+ | [project_analysis.md](graveyard/project_analysis.md) | archived | Intermediate analysis notes |
+ | [project_narrative.md](graveyard/project_narrative.md) | reference-only | Storytelling draft |
+ | [rag_architecture.md](graveyard/rag_architecture.md) | deprecated | Replaced by technical report and diagrams |
+ | [technical_architecture.md](graveyard/technical_architecture.md) | deprecated | Replaced by technical report |
+ | [technical_deep_dive_sota.md](graveyard/technical_deep_dive_sota.md) | reference-only | Background survey notes |
+
+ ---
+
+ If a file in this index should be restored to active status, move it out of `archived/` and add it to [`../README.md`](../README.md).
代码评审报告.docx → docs/archived/assets/代码评审报告.docx RENAMED
File without changes
docs/{DEVELOPMENT.md → development/DEVELOPMENT.md} RENAMED
File without changes
docs/{build_guide.md → development/build_guide.md} RENAMED
File without changes
docs/{huggingface_deployment.md → development/huggingface_deployment.md} RENAMED
File without changes
docs/{LATENCY_OPTIMIZATION.md → performance/LATENCY_OPTIMIZATION.md} RENAMED
File without changes
docs/{TEST_COVERAGE.md → performance/TEST_COVERAGE.md} RENAMED
File without changes
docs/{memory_optimization.md → performance/memory_optimization.md} RENAMED
File without changes
docs/{performance_debugging_report.md → performance/performance_debugging_report.md} RENAMED
File without changes
docs/{interview_guide.md → presentation/interview_guide.md} RENAMED
File without changes
docs/{roadmap.md → presentation/roadmap.md} RENAMED
File without changes
server_final.log DELETED
@@ -1,42 +0,0 @@
- uvicorn src.main:app --reload --port 6006
- INFO: Will watch for changes in these directories: ['/Users/ymlin/Downloads/003-Study/138-Projects/08-book-rec-with-LLMs']
- INFO: Uvicorn running on http://127.0.0.1:6006 (Press CTRL+C to quit)
- INFO: Started reloader process [9672] using WatchFiles
- 2026-01-10 23:12:04,451 - src.services.chat_service - INFO - ChatService: Loading books data for context retrieval...
- 2026-01-10 23:12:04,451 - src.etl - INFO - Loading processed data from /Users/ymlin/Downloads/003-Study/138-Projects/08-book-rec-with-LLMs/data/books_processed.csv
- INFO: Started server process [9726]
- INFO: Waiting for application startup.
- 2026-01-10 23:12:08,624 - src.main - INFO - Initializing Recommender Engine...
- 2026-01-10 23:12:08,624 - src.etl - INFO - Loading processed data from /Users/ymlin/Downloads/003-Study/138-Projects/08-book-rec-with-LLMs/data/books_processed.csv
- 2026-01-10 23:12:12,766 - src.recommender - INFO - Loaded books DataFrame with columns: ['isbn13', 'title', 'description', 'average_rating', 'authors', 'thumbnail', 'simple_categories', 'joy', 'sadness', 'fear', 'anger', 'surprise', 'tags', 'review_highlights', 'large_thumbnail']
- 2026-01-10 23:12:12,766 - src.vector_db - INFO - Loading embedding model: sentence-transformers/all-MiniLM-L6-v2
- /Users/ymlin/Downloads/003-Study/138-Projects/08-book-rec-with-LLMs/src/vector_db.py:36: LangChainDeprecationWarning: The class `HuggingFaceEmbeddings` was deprecated in LangChain 0.2.2 and will be removed in 1.0. An updated version of the class exists in the `langchain-huggingface package and should be used instead. To use it run `pip install -U `langchain-huggingface` and import as `from `langchain_huggingface import HuggingFaceEmbeddings``.
- self.embeddings = HuggingFaceEmbeddings(
- 2026-01-10 23:12:14,865 - src.vector_db - INFO - Embedding model loaded successfully
- 2026-01-10 23:12:14,865 - src.vector_db - INFO - Loading existing vector database from /Users/ymlin/Downloads/003-Study/138-Projects/08-book-rec-with-LLMs/data/chroma_db
- /Users/ymlin/Downloads/003-Study/138-Projects/08-book-rec-with-LLMs/src/vector_db.py:45: LangChainDeprecationWarning: The class `Chroma` was deprecated in LangChain 0.2.9 and will be removed in 1.0. An updated version of the class exists in the `langchain-chroma package and should be used instead. To use it run `pip install -U `langchain-chroma` and import as `from `langchain_chroma import Chroma``.
- self.db = Chroma(
- 2026-01-10 23:12:17,822 - src.vector_db - INFO - Loaded 222003 documents from vector database
- 2026-01-10 23:12:17,824 - src.vector_db - INFO - Initializing BM25 Index (Sparse Retrieval)...
- 2026-01-10 23:12:35,315 - src.vector_db - INFO - BM25 Index built with 222005 documents.
- 2026-01-10 23:12:41,192 - src.vector_db - INFO - Loaded 186954 publication dates for Temporal Scoring.
- Redis cache initialization failed: Error 61 connecting to localhost:6379. Connection refused.. Caching disabled.
- 2026-01-10 23:12:44,213 - src.recommender - INFO - Loaded 212398 book images from books_basic_info.csv
- 2026-01-10 23:12:44,213 - src.main - INFO - Initializing Personalized Rec Service...
- Failed to load YoutubeDNN: [Errno 2] No such file or directory: 'data/model/recall/youtube_dnn.pt'
- Failed to load YoutubeDNN: [Errno 2] No such file or directory: 'data/model/recall/youtube_dnn.pt'
- 2026-01-10 23:13:30,405 - src.main - INFO - Engines Initialized.
- INFO: Application startup complete.
- INFO: 127.0.0.1:58450 - "GET /api/recommend/personal?user_id=demo&top_k=1 HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58469 - "GET /favorites/list/local HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58470 - "GET /user/local/stats HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58471 - "GET /api/recommend/personal?user_id=local&limit=20 HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58469 - "GET /user/local/stats HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58472 - "GET /favorites/list/local HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58470 - "GET /api/recommend/personal?user_id=local&limit=20 HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58470 - "OPTIONS /marketing/highlights HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58470 - "POST /marketing/highlights HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58470 - "POST /marketing/highlights HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58470 - "POST /marketing/highlights HTTP/1.1" 200 OK
- INFO: 127.0.0.1:58470 - "POST /marketing/highlights HTTP/1.1" 200 OK
- make: *** [run] Hangup: 1