ymlin105 committed
Commit 52a0642 · 1 Parent(s): 65b86c6

feat: enhance recommendation system with improved routing, latency optimizations, and onboarding features

CHANGELOG.md CHANGED
@@ -11,7 +11,25 @@ All notable changes to this project will be documented in this file.
 
 ## [Unreleased]
 
- *No changes; project frozen at v2.6.0*
+ ### Fixed - Router heuristic fragility (intent_classifier)
+ - **Model-based routing**: Trained `intent_classifier.pkl` (TF-IDF + LogisticRegression) on book-title examples; the router now uses the model when it is available.
+ - **SEED_DATA extended**: Added `War and Peace`, `The Lord of the Rings`, `Harry Potter`, `1984`, etc. (fast) and `books like War and Peace`, `similar to The Lord of the Rings` (deep) so the model can distinguish book titles from recommendation-style queries.
+ - **Fallback rules improved**: Replaced the brittle `len(words) <= 2` check with NL keyword detection (`ROUTER_NL_KEYWORDS`: like, similar, recommend, want, looking, ...). Short queries (≤6 words) without NL keywords → FAST; queries with NL keywords → DEEP.
+ - **Config**: `natural_language_keywords` in the router config; `ROUTER_NL_KEYWORDS` in `src/config.py`.
+ 
+ ### Added - Latency Optimizations (LATENCY_OPTIMIZATION.md)
+ - **1. Trimmed candidate set**: `RERANK_CANDIDATES_MAX=20` (env-overridable); rerank the top 20 candidates instead of 50.
+ - **2. ColBERT**: `RERANKER_BACKEND=colbert`; optional `llama-index-postprocessor-colbert-rerank`.
+ - **3. Async rerank**: `fast=true` skips rerank (~150ms); `async_rerank=true` returns RRF results first and reranks in the background, so the next request gets the cached reranked list.
+ - **4. ONNX quantization**: `RERANKER_BACKEND=onnx` (default); `onnxruntime` for a ~2x CrossEncoder speedup.
+ - API: `POST /recommend` accepts `fast` and `async_rerank`; `web/src/api.js` updated.
+ 
+ ### Added - Cold-Start Optimizations (P0–P2)
+ - **P0**: Popularity fallback — enabled the Popularity channel by default; `RecallFusion` and `RecommendationService` fall back to popular books when all recall channels return empty.
+ - **P0**: `recent_isbns` API param — `/api/recommend/personal` accepts comma-separated ISBNs from the current session; they are injected into SASRec so cold-start converges after a single click.
+ - **P1**: Frontend passes `recent_isbns` — session-level tracking of viewed books, passed to the personalized API on Start Discovery.
+ - **P2**: Onboarding flow — `OnboardingModal` shown to new users (no collection); pick 3–5 books from a popular list to seed preferences; `GET /api/onboarding/books`.
+ - **P2**: Zero-shot intent probing — `src/core/intent_prober.py` uses an LLM to infer categories/emotions/keywords from the user query; `GET /api/intent/probe`; the `intent_query` param on the personal API seeds SASRec via semantic search when the user has no history.
 
 ### Added - 2026-01-29 (Frontend Refactor: React Router SPA)
 - **React Router SPA**: Refactored the monolithic 960-line `App.jsx` into a React Router architecture with 3 route pages and 5 reusable components.
benchmarks/benchmark.py CHANGED
@@ -4,15 +4,23 @@ Performance Benchmark Script for Book Recommender System
 This script measures:
 1. Vector search latency
 2. End-to-end recommendation latency
- 3. Throughput (queries per second)
+ 3. Throughput (queries per second, sequential)
+ 4. Concurrent throughput (QPS under N parallel workers)
 
 Usage:
     python benchmarks/benchmark.py
+     python benchmarks/benchmark.py --concurrent 5   # 5 concurrent workers
+ 
+ Note: For HTTP-level load testing (simulating real users), use Locust:
+     pip install locust
+     locust -f benchmarks/locustfile.py --host=http://localhost:8000
 """
 
+ import argparse
 import sys
 import time
 import statistics
+ from concurrent.futures import ThreadPoolExecutor, as_completed
 from pathlib import Path
 
 # Add project root to path
@@ -82,11 +90,11 @@ def benchmark_full_recommendation(recommender: BookRecommender, n_runs: int = 30):
 
 
 def benchmark_throughput(recommender: BookRecommender, duration_sec: int = 10) -> dict:
-     """Measure queries per second over a time window."""
+     """Measure queries per second over a time window (sequential)."""
     query_count = 0
     start = time.perf_counter()
     query_idx = 0
 
     while (time.perf_counter() - start) < duration_sec:
         recommender.get_recommendations_sync(
             TEST_QUERIES[query_idx % len(TEST_QUERIES)],
@@ -95,17 +103,63 @@ def benchmark_throughput(recommender: BookRecommender, duration_sec: int = 10) -> dict:
         )
         query_count += 1
         query_idx += 1
 
     elapsed = time.perf_counter() - start
 
     return {
-         "operation": "Throughput Test",
+         "operation": "Throughput Test (sequential)",
         "duration_sec": round(elapsed, 2),
         "total_queries": query_count,
         "qps": round(query_count / elapsed, 2),
     }
 
 
+ def _run_one_query(recommender: BookRecommender, query: str) -> tuple[float, int]:
+     """Run a single recommendation and return (latency_ms, 1)."""
+     start = time.perf_counter()
+     recommender.get_recommendations_sync(query, category="All", tone="All")
+     return (time.perf_counter() - start) * 1000, 1
+ 
+ 
+ def benchmark_concurrent(
+     recommender: BookRecommender,
+     n_workers: int = 5,
+     total_queries: int = 50,
+ ) -> dict:
+     """
+     Measure throughput under concurrent load using ThreadPoolExecutor.
+ 
+     Simulates N parallel clients to expose:
+     - VectorDB connection/query limits under load
+     - GIL contention if CPU-bound (embedding, rerank)
+     - I/O blocking in ChromaDB / LLM calls
+     """
+     queries = [TEST_QUERIES[i % len(TEST_QUERIES)] for i in range(total_queries)]
+     latencies: list[float] = []
+     start = time.perf_counter()
+ 
+     with ThreadPoolExecutor(max_workers=n_workers) as executor:
+         futures = [
+             executor.submit(_run_one_query, recommender, q) for q in queries
+         ]
+         for future in as_completed(futures):
+             lat_ms, _ = future.result()
+             latencies.append(lat_ms)
+ 
+     wall_sec = time.perf_counter() - start
+ 
+     return {
+         "operation": f"Concurrent Throughput ({n_workers} workers)",
+         "workers": n_workers,
+         "total_queries": total_queries,
+         "wall_sec": round(wall_sec, 2),
+         "qps": round(total_queries / wall_sec, 2),
+         "mean_latency_ms": round(statistics.mean(latencies), 2),
+         "median_latency_ms": round(statistics.median(latencies), 2),
+         "p95_latency_ms": round(sorted(latencies)[int(len(latencies) * 0.95)], 2),
+     }
+ 
+ 
 def print_results(results: list[dict]):
     """Print benchmark results in a formatted table."""
     print("\n" + "=" * 70)
@@ -146,37 +200,65 @@ def save_results(results: list[dict], filepath: str = "benchmarks/results.md"):
     f.write("## Interpretation\n\n")
     f.write("- **Vector Search**: Time to query ChromaDB and retrieve top-k results\n")
     f.write("- **Full Recommendation**: End-to-end latency including filtering and formatting\n")
-     f.write("- **Throughput**: Sustained queries per second under load\n")
+     f.write("- **Throughput (sequential)**: Sustained QPS when processing one query at a time\n")
+     f.write("- **Concurrent Throughput**: QPS under N parallel workers; exposes GIL/IO bottlenecks\n")
 
     print(f"\n✅ Results saved to {filepath}")
 
 
 def main():
+     parser = argparse.ArgumentParser(description="Benchmark Book Recommender System")
+     parser.add_argument(
+         "--concurrent",
+         type=int,
+         default=0,
+         metavar="N",
+         help="Add concurrent benchmark with N workers (e.g. 5). 0 = skip.",
+     )
+     parser.add_argument(
+         "--concurrent-queries",
+         type=int,
+         default=50,
+         help="Total queries for concurrent benchmark (default: 50)",
+     )
+     args = parser.parse_args()
+ 
     print("🚀 Initializing Book Recommender System...")
     print("   (This may take a moment to load models and vector database)")
 
     try:
         recommender = BookRecommender()
     except Exception as e:
         print(f"❌ Failed to initialize: {e}")
         return
 
     print("✅ System initialized. Starting benchmarks...\n")
 
     results = []
 
     # Benchmark 1: Vector Search
     print("📊 Running Vector Search benchmark...")
     results.append(benchmark_vector_search(recommender.vector_db))
 
     # Benchmark 2: Full Recommendation
     print("📊 Running Full Recommendation benchmark...")
     results.append(benchmark_full_recommendation(recommender))
 
-     # Benchmark 3: Throughput
-     print("📊 Running Throughput benchmark (10 seconds)...")
+     # Benchmark 3: Sequential Throughput
+     print("📊 Running Sequential Throughput benchmark (10 seconds)...")
     results.append(benchmark_throughput(recommender))
 
+     # Benchmark 4: Concurrent Throughput (optional)
+     if args.concurrent > 0:
+         print(f"📊 Running Concurrent Throughput ({args.concurrent} workers, {args.concurrent_queries} queries)...")
+         results.append(
+             benchmark_concurrent(
+                 recommender,
+                 n_workers=args.concurrent,
+                 total_queries=args.concurrent_queries,
+             )
+         )
+ 
     # Print and save results
     print_results(results)
     save_results(results)
benchmarks/locustfile.py ADDED
@@ -0,0 +1,48 @@
+ """
+ Locust load test for Book Recommender API.
+ 
+ Simulates concurrent HTTP requests to measure real-world throughput.
+ Run the API server first, then:
+ 
+     pip install locust
+     locust -f benchmarks/locustfile.py --host=http://localhost:8000
+ 
+ Then open http://localhost:8089 to drive the load test.
+ """
+ 
+ import random
+ from locust import HttpUser, task, between
+ 
+ # Mirror TEST_QUERIES from benchmark.py for consistency
+ TEST_QUERIES = [
+     "a romantic comedy set in New York",
+     "a philosophical novel about the meaning of life",
+     "a fast-paced thriller with plot twists",
+     "a coming-of-age story about friendship and loss",
+     "a historical fiction set during World War II",
+     "a science fiction story about space exploration",
+     "a mystery novel with an unreliable narrator",
+     "a fantasy epic with dragons and magic",
+     "a memoir about overcoming adversity",
+     "a literary fiction exploring family dynamics",
+ ]
+ 
+ 
+ class RecommenderUser(HttpUser):
+     """Simulates a user hitting the recommendation API."""
+ 
+     wait_time = between(0.5, 2.0)  # 0.5–2s between requests
+ 
+     @task(10)
+     def recommend(self):
+         """Primary: POST /recommend."""
+         q = random.choice(TEST_QUERIES)
+         self.client.post(
+             "/recommend",
+             json={"query": q, "category": "All"},
+         )
+ 
+     @task(1)
+     def health(self):
+         """Occasional health check."""
+         self.client.get("/health")
benchmarks/results.md ADDED
@@ -0,0 +1,61 @@
+ # Performance Benchmark Results
+ 
+ **Date**: 2026-02-12 01:02:27
+ 
+ ## System Info
+ - Dataset: 5,000+ books
+ - Embedding Model: all-MiniLM-L6-v2 (384 dim)
+ - Vector DB: ChromaDB with HNSW index
+ 
+ ## Results
+ 
+ ### Vector Search (k=50)
+ 
+ | Metric | Value |
+ |--------|-------|
+ | runs | 50 |
+ | mean_ms | 11.49 |
+ | median_ms | 6.43 |
+ | std_ms | 27.41 |
+ | min_ms | 5.49 |
+ | max_ms | 200.46 |
+ | p95_ms | 15.51 |
+ 
+ ### Full Recommendation
+ 
+ | Metric | Value |
+ |--------|-------|
+ | runs | 30 |
+ | mean_ms | 3876.27 |
+ | median_ms | 260.87 |
+ | std_ms | 5445.93 |
+ | min_ms | 14.54 |
+ | max_ms | 16609.18 |
+ | p95_ms | 11694.41 |
+ 
+ ### Throughput Test (sequential)
+ 
+ | Metric | Value |
+ |--------|-------|
+ | duration_sec | 10.1 |
+ | total_queries | 89 |
+ | qps | 8.81 |
+ 
+ ### Concurrent Throughput (3 workers)
+ 
+ | Metric | Value |
+ |--------|-------|
+ | workers | 3 |
+ | total_queries | 12 |
+ | wall_sec | 1.29 |
+ | qps | 9.28 |
+ | mean_latency_ms | 298.3 |
+ | median_latency_ms | 370.19 |
+ | p95_latency_ms | 579.95 |
+ 
+ ## Interpretation
+ 
+ - **Vector Search**: Time to query ChromaDB and retrieve top-k results
+ - **Full Recommendation**: End-to-end latency including filtering and formatting
+ - **Throughput (sequential)**: Sustained QPS when processing one query at a time
+ - **Concurrent Throughput**: QPS under N parallel workers; exposes GIL/IO bottlenecks
benchmarks/test_concurrent_benchmark.py ADDED
@@ -0,0 +1,73 @@
+ """
+ Quick test for concurrent benchmark logic without loading the full recommender.
+ Run: python benchmarks/test_concurrent_benchmark.py
+ """
+ 
+ import sys
+ import time
+ import statistics
+ from concurrent.futures import ThreadPoolExecutor, as_completed
+ from pathlib import Path
+ 
+ sys.path.insert(0, str(Path(__file__).parent.parent))
+ 
+ 
+ # Mock recommender that simulates ~100ms latency
+ class MockRecommender:
+     def get_recommendations_sync(self, query: str, category: str = "All", tone: str = "All"):
+         time.sleep(0.1)
+         return [{"title": "Mock Book", "isbn": "123"}]
+ 
+ 
+ TEST_QUERIES = ["query A", "query B", "query C"]
+ 
+ 
+ def _run_one_query(recommender, query: str) -> tuple[float, int]:
+     start = time.perf_counter()
+     recommender.get_recommendations_sync(query, category="All", tone="All")
+     return (time.perf_counter() - start) * 1000, 1
+ 
+ 
+ def benchmark_concurrent(recommender, n_workers: int = 5, total_queries: int = 15) -> dict:
+     queries = [TEST_QUERIES[i % len(TEST_QUERIES)] for i in range(total_queries)]
+     latencies = []
+     start = time.perf_counter()
+ 
+     with ThreadPoolExecutor(max_workers=n_workers) as executor:
+         futures = [executor.submit(_run_one_query, recommender, q) for q in queries]
+         for future in as_completed(futures):
+             lat_ms, _ = future.result()
+             latencies.append(lat_ms)
+ 
+     wall_sec = time.perf_counter() - start
+ 
+     return {
+         "operation": f"Concurrent ({n_workers} workers)",
+         "workers": n_workers,
+         "total_queries": total_queries,
+         "wall_sec": round(wall_sec, 2),
+         "qps": round(total_queries / wall_sec, 2),
+         "mean_latency_ms": round(statistics.mean(latencies), 2),
+     }
+ 
+ 
+ def main():
+     mock = MockRecommender()
+ 
+     # Sequential: 15 * 100ms = ~1.5s
+     print("Sequential (1 worker):")
+     r1 = benchmark_concurrent(mock, n_workers=1, total_queries=15)
+     print(f"  wall_sec={r1['wall_sec']}, qps={r1['qps']}, mean_ms={r1['mean_latency_ms']}")
+ 
+     # Concurrent: 15 queries with 5 workers -> ~3 batches of 5 -> ~300ms
+     print("\nConcurrent (5 workers):")
+     r5 = benchmark_concurrent(mock, n_workers=5, total_queries=15)
+     print(f"  wall_sec={r5['wall_sec']}, qps={r5['qps']}, mean_ms={r5['mean_latency_ms']}")
+ 
+     # Concurrency should give ~5x speedup
+     speedup = r1["wall_sec"] / r5["wall_sec"]
+     print(f"\nSpeedup: {speedup:.1f}x (expected ~5x for 5 workers)")
+     assert r5["qps"] > r1["qps"], "Concurrent QPS should exceed sequential"
+     print("OK: Concurrent benchmark logic works correctly.")
+ 
+ 
+ if __name__ == "__main__":
+     main()
docs/LATENCY_OPTIMIZATION.md ADDED
@@ -0,0 +1,130 @@
+ # Latency Optimization: Full Recommendation Pipeline
+ 
+ ## Current State
+ 
+ | Metric | Value | Target (Spotify-style) |
+ |--------|-------|------------------------|
+ | P95 Full Recommendation | ~1250ms | < 100ms |
+ | Mean | ~700–900ms | - |
+ 
+ **Interviewer's note**: At Spotify, a recommendation endpoint is typically expected to return within 100ms. 1.2s is a perceptible stall for the user.
+ 
+ ---
+ 
+ ## Latency Breakdown
+ 
+ Approximate warm-query breakdown (from `benchmarks/benchmark.py` + `docs/experiments/reports/rerank_report.md`):
+ 
+ | Stage | Location | Latency | Notes |
+ |-------|----------|---------|-------|
+ | Router | `src/core/router.py` | ~1ms | Rule-based, fast |
+ | Sparse (FTS5) | `vector_db._sparse_fts_search` | ~20–50ms | SQLite MATCH |
+ | Dense (Chroma) | `vector_db.search` | ~50–100ms | HNSW + MiniLM |
+ | RRF Fusion | `vector_db.hybrid_search` | ~5ms | In-memory |
+ | **Cross-Encoder Rerank** | `src/core/reranker.py` | **~400–900ms** | **Main bottleneck** |
+ | Metadata Enrichment | `enrich_and_format` | ~50–100ms | SQLite lookups |
+ 
+ **Rerank details**:
+ - Model: `cross-encoder/ms-marco-MiniLM-L-6-v2`
+ - Candidates: `max(k*4, 20)` = 50 (when `k=10`)
+ - Each (query, doc) pair requires a full forward pass
+ - 50 pairs × ~15–20ms/pair ≈ 750–1000ms
+ 
+ ---
+ 
+ ## Root Causes
+ 
+ 1. **The Cross-Encoder is too heavy**
+    - Every (query, doc) pair runs full attention; doc vectors cannot be precomputed the way a Bi-Encoder allows
+    - 50 candidates make sequential inference slow
+ 
+ 2. **Every benchmark query triggers rerank**
+    - `TEST_QUERIES` are all natural language (e.g. "a romantic comedy set in New York")
+    - Router rule: `len(words) > 2` with no detail keywords → **DEEP** → `rerank=True`
+    - So every benchmark run exercises the Cross-Encoder
+ 
+ 3. **LangGraph agentic mode is slower still**
+    - Router → Retrieve → Evaluate (LLM call) → optional Web Fallback
+    - Executed sequentially, with no parallelism
+ 
+ ---
+ 
+ ## Optimization Options
+ 
+ ### 1. Trim the candidate set (quick win)
+ 
+ **Current**: `rerank_candidates = top_candidates[:max(k*4, 20)]` → 50 candidates
+ 
+ **Proposal**: cut to 20, or make it configurable
+ 
+ ```python
+ # config.py
+ RERANK_CANDIDATES_MAX = 20  # down from 50; expected to roughly halve rerank latency
+ ```
+ 
+ **Trade-off**: if a truly relevant book falls outside the top 20, recall dips slightly; 20 is usually enough coverage.
+ 
+ ---
+ 
+ ### 2. ColBERT (late interaction) instead of the Cross-Encoder
+ 
+ **Principle**: ColBERT encodes query and doc separately, then scores with token-level MaxSim; doc vectors can be precomputed and cached.
+ 
+ | Approach | Inference | Precompute | Typical latency |
+ |----------|-----------|------------|-----------------|
+ | Cross-Encoder | Full forward per (q, d) pair | No | ~15–20ms/pair |
+ | ColBERT | Encode q once + dot products with doc vectors | Yes (docs cacheable) | ~2–5ms/doc |
+ 
+ **Implementation notes**:
+ - Use `colbert-ai/colbertv2` or a similar library
+ - Precompute token embeddings of book descriptions and store them in the vector store
+ - Online, only encode the query and run MaxSim against candidate doc vectors
+ 
+ **Trade-off**: needs extra index building and dependencies; quality may match or slightly trail the Cross-Encoder.
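+ 
+ A minimal sketch of the MaxSim scoring described above (illustrative only; assumes L2-normalized token-embedding matrices as `numpy` arrays, and `doc_tok_cache` is a hypothetical isbn → matrix cache):
+ 
+ ```python
+ import numpy as np
+ 
+ def maxsim_score(q_tok: np.ndarray, d_tok: np.ndarray) -> float:
+     """ColBERT late interaction: for each query token, take its best-matching
+     doc-token similarity, then sum over query tokens."""
+     sim = q_tok @ d_tok.T                # (n_q, n_d) token-level similarities
+     return float(sim.max(axis=1).sum())  # MaxSim per query token, then sum
+ 
+ # Online: encode the query once, then score each candidate's precomputed
+ # (cached) doc token matrix:
+ # scores = {isbn: maxsim_score(q_tok, doc_tok_cache[isbn]) for isbn in candidates}
+ ```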
+ 
+ ---
+ 
+ ### 3. Asynchronous rerank
+ 
+ **Idea**: return the hybrid RRF top-K immediately, rerank asynchronously in the background, and deliver the refined results via WebSocket/polling or on the next request.
+ 
+ ```
+ User request → return RRF top-10 immediately (~150ms) → background rerank → push refined results (optional)
+ ```
+ 
+ **Trade-off**: more complex to implement; needs API and frontend changes; first-screen result quality is slightly lower.
+ 
+ ---
+ 
+ ### 4. ONNX quantization (already planned)
+ 
+ `rerank_report.md` already notes that an ONNX build of the Cross-Encoder yields roughly a 2x speedup.
+ 
+ ---
+ 
+ ### 5. Dynamic rerank policy (partially implemented)
+ 
+ The router already disables rerank for ISBN/keyword queries; this can be tightened further:
+ - Enable rerank only when the query exceeds a length threshold and is not pure keywords
+ - Or add a low-latency mode: let the user choose "fast" vs "precise"
+ 
+ ---
+ 
+ ## Implementation Status (v2.7+)
+ 
+ | Optimization | Status | Notes |
+ |--------------|--------|-------|
+ | 1. Trim candidate set | ✅ | `RERANK_CANDIDATES_MAX=20` (config), env-overridable |
+ | 2. ColBERT | ✅ | `RERANKER_BACKEND=colbert`, needs `pip install llama-index-postprocessor-colbert-rerank` |
+ | 3. Async rerank | ✅ | `fast=true` skips rerank; `async_rerank=true` returns RRF first, reranks in the background and caches |
+ | 4. ONNX quantization | ✅ | `RERANKER_BACKEND=onnx` (default), needs `onnxruntime` |
+ 
+ ### API usage
+ 
+ ```bash
+ # Fast mode (~150ms)
+ curl -X POST /recommend -d '{"query":"romantic comedy","fast":true}'
+ 
+ # Async rerank: returns RRF now; the next identical query returns the cached reranked list
+ curl -X POST /recommend -d '{"query":"romantic comedy","async_rerank":true}'
+ ```
docs/interview_guide.md CHANGED
@@ -98,7 +98,37 @@
 
 ## 🔬 Advanced Technical Q&A
 
- ### Q5. Negative Sampling
+ ### Q5. ChromaDB/SQLite Memory and Scalability: Scaling to Tens of Millions of Items
+ 
+ **Question**: You chose ChromaDB (embedded) and SQLite. That is fine for a demo, but it will not work for a catalog of tens of millions of items (Spotify scale). **How would you migrate to Milvus/Qdrant? How would you shard the ANN index (HNSW)?**
+ 
+ **What it tests**: understanding of vector-database scalability and distributed ANN.
+ 
+ **Suggested answer**:
+ 
+ > The current architecture (ChromaDB + SQLite) suits ~200K items and demos. At tens of millions of items, the bottlenecks are:
+ >
+ > **ChromaDB**: embedded, single-machine, index loaded into memory. 10M × 384 dims × 4B ≈ 15GB of vectors, and the HNSW graph structure may inflate that by another 10–50x; a single machine's memory and CPU cannot sustain it.
+ >
+ > **SQLite**: single file, single write lock; disk I/O becomes the bottleneck.
+ >
+ > **Migration strategy**:
+ >
+ > 1. **Abstract a VectorStore interface**: extract a `VectorStoreInterface` in `vector_db.py` with `ChromaVectorStore`, `QdrantVectorStore`, and `MilvusVectorStore` implementations, switchable via config, so migration is low-friction.
+ > 2. **Selection**: Milvus suits large datasets, analytics + retrieval, and is natively distributed; Qdrant is lighter and focused on pure vector search. Either works at this scale.
+ > 3. **Migration steps**: export (id, embedding, metadata) from Chroma → create a collection in Milvus/Qdrant and configure HNSW parameters → bulk upsert → flip the config switch.
+ >
+ > **HNSW sharding**:
+ >
+ > - **Hash-shard by ID**: distribute with `hash(id) % N` across N shards, each with its own HNSW index. Fan the query out to all N shards in parallel, take top_k from each, then merge into the final top_k (see the sketch after this answer).
+ > - **Cluster-shard by embedding**: K-Means clustering; a query first locates its cluster and searches only a few shards (narrows the search, but cold start and data skew must be handled).
+ > - **Use Milvus/Qdrant built-ins**: both support distributed sharding; use their sharding configuration rather than building your own.
+ >
+ > **Tie-in with Q4**: rework the metadata_store SQLite per the Q4 plan (Redis + PostgreSQL/Cassandra); FTS5 sparse retrieval can migrate to Elasticsearch/Meilisearch for hybrid search.
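+ 
+ A minimal sketch of the hash-shard scatter-gather merge mentioned above (illustrative only; `shards` is a hypothetical list of per-shard index clients whose `search(vec, k)` returns `(item_id, distance)` pairs):
+ 
+ ```python
+ import heapq
+ from concurrent.futures import ThreadPoolExecutor
+ 
+ def scatter_gather_search(shards, query_vec, top_k=10):
+     """Query every shard in parallel, then merge the per-shard top-k lists."""
+     with ThreadPoolExecutor(max_workers=len(shards)) as pool:
+         partials = list(pool.map(lambda s: s.search(query_vec, top_k), shards))
+     # Each partial is a list of (item_id, distance); smaller distance = closer.
+     hits = (hit for part in partials for hit in part)
+     return heapq.nsmallest(top_k, hits, key=lambda hit: hit[1])
+ ```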
+ 
+ ---
+ 
+ ### Q6. Negative Sampling
 
 **Question**: Your TECHNICAL_REPORT uses "Hard negative sampling from recall results". Could this introduce a **false negative** problem (items the user actually likes but never clicked get treated as negatives)? When training DIN or LGBMRanker, how did you balance the ratio of random negatives to hard negatives, and what effect does that have on model convergence?
 
@@ -114,7 +144,7 @@
 
 ---
 
- ### Q6. Real-time / Near-line
+ ### Q7. Real-time / Near-line
 
 **Question**: SASRec is trained mostly offline. In a Spotify-like scenario, if a user has just listened to three "Heavy Metal" tracks in a row, we want the next recommendation to track that interest shift immediately. In the current architecture, how would you inject the user's **real-time interaction sequence** (not yet persisted to CSV) into SASRec or DIN inference? What logic needs to be added to `RecommendationService`?
 
@@ -135,7 +165,7 @@
 
 ---
 
- ### Q7. Evaluation Metrics: Diversity and Serendipity
+ ### Q8. Evaluation Metrics: Diversity and Serendipity
 
 **Question**: The current focus is HR@10 and NDCG. As a content platform, you notice the recommendation list is dominated by bestsellers (the Harry Potter effect). If asked to improve **diversity** and **serendipity** without significantly hurting accuracy, how would you change the objective function or logic at the ranking or rerank stage?
 
@@ -162,7 +192,7 @@
 
 ## 📋 Known Limitations & Improvements
 
- ### Q6. "Research"-style code leftovers
+ ### Q9. "Research"-style code leftovers
 
 **Observation**: As the codebase evolves toward production, some research-prototype traces remain.
requirements.txt CHANGED
@@ -24,6 +24,7 @@ langchain-openai
 transformers>=4.40.0
 torch
 sentence-transformers
+ onnxruntime>=1.16.0  # For CrossEncoder backend=onnx (~2x faster)
 gensim>=4.3.0
 lightgbm
 xgboost>=2.0.0
@@ -43,6 +44,9 @@ requests
 # Intent classifier backends (optional)
 # fasttext  # Uncomment for FastText backend: pip install fasttext
 
+ # Latency: ColBERT reranker (optional, RERANKER_BACKEND=colbert)
+ # llama-index-postprocessor-colbert-rerank
+ 
 # LLM Agent & Fine-tuning
 faiss-cpu
 diffusers
scripts/model/train_intent_router.py CHANGED
@@ -67,6 +67,18 @@ SEED_DATA = [
     ("music", "fast"),
     ("art", "fast"),
     ("philosophy", "fast"),
+     # fast: book titles (keyword-like, BM25 works well)
+     ("War and Peace", "fast"),
+     ("The Lord of the Rings", "fast"),
+     ("Harry Potter", "fast"),
+     ("1984", "fast"),
+     ("To Kill a Mockingbird", "fast"),
+     ("The Great Gatsby", "fast"),
+     ("Pride and Prejudice", "fast"),
+     ("Dune", "fast"),
+     ("Sapiens", "fast"),
+     ("Atomic Habits", "fast"),
+     ("Deep Work", "fast"),
     # deep: natural language, complex queries
     ("What are the best books about artificial intelligence for beginners", "deep"),
     ("I'm looking for something similar to Harry Potter", "deep"),
@@ -87,6 +99,12 @@ SEED_DATA = [
     ("Recommend me novels with strong female protagonists", "deep"),
     ("What to read to understand economics", "deep"),
     ("Books on meditation and mindfulness", "deep"),
+     # deep: natural language with book references (need context, not just keywords)
+     ("books like War and Peace", "deep"),
+     ("similar to The Lord of the Rings", "deep"),
+     ("recommend something like Harry Potter", "deep"),
+     ("what to read after 1984", "deep"),
+     ("books similar to Sapiens", "deep"),
 ]
 
 
@@ -144,8 +162,8 @@ def main():
         pred = result.predict(sample)[0][0].replace("__label__", "")
     elif args.backend == "distilbert":
         from transformers import pipeline
-         pipe = pipeline("zero-shot-classification", model="distilbert-base-uncased", device=-1)
-         pred = pipe(sample, INTENTS, multi_label=False)["labels"][0]
+         pipe = pipeline("zero-shot-classification", model="distilbert-base-uncased", device=-1)
+         pred = pipe(sample, INTENTS, multi_label=False)["labels"][0]
     else:
         pred = result.predict([sample])[0]
     ok = "✓" if pred == intent else "✗"
src/config.py CHANGED
@@ -32,6 +32,12 @@ CACHE_TTL = 3600  # 1 hour
 TOP_K_INITIAL = 50
 TOP_K_FINAL = 10
 
+ # Latency: rerank candidate cap (lower = faster; see LATENCY_OPTIMIZATION.md)
+ RERANK_CANDIDATES_MAX = int(os.getenv("RERANK_CANDIDATES_MAX", "20"))
+ 
+ # Reranker backend: cross_encoder | onnx | colbert (onnx ~2x faster, colbert optional)
+ RERANKER_BACKEND = os.getenv("RERANKER_BACKEND", "onnx")
+ 
 # Debug mode: set DEBUG=1 to enable verbose logging (research prototype style)
 DEBUG = os.getenv("DEBUG", "0") == "1"
 
@@ -47,6 +53,10 @@ def _load_router_config() -> dict:
         "new", "newest", "latest", "recent", "modern", "contemporary", "current",
     ],
     "strong_freshness_keywords": ["newest", "latest"],
+     "natural_language_keywords": [
+         "like", "similar", "recommend", "want", "looking", "books", "something",
+         "suggest", "recommendations", "after", "read", "if", "liked",
+     ],
 }
 path = CONFIG_DIR / "router.json"
 if path.exists():
@@ -82,3 +92,6 @@ ROUTER_FRESHNESS_KEYWORDS: frozenset[str] = frozenset(
 ROUTER_STRONG_FRESHNESS_KEYWORDS: frozenset[str] = frozenset(
     str(k).lower() for k in _ROUTER_CFG.get("strong_freshness_keywords", [])
 )
+ ROUTER_NL_KEYWORDS: frozenset[str] = frozenset(
+     str(k).lower() for k in _ROUTER_CFG.get("natural_language_keywords", [])
+ )
src/core/intent_prober.py ADDED
@@ -0,0 +1,112 @@
+ """
+ P2: Zero-shot intent probing for cold-start users.
+ 
+ Uses an LLM to infer categories, emotions, and keywords from a user's first query.
+ When the user has no history, this helps seed preferences for faster convergence.
+ """
+ 
+ import json
+ import re
+ from typing import Optional
+ 
+ from src.utils import setup_logger
+ 
+ logger = setup_logger(__name__)
+ 
+ # Categories we support (match router/metadata)
+ KNOWN_CATEGORIES = [
+     "Fiction", "History", "Philosophy", "Science", "Art",
+     "Biography", "Mystery", "Romance", "Fantasy", "Science Fiction",
+     "Literary", "General",
+ ]
+ 
+ EMOTION_KEYWORDS = [
+     "happy", "sad", "suspenseful", "angry", "surprising",
+     "heartbreaking", "uplifting", "thought-provoking", "relaxing",
+ ]
+ 
+ 
+ def probe_intent(query: str, llm=None) -> dict:
+     """
+     Infer user intent from a short query (zero-shot, no history).
+ 
+     Returns:
+         dict with keys: categories, emotions, keywords, summary
+     """
+     if not query or not query.strip():
+         return {"categories": [], "emotions": [], "keywords": [], "summary": ""}
+ 
+     if llm is None:
+         try:
+             from src.core.llm import get_llm_model
+             import os
+             provider = os.getenv("LLM_PROVIDER", "ollama")
+             api_key = os.getenv("OPENAI_API_KEY") if provider == "openai" else None
+             llm = get_llm_model(provider=provider, api_key=api_key)
+         except Exception as e:
+             logger.warning(f"Intent prober: LLM not available ({e}), using rule-based fallback")
+             return _rule_based_intent(query)
+ 
+     prompt = f"""Analyze this book preference query and return JSON only.
+ 
+ Query: "{query.strip()}"
+ 
+ Extract:
+ - categories: list of book categories from {KNOWN_CATEGORIES} that match (max 3)
+ - emotions: list of emotions/moods from {EMOTION_KEYWORDS} that match (max 2)
+ - keywords: 2-4 short searchable keywords (e.g. "WWII", "detective", "love story")
+ - summary: one short sentence summarizing what the user wants
+ 
+ Return only valid JSON, no markdown:
+ {{"categories": [...], "emotions": [...], "keywords": [...], "summary": "..."}}"""
+ 
+     try:
+         response = llm.invoke(prompt)
+         text = response.content if hasattr(response, "content") else str(response)
+         # Extract JSON from response (handle markdown code blocks)
+         json_match = re.search(r"\{[^{}]*\}", text, re.DOTALL)
+         if json_match:
+             data = json.loads(json_match.group())
+             return {
+                 "categories": data.get("categories", [])[:3],
+                 "emotions": data.get("emotions", [])[:2],
+                 "keywords": data.get("keywords", [])[:4],
+                 "summary": data.get("summary", "")[:200],
+             }
+     except Exception as e:
+         logger.warning(f"Intent prober LLM failed: {e}")
+ 
+     return _rule_based_intent(query)
+ 
+ 
+ def _rule_based_intent(query: str) -> dict:
+     """Fallback when LLM unavailable: simple keyword matching."""
+     lower = query.lower().strip()
+     categories = []
+     emotions = []
+     keywords = []
+ 
+     cat_map = {
+         "fiction": "Fiction", "history": "History", "philosophy": "Philosophy",
+         "science": "Science", "art": "Art", "mystery": "Mystery", "romance": "Romance",
+         "fantasy": "Fantasy", "sci-fi": "Science Fiction", "biography": "Biography",
+     }
+     for k, v in cat_map.items():
+         if re.search(r"\b" + re.escape(k) + r"\b", lower):
+             categories.append(v)
+ 
+     for e in EMOTION_KEYWORDS:
+         if re.search(r"\b" + re.escape(e) + r"\b", lower):
+             emotions.append(e)
+ 
+     # Extract likely keywords (words 4+ chars, not common)
+     stop = {"book", "books", "want", "like", "looking", "something", "that", "with", "the", "and"}
+     words = [w for w in re.findall(r"\b\w{4,}\b", lower) if w not in stop][:4]
+     keywords.extend(words)
+ 
+     return {
+         "categories": categories[:3] or ["General"],
+         "emotions": emotions[:2],
+         "keywords": keywords[:4],
+         "summary": query[:150] if query else "",
+     }
src/core/recommendation_orchestrator.py CHANGED
@@ -54,9 +54,13 @@ class RecommendationOrchestrator:
         tone: str = "All",
         user_id: str = "local",
         use_agentic: bool = False,
+         fast: bool = False,
+         async_rerank: bool = False,
     ) -> List[Dict[str, Any]]:
         """
         Generate book recommendations. Async for web search fallback.
+         fast: Skip rerank for low latency (~150ms).
+         async_rerank: Return RRF immediately, rerank in background; next request gets cached reranked.
         """
         if not query or not query.strip():
             return []
@@ -67,17 +71,34 @@ class RecommendationOrchestrator:
             logger.info(f"Returning cached results for key: {cache_key}")
             return cached
 
-         logger.info(f"Processing request: query='{query}', category='{category}', use_agentic={use_agentic}")
+         logger.info(f"Processing request: query='{query}', category='{category}', use_agentic={use_agentic}, fast={fast}, async_rerank={async_rerank}")
+ 
+         skip_rerank = fast or async_rerank
 
         if use_agentic:
             results = await self._get_recommendations_agentic(query, category)
         else:
-             results = await self._get_recommendations_classic(query, category)
+             results = await self._get_recommendations_classic(query, category, skip_rerank=skip_rerank)
 
         if results:
             self.cache.set(cache_key, results)
+ 
+         if async_rerank and not use_agentic and skip_rerank:
+             import asyncio
+             asyncio.create_task(self._background_rerank_and_cache(query, category, cache_key))
+ 
         return results
 
+     async def _background_rerank_and_cache(self, query: str, category: str, cache_key: str) -> None:
+         """Run full pipeline with rerank and cache for async_rerank flow."""
+         try:
+             results = await self._get_recommendations_classic(query, category, skip_rerank=False)
+             if results:
+                 self.cache.set(cache_key, results)
+             logger.info(f"Background rerank completed for query '{query[:30]}...'")
+         except Exception as e:
+             logger.warning(f"Background rerank failed: {e}")
+ 
     def get_recommendations_sync(
         self,
         query: str,
@@ -85,10 +106,12 @@ class RecommendationOrchestrator:
         tone: str = "All",
         user_id: str = "local",
         use_agentic: bool = False,
+         fast: bool = False,
+         async_rerank: bool = False,
     ) -> List[Dict[str, Any]]:
         """Sync wrapper for scripts/CLI."""
         import asyncio
-         return asyncio.run(self.get_recommendations(query, category, tone, user_id, use_agentic))
+         return asyncio.run(self.get_recommendations(query, category, tone, user_id, use_agentic, fast, async_rerank))
 
     async def _get_recommendations_agentic(self, query: str, category: str) -> List[Dict[str, Any]]:
         """LangGraph workflow: Router -> Retrieve -> Evaluate -> (optional) Web Fallback."""
@@ -103,7 +126,7 @@ class RecommendationOrchestrator:
         books_list = final_state.get("isbn_list", [])
         return enrich_and_format(books_list, category, TOP_K_FINAL, "local", metadata_store_inst=self._meta)
 
-     async def _get_recommendations_classic(self, query: str, category: str) -> List[Dict[str, Any]]:
+     async def _get_recommendations_classic(self, query: str, category: str, skip_rerank: bool = False) -> List[Dict[str, Any]]:
         """Classic Router -> Hybrid/Small-to-Big -> optional Web Fallback."""
         from src.core.router import QueryRouter
 
@@ -111,6 +134,8 @@ class RecommendationOrchestrator:
         decision = router.route(query)
         logger.info(f"Retrieval Strategy: {decision}")
 
+         do_rerank = decision["rerank"] and not skip_rerank
+ 
         if decision["strategy"] == "small_to_big":
             recs = self.vector_db.small_to_big_search(query, k=TOP_K_INITIAL)
         else:
@@ -118,7 +143,7 @@ class RecommendationOrchestrator:
                 query,
                 k=TOP_K_INITIAL,
                 alpha=decision.get("alpha", 0.5),
-                 rerank=decision["rerank"],
+                 rerank=do_rerank,
                 temporal=decision.get("temporal", False),
             )
src/core/reranker.py CHANGED
@@ -1,104 +1,145 @@
- from typing import List, Tuple, Dict, Any
- from sentence_transformers import CrossEncoder
- import torch
+ """
+ Reranker: Cross-Encoder (torch/ONNX) or ColBERT (optional).
+ Backend selectable via the RERANKER_BACKEND env var: cross_encoder | onnx | colbert.
+ ONNX is ~2x faster than torch; ColBERT requires llama-index-postprocessor-colbert-rerank.
+ """
+ from typing import List, Dict, Any
+ 
+ from src.config import RERANKER_BACKEND
 from src.utils import setup_logger
 
 logger = setup_logger(__name__)
 
- # Lightweight reranking model: fast, with decent quality
 DEFAULT_RERANKER_MODEL = "cross-encoder/ms-marco-MiniLM-L-6-v2"
 
+ 
+ def _load_cross_encoder(backend: str):
+     """Load CrossEncoder with torch or ONNX backend. Falls back to torch if ONNX fails."""
+     from sentence_transformers import CrossEncoder
+     import torch
+ 
+     device = "mps" if torch.backends.mps.is_available() else "cpu"
+     be = "onnx" if backend == "onnx" else "torch"
+ 
+     try:
+         logger.info(f"Loading Reranker ({DEFAULT_RERANKER_MODEL}) backend={be} on {device}...")
+         model = CrossEncoder(DEFAULT_RERANKER_MODEL, device=device, backend=be)
+         logger.info("Reranker model loaded.")
+         return model
+     except Exception as e:
+         if be == "onnx":
+             logger.warning(f"ONNX backend failed (pip install onnxruntime?), falling back to torch: {e}")
+             return CrossEncoder(DEFAULT_RERANKER_MODEL, device=device, backend="torch")
+         raise
+ 
+ 
+ def _load_colbert():
+     """Load ColBERT reranker via llama-index (optional dep)."""
+     try:
+         from llama_index.postprocessor.colbert_rerank import ColbertRerank
+ 
+         return ColbertRerank(
+             model_name="colbert-ir/colbertv2.0",
+             top_n=10,
+         )
+     except ImportError as e:
+         logger.warning(f"ColBERT not available (pip install llama-index-postprocessor-colbert-rerank): {e}")
+         return None
+ 
+ 
+ def _get_text(doc: Any) -> str:
+     if hasattr(doc, "page_content"):
+         return doc.page_content
+     return doc.get("description") or doc.get("page_content") or str(doc)
+ 
+ 
+ def _set_score(doc: Any, score: float) -> None:
+     if hasattr(doc, "metadata"):
+         doc.metadata["relevance_score"] = score
+     else:
+         doc["score"] = score
+ 
+ 
+ def _get_score(doc: Any) -> float:
+     if hasattr(doc, "metadata"):
+         return doc.metadata.get("relevance_score", 0)
+     return doc.get("score", 0)
+ 
+ 
 class RerankerService:
     """
-     Singleton service for re-ranking documents using a Cross-Encoder.
-     This significantly improves RAG precision by scoring the exact relevance
-     of (query, document) pairs.
+     Singleton reranker: Cross-Encoder (torch/ONNX) or ColBERT.
     """
     _instance = None
+ 
     def __new__(cls):
         if cls._instance is None:
             cls._instance = super(RerankerService, cls).__new__(cls)
             cls._instance.model = None
+             cls._instance._backend = None
         return cls._instance
+ 
     def __init__(self):
         if self.model is None:
             self._load_model()
+ 
     def _load_model(self):
-         try:
-             device = "mps" if torch.backends.mps.is_available() else "cpu"
-             logger.info(f"Loading Reranker model: {DEFAULT_RERANKER_MODEL} on {device}...")
-             self.model = CrossEncoder(DEFAULT_RERANKER_MODEL, device=device)
-             logger.info("Reranker model loaded.")
-         except Exception as e:
-             logger.error(f"Failed to load Reranker: {e}")
-             self.model = None
- 
-     def rerank(self, query: str, docs: List[Dict[str, Any]], top_k: int = 5) -> List[Dict[str, Any]]:
+         backend = (RERANKER_BACKEND or "").lower()
+ 
+         if backend == "colbert":
+             self.model = _load_colbert()
+             self._backend = "colbert" if self.model else "cross_encoder"
+             if self._backend == "cross_encoder":
+                 self.model = _load_cross_encoder("torch")
+         else:
+             self._backend = "onnx" if backend == "onnx" else "cross_encoder"
+             self.model = _load_cross_encoder(self._backend)
+ 
+     def rerank(self, query: str, docs: List[Any], top_k: int = 5) -> List[Any]:
         """
-         Rerank a list of documents based on relevance to the query.
- 
-         Args:
-             query: User question
-             docs: List of dicts, each must have a 'content' field (or 'description')
-             top_k: Number of results to return
- 
-         Returns:
-             Top-K sorted documents with added 'score' field.
+         Rerank documents by relevance to the query.
+         docs: List of dicts or LangChain Documents with description/page_content.
         """
         if not self.model or not docs:
             return docs[:top_k]
- 
-         # Prepare pairs for Cross-Encoder: [[query, doc1], [query, doc2], ...]
-         # We assume 'description' or 'page_content' holds the text
-         pairs = []
-         valid_docs = []
- 
-         for doc in docs:
-             # Handle LangChain Document object
-             if hasattr(doc, "page_content"):
-                 text = doc.page_content
-             # Handle Dict
-             else:
-                 text = doc.get("description") or doc.get("page_content") or str(doc)
- 
-             pairs.append([query, text])
-             valid_docs.append(doc)
- 
-         if not pairs:
-             return docs[:top_k]
 
-         # Predict scores
+         if self._backend == "colbert":
+             return self._rerank_colbert(query, docs, top_k)
+         return self._rerank_cross_encoder(query, docs, top_k)
+ 
+     def _rerank_cross_encoder(self, query: str, docs: List[Any], top_k: int) -> List[Any]:
+         pairs = [[query, _get_text(d)] for d in docs]
         scores = self.model.predict(pairs)
- 
-         # Attach scores and sort
-         scored_results = []
-         for i, doc in enumerate(valid_docs):
-             score = float(scores[i])
-             if hasattr(doc, "metadata"):
-                 # Handle Document: attach the score in metadata
-                 doc.metadata["relevance_score"] = score
-                 scored_results.append(doc)
+ 
+         for i, doc in enumerate(docs):
+             _set_score(doc, float(scores[i]))
+ 
+         docs.sort(key=_get_score, reverse=True)
+         return docs[:top_k]
+ 
+     def _rerank_colbert(self, query: str, docs: List[Any], top_k: int) -> List[Any]:
+         from llama_index.schema import NodeWithScore, TextNode
+ 
+         # Keep a ref to the original doc for metadata (isbn, etc.)
+         nodes = []
+         for d in docs:
+             node = TextNode(text=_get_text(d), metadata={"__original": d})
+             nodes.append(NodeWithScore(node=node, score=0.0))
+ 
+         reranked = self.model.postprocess_nodes(nodes, query_str=query)
+ 
+         result = []
+         for nws in reranked[:top_k]:
+             orig = getattr(nws.node, "metadata", {}).get("__original")
+             if orig is not None:
+                 _set_score(orig, float(nws.score or 0))
+                 result.append(orig)
             else:
-                 # Handle Dict
-                 doc_copy = doc.copy()
-                 doc_copy["score"] = score
-                 scored_results.append(doc_copy)
- 
-         # Sort descending by score
-         def get_score(doc):
-             if hasattr(doc, "metadata"):
-                 return doc.metadata.get("relevance_score", 0)
-             return doc.get("score", 0)
- 
-         scored_results.sort(key=get_score, reverse=True)
- 
-         return scored_results[:top_k]
- 
- # Global instance
+                 from langchain_core.documents import Document
+                 doc = Document(page_content=nws.node.text, metadata={"relevance_score": float(nws.score or 0)})
+                 result.append(doc)
+         return result
+ 
+ 
 reranker = RerankerService()
src/core/router.py CHANGED
@@ -90,8 +90,12 @@ class QueryRouter:
         freshness_fallback: bool = False,
         target_year: Optional[int] = None
     ) -> Dict[str, Any]:
-         """Fallback: rule-based routing (original logic + freshness)."""
-         from src.config import ROUTER_DETAIL_KEYWORDS
+         """
+         Fallback: rule-based routing when the classifier is not loaded.
+         Uses NL keywords (like, similar, recommend, ...) instead of a brittle word count.
+         Book titles (e.g. "War and Peace", "The Lord of the Rings") -> FAST.
+         """
+         from src.config import ROUTER_DETAIL_KEYWORDS, ROUTER_NL_KEYWORDS
 
         base_result = {
             "temporal": is_temporal,
@@ -103,10 +107,15 @@ class QueryRouter:
         if any(w.lower() in ROUTER_DETAIL_KEYWORDS for w in words):
             logger.info("Router (rules): Detail Query -> SMALL_TO_BIG")
             return {**base_result, "strategy": "small_to_big", "alpha": 0.5, "rerank": False, "k_final": 5}
-         if len(words) <= 2:
-             logger.info("Router (rules): Keyword -> FAST (Temporal=%s, Freshness=%s)", is_temporal, freshness_fallback)
+         # NL keywords indicate recommendation intent -> DEEP
+         if any(w.lower() in ROUTER_NL_KEYWORDS for w in words):
+             logger.info("Router (rules): NL keywords -> DEEP (Temporal=%s, Freshness=%s)", is_temporal, freshness_fallback)
+             return {**base_result, "strategy": "deep", "alpha": 0.5, "rerank": True, "k_final": 10}
+         # Short query without NL keywords: book title or keyword -> FAST
+         if len(words) <= 6:
+             logger.info("Router (rules): Keyword/Title -> FAST (Temporal=%s, Freshness=%s)", is_temporal, freshness_fallback)
             return {**base_result, "strategy": "fast", "alpha": 0.5, "rerank": False, "k_final": 5}
-         logger.info("Router (rules): Natural Language -> DEEP (Temporal=%s, Freshness=%s)", is_temporal, freshness_fallback)
+         logger.info("Router (rules): Long query -> DEEP (Temporal=%s, Freshness=%s)", is_temporal, freshness_fallback)
         return {**base_result, "strategy": "deep", "alpha": 0.5, "rerank": True, "k_final": 10}
 
     def route(self, query: str) -> Dict[str, Any]:
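A quick illustration of the new fallback behavior (a sketch; it assumes `QueryRouter()` constructs with defaults and that no trained classifier is loaded, so the rule-based path runs):

```python
from src.core.router import QueryRouter

router = QueryRouter()

# Short query, no NL keywords: treated as a title/keyword lookup.
print(router.route("War and Peace")["strategy"])             # expected: "fast"

# "books"/"like" are NL keywords: recommendation intent, rerank enabled.
print(router.route("books like War and Peace")["strategy"])  # expected: "deep"
```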
src/main.py CHANGED
@@ -99,6 +99,8 @@ class RecommendationRequest(BaseModel):
99
  category: str = "All"
100
  user_id: Optional[str] = "local"
101
  use_agentic: Optional[bool] = False # LangGraph workflow: Router -> Retrieve -> Evaluate -> Web Fallback
 
 
102
 
103
 
104
  class FeatureContribution(BaseModel):
@@ -99,6 +99,8 @@
     category: str = "All"
     user_id: Optional[str] = "local"
     use_agentic: Optional[bool] = False  # LangGraph workflow: Router -> Retrieve -> Evaluate -> Web Fallback
+    fast: Optional[bool] = False  # Skip rerank for ~150ms latency
+    async_rerank: Optional[bool] = False  # Return RRF first, rerank in background; next request gets cached
 
 
 class FeatureContribution(BaseModel):
@@ -187,6 +189,8 @@ async def get_recommendations(request: RecommendationRequest):
             category=request.category,
             user_id=request.user_id if hasattr(request, 'user_id') else "local",
             use_agentic=request.use_agentic or False,
+            fast=request.fast or False,
+            async_rerank=request.async_rerank or False,
         )
         return {"recommendations": results}
     except Exception as e:
@@ -349,22 +353,58 @@ async def run_benchmark():
 # --- Personalized Recommendation API ---
 
 @app.get("/api/recommend/personal", response_model=RecommendationResponse)
-def personalized_recommendations(user_id: str = "local", top_k: int = 10):
+def personalized_recommendations(
+    user_id: str = "local",
+    top_k: int = 10,
+    limit: Optional[int] = None,
+    recent_isbns: Optional[str] = None,
+    intent_query: Optional[str] = None,
+):
     """
     Get personalized recommendations for a user.
     Uses 6-channel recall (ItemCF/UserCF/Swing/SASRec/YoutubeDNN/Popularity) + LGBMRanker.
+
+    P0: recent_isbns — Comma-separated ISBNs from current session (e.g. just-viewed).
+        Injected into SASRec for cold-start convergence (1+ clicks).
+    P2: intent_query — Zero-shot intent probing when user has no history.
+        Probes LLM for categories/keywords, does semantic search, seeds SASRec.
     """
-    # Demo logic: Map 'local' to a real user for demonstration
-    if user_id in ["local", "demo"]:
-        # Pick a demo user ID from active users (A1ZQ1LUQ9R6JHZ is a heavy reader)
-        user_id = "A1ZQ1LUQ9R6JHZ"
-
+    k = limit if limit is not None else top_k
+    # Demo logic: Map 'local' to a real user for demonstration (skip when intent_query = cold-start)
+    if user_id in ["local", "demo"] and not intent_query:
+        user_id = "A1ZQ1LUQ9R6JHZ"
+
+    # P0: Parse recent_isbns for real-time cold-start
+    real_time_seq = None
+    if recent_isbns:
+        real_time_seq = [x.strip() for x in recent_isbns.split(",") if x.strip()]
+
+    # P2: Zero-shot intent probing — when no recent_isbns, use query to seed
+    if not real_time_seq and intent_query and intent_query.strip():
+        from src.core.intent_prober import probe_intent
+        intent = probe_intent(intent_query.strip())
+        semantic_query = " ".join(
+            intent.get("keywords", []) + intent.get("categories", []) + [intent.get("summary", "")]
+        ).strip()
+        if semantic_query and recommender:
+            try:
+                rag_results = recommender.get_recommendations_sync(
+                    semantic_query, category="All", tone="All", user_id=user_id
+                )
+                seed_isbns = [r.get("isbn") for r in (rag_results or [])[:5] if r.get("isbn")]
+                if seed_isbns:
+                    real_time_seq = seed_isbns
+            except Exception as e:
+                logger.warning(f"Intent-to-seed failed: {e}")
+
     # Check initialization
     if not rec_service:
         raise HTTPException(status_code=503, detail="Service not ready")
-
+
     try:
-        recs = rec_service.get_recommendations(user_id, top_k)
+        recs = rec_service.get_recommendations(
+            user_id, top_k=k, real_time_sequence=real_time_seq
+        )
 
         # Enrich with metadata
         from src.utils import enrich_book_metadata
@@ -430,6 +470,51 @@ def personalized_recommendations(user_id: str = "local", top_k: int = 10):
         # In production, maybe return fallback popular items instead of error
         raise HTTPException(status_code=500, detail=str(e))
 
+
+@app.get("/api/intent/probe")
+def probe_intent_endpoint(query: str = ""):
+    """
+    P2: Zero-shot intent probing for cold-start users.
+    Returns inferred categories, emotions, keywords from user's first query.
+    """
+    from src.core.intent_prober import probe_intent
+    try:
+        result = probe_intent(query)
+        return result
+    except Exception as e:
+        logger.error(f"Intent probe failed: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+
+@app.get("/api/onboarding/books")
+def get_onboarding_books(limit: int = 24):
+    """
+    P2: Return popular books for new-user onboarding.
+    Lets user pick 3–5 to seed preferences (cold-start).
+    """
+    if not rec_service:
+        raise HTTPException(status_code=503, detail="Service not ready")
+    try:
+        items = rec_service.get_popular_books(limit)
+        from src.utils import enrich_book_metadata
+        results = []
+        for isbn, meta in items:
+            meta = meta or {}
+            meta = enrich_book_metadata(meta, str(isbn))
+            results.append({
+                "isbn": isbn,
+                "title": meta.get("title") or f"ISBN: {isbn}",
+                "authors": meta.get("authors", "Unknown"),
+                "description": meta.get("description", ""),
+                "thumbnail": meta.get("thumbnail") or "/content/cover-not-found.jpg",
+                "category": meta.get("category", "General"),
+            })
+        return {"books": results}
+    except Exception as e:
+        logger.error(f"Error in onboarding books: {e}")
+        raise HTTPException(status_code=500, detail=str(e))
+
+
 # Allow local frontend dev origins
 # Added LAST so it wraps the app outermost (first to process request)
 app.add_middleware(
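
For orientation, a minimal client sketch exercising the new parameters end to end. The base URL is the dev default from `web/src/api.js`; the sample ISBNs are hypothetical placeholders, and anything beyond the parameter names shown in the diff is an assumption:

```python
# Sketch only: exercises the new request parameters from this commit.
# BASE and the sample ISBNs are assumptions, not project fixtures.
import requests

BASE = "http://127.0.0.1:6006"  # dev default from web/src/api.js

# fast=True skips the rerank stage; async_rerank=True returns RRF-fused
# results immediately and reranks in the background for a later request.
resp = requests.post(f"{BASE}/recommend", json={
    "query": "books like War and Peace",
    "category": "All",
    "tone": "All",
    "user_id": "local",
    "fast": True,
    "async_rerank": False,
})
print(resp.json().get("recommendations", [])[:3])

resp = requests.get(f"{BASE}/api/recommend/personal", params={
    "user_id": "local",
    "limit": 10,
    # P0: seed SASRec from session views (hypothetical ISBNs)
    "recent_isbns": "0439136350,0345339738",
    # P2 alternative when the user has no history at all:
    # "intent_query": "something hopeful after a hard week",
})
print(resp.json())
```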
src/recall/fusion.py CHANGED
@@ -20,7 +20,7 @@ DEFAULT_CHANNEL_CONFIG = {
     "usercf": {"enabled": False, "weight": 1.0},
     "swing": {"enabled": False, "weight": 1.0},
     "item2vec": {"enabled": False, "weight": 0.8},
-    "popularity": {"enabled": False, "weight": 0.5},
+    "popularity": {"enabled": True, "weight": 0.5},  # P0: Cold-start fallback
 }
 
 
@@ -123,6 +123,10 @@ class RecallFusion:
         self._add_to_candidates(candidates, recs, cfg["popularity"]["weight"])
 
         sorted_cands = sorted(candidates.items(), key=lambda x: x[1], reverse=True)
+        # P0: Cold-start fallback — when all channels return empty, use popularity
+        if not sorted_cands:
+            pop_recs = self.popularity.recommend(user_id, top_k=k)
+            sorted_cands = [(item, s) for item, s in pop_recs]
         return sorted_cands[:k]
 
     def _add_to_candidates(self, candidates, recs, weight: float) -> None:
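
The fallback only fires when the fused candidate dict ends up empty. A self-contained sketch of that control flow, with `PopularityStub` standing in for the project's real popularity channel:

```python
# Self-contained sketch of the P0 fallback logic; PopularityStub is a
# stand-in for the project's real popularity channel.
class PopularityStub:
    def recommend(self, user_id, top_k=10):
        # (isbn, score) pairs, most popular first
        return [("0439136350", 0.9), ("0345339738", 0.8)][:top_k]

def fuse(channel_results, popularity, user_id, k=10):
    candidates = {}
    for recs, weight in channel_results:  # each channel: ([(item, score)], weight)
        for item, score in recs:
            candidates[item] = candidates.get(item, 0.0) + weight * score
    sorted_cands = sorted(candidates.items(), key=lambda x: x[1], reverse=True)
    if not sorted_cands:  # all channels empty -> fall back to popularity
        sorted_cands = list(popularity.recommend(user_id, top_k=k))
    return sorted_cands[:k]

print(fuse([], PopularityStub(), user_id="new_user"))
# [('0439136350', 0.9), ('0345339738', 0.8)]
```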
src/recommender.py CHANGED
@@ -39,9 +39,11 @@ class BookRecommender:
         tone: str = "All",
         user_id: str = "local",
         use_agentic: bool = False,
+        fast: bool = False,
+        async_rerank: bool = False,
     ) -> List[Dict[str, Any]]:
         return await self._orchestrator.get_recommendations(
-            query, category, tone, user_id, use_agentic
+            query, category, tone, user_id, use_agentic, fast, async_rerank
         )
 
     def get_recommendations_sync(
@@ -51,9 +53,11 @@ class BookRecommender:
         tone: str = "All",
         user_id: str = "local",
         use_agentic: bool = False,
+        fast: bool = False,
+        async_rerank: bool = False,
     ) -> List[Dict[str, Any]]:
         return self._orchestrator.get_recommendations_sync(
-            query, category, tone, user_id, use_agentic
+            query, category, tone, user_id, use_agentic, fast, async_rerank
        )
 
     def get_similar_books(
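
The two new flags are pure pass-throughs to the orchestrator. A hypothetical call site, assuming `BookRecommender` constructs without arguments:

```python
import asyncio
from src.recommender import BookRecommender  # project import; constructor args assumed

async def main():
    recommender = BookRecommender()
    # fast=True trades rerank quality for ~150ms of latency; async_rerank=True
    # serves RRF-fused results now and reranks in the background for next time.
    results = await recommender.get_recommendations(
        "cozy mysteries in small towns", fast=True, async_rerank=False
    )
    for book in results[:3]:
        print(book.get("title"))

asyncio.run(main())
```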
src/services/recommend_service.py CHANGED
@@ -155,6 +155,10 @@ class RecommendationService:
         candidates = self.fusion.get_recall_items(
             user_id, k=200, real_time_seq=real_time_sequence
         )
+        # P1: Cold-start fallback — when recall returns empty, use popularity
+        if not candidates:
+            pop_recs = self.fusion.popularity.recommend(user_id, top_k=200)
+            candidates = list(pop_recs)
         if not candidates:
             return []
 
@@ -267,6 +271,27 @@ class RecommendationService:
 
         return unique_results
 
+    def get_popular_books(self, limit: int = 24) -> list:
+        """
+        P2: Return popular books for onboarding selection.
+        Used when new user has no history — lets them pick 3–5 to seed preferences.
+        """
+        self.load_resources()
+        recs = self.fusion.popularity.recommend(user_id=None, top_k=limit)
+        results = []
+        seen_titles = set()
+        for isbn, _ in recs:
+            meta = self.metadata_store.get_book_metadata(str(isbn)) or {}
+            title = (meta.get("title") or "").lower().strip()
+            if title and title in seen_titles:
+                continue
+            if title:
+                seen_titles.add(title)
+            results.append((isbn, meta))
+            if len(results) >= limit:
+                break
+        return results
+
 if __name__ == "__main__":
     import logging
     logger.setLevel(logging.INFO)
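
`get_popular_books` deduplicates by normalized title so different editions of the same book do not crowd the onboarding grid. The same idea in isolation, with toy data:

```python
# Standalone illustration of the title-level dedup in get_popular_books:
# distinct ISBNs (editions) of the same title collapse to one entry.
def dedup_by_title(recs, metadata, limit=3):
    results, seen_titles = [], set()
    for isbn, _score in recs:
        meta = metadata.get(isbn) or {}
        title = (meta.get("title") or "").lower().strip()
        if title and title in seen_titles:
            continue
        if title:
            seen_titles.add(title)
        results.append((isbn, meta))
        if len(results) >= limit:
            break
    return results

metadata = {
    "111": {"title": "Dune"},
    "222": {"title": "dune "},  # another edition, same normalized title
    "333": {"title": "Hyperion"},
}
recs = [("111", 0.9), ("222", 0.8), ("333", 0.7)]
print([isbn for isbn, _ in dedup_by_title(recs, metadata)])
# ['111', '333']
```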
src/vector_db.py CHANGED
@@ -2,7 +2,7 @@ from typing import List, Any
 # Using community version to avoid 'BaseBlobParser' version conflict in langchain-chroma/core
 from langchain_community.vectorstores import Chroma
 from langchain_huggingface import HuggingFaceEmbeddings
-from src.config import REVIEW_HIGHLIGHTS_TXT, CHROMA_DB_DIR, EMBEDDING_MODEL
+from src.config import REVIEW_HIGHLIGHTS_TXT, CHROMA_DB_DIR, EMBEDDING_MODEL, RERANK_CANDIDATES_MAX
 from src.utils import setup_logger
 from src.core.metadata_store import metadata_store
 from src.core.online_books_store import online_books_store
@@ -220,8 +220,7 @@ class VectorDB:
         final_results = top_candidates[:k]
         if rerank:
             from src.core.reranker import reranker
-            # Rerank the top 20 (or more) candidates from fusion
-            rerank_candidates = top_candidates[:max(k*4, 20)]
+            rerank_candidates = top_candidates[:min(len(top_candidates), RERANK_CANDIDATES_MAX)]
             logger.info(f"Reranking top {len(rerank_candidates)} candidates...")
             final_results = reranker.rerank(query, rerank_candidates, top_k=k)
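
The new import assumes `RERANK_CANDIDATES_MAX` is defined in `src/config.py`, which is not shown in this diff. A plausible env-overridable definition, offered as a sketch with an assumed default of 20:

```python
# Sketch of the assumed definition in src/config.py (not part of this diff):
# an integer cap on rerank candidates, overridable via environment variable.
import os

RERANK_CANDIDATES_MAX = int(os.getenv("RERANK_CANDIDATES_MAX", "20"))
```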
web/src/App.jsx CHANGED
@@ -19,6 +19,7 @@ import Header from "./components/Header";
 import BookDetailModal from "./components/BookDetailModal";
 import SettingsModal from "./components/SettingsModal";
 import AddBookModal from "./components/AddBookModal";
+import OnboardingModal from "./components/OnboardingModal";
 
 // Pages
 import GalleryPage from "./pages/GalleryPage";
@@ -57,6 +58,13 @@ const App = () => {
     return stored === "mock" || !stored ? "ollama" : stored;
   });
 
+  // --- P1: Session-level recent ISBNs for cold-start ---
+  const [recentIsbns, setRecentIsbns] = useState([]);
+  const MAX_RECENT_ISBNS = 10;
+
+  // --- P2: Onboarding (new user, no collection) ---
+  const [showOnboarding, setShowOnboarding] = useState(false);
+
   // --- Add Book Modal State ---
   const [showAddBook, setShowAddBook] = useState(false);
   const [googleQuery, setGoogleQuery] = useState("");
@@ -64,6 +72,14 @@
   const [isSearching, setIsSearching] = useState(false);
   const [addingBookId, setAddingBookId] = useState(null);
 
+  // --- P2: Show onboarding when new user (no collection, not completed) ---
+  useEffect(() => {
+    const completed = localStorage.getItem("onboarding_complete") === "true";
+    if (!completed && userId === "local") {
+      setShowOnboarding(true);
+    }
+  }, [userId]);
+
   // --- Load favorites and stats on startup or user change ---
   useEffect(() => {
     setLoading(true);
@@ -78,11 +94,13 @@
         reading: 0,
         finished: 0,
       })),
-      getPersonalizedRecommendations(userId).catch(() => []),
+      getPersonalizedRecommendations(userId, 20, recentIsbns).catch(() => []),
     ]).then(([favs, stats, personalRecs]) => {
       setMyCollection(favs);
       setReadingStats(stats);
-
+      if (favs.length > 0) {
+        localStorage.setItem("onboarding_complete", "true");
+      }
       const mappedRecs = personalRecs.map((r, idx) => ({
         id: r.isbn,
         title: r.title,
@@ -283,6 +301,13 @@
   };
 
   const openBook = (book) => {
+    // P1: Track session-level recent views for cold-start
+    if (book?.isbn) {
+      setRecentIsbns((prev) => {
+        const next = [book.isbn, ...prev.filter((i) => i !== book.isbn)].slice(0, MAX_RECENT_ISBNS);
+        return next;
+      });
+    }
     setSelectedBook({
       ...book,
       aiHighlight: "\u2728 ...",
@@ -319,8 +344,15 @@
     setBooks([]);
     try {
       let recs;
-      if (!searchQuery) {
-        recs = await getPersonalizedRecommendations(userId);
+      // P2: Cold-start with intent — when no collection and user typed a mood, use intent-seeded personal recs
+      const useIntentSeed = myCollection.length === 0 && searchQuery.trim();
+      if (!searchQuery || useIntentSeed) {
+        recs = await getPersonalizedRecommendations(
+          userId,
+          20,
+          recentIsbns,
+          useIntentSeed ? searchQuery : null
+        );
       } else {
         recs = await recommend(searchQuery, searchCategory, searchMood, userId);
       }
@@ -384,6 +416,44 @@
         />
       )}
 
+      {showOnboarding && (
+        <OnboardingModal
+          onComplete={async () => {
+            setShowOnboarding(false);
+            const [favs, stats, personalRecs] = await Promise.all([
+              getFavorites(userId).catch(() => []),
+              getUserStats(userId).catch(() => ({ total: 0, want_to_read: 0, reading: 0, finished: 0 })),
+              getPersonalizedRecommendations(userId, 20, recentIsbns).catch(() => []),
+            ]);
+            setMyCollection(favs);
+            setReadingStats(stats);
+            const mapped = (personalRecs || []).map((r, idx) => ({
+              id: r.isbn,
+              title: r.title,
+              author: r.authors,
+              category: r.category || "General",
+              mood: r.emotions && Object.keys(r.emotions).length > 0
+                ? Object.entries(r.emotions).reduce((a, b) => (a[1] > b[1] ? a : b))[0]
+                : "Literary",
+              rank: idx + 1,
+              rating: r.average_rating || 0,
+              tags: r.tags || [],
+              review_highlights: r.review_highlights || [],
+              desc: r.description,
+              img: r.thumbnail,
+              isbn: r.isbn,
+              emotions: r.emotions || {},
+              explanations: r.explanations || [],
+              aiHighlight: "\u2014",
+              suggestedQuestions: ["Why was this recommended?", "Similar to what I've read?", "What's the core highlight?"],
+            }));
+            setBooks(mapped);
+          }}
+          onAddFavorite={(isbn) => addFavorite(isbn, userId)}
+          onSkip={() => setShowOnboarding(false)}
+        />
+      )}
+
       {showAddBook && (
         <AddBookModal
           onClose={() => setShowAddBook(false)}
web/src/api.js CHANGED
@@ -1,7 +1,7 @@
 const API_URL = import.meta.env.VITE_API_URL || (import.meta.env.PROD ? "" : "http://127.0.0.1:6006");
 
-export async function recommend(query, category = "All", tone = "All", user_id = "local", use_agentic = false) {
-  const body = { query, category, tone, user_id, use_agentic };
+export async function recommend(query, category = "All", tone = "All", user_id = "local", use_agentic = false, fast = false, async_rerank = false) {
+  const body = { query, category, tone, user_id, use_agentic, fast, async_rerank };
   const resp = await fetch(`${API_URL}/recommend`, {
     method: "POST",
     headers: { "Content-Type": "application/json" },
@@ -12,9 +12,23 @@ export async function recommend(query, category = "All", tone = "All", user_id =
   return data.recommendations || [];
 }
 
-export async function getPersonalizedRecommendations(user_id = "local", limit = 20) {
-  // Use URLSearchParams for query parameters
+export async function getOnboardingBooks(limit = 24) {
+  const resp = await fetch(`${API_URL}/api/onboarding/books?limit=${limit}`);
+  if (!resp.ok) throw new Error(await resp.text());
+  const data = await resp.json();
+  return data.books || [];
+}
+
+export async function getPersonalizedRecommendations(user_id = "local", limit = 20, recent_isbns = null, intent_query = null) {
+  // P1: recent_isbns — session-level ISBNs for cold-start (1+ clicks)
+  // P2: intent_query — zero-shot intent probing when user has no history
   const params = new URLSearchParams({ user_id, limit: limit.toString() });
+  if (recent_isbns && Array.isArray(recent_isbns) && recent_isbns.length > 0) {
+    params.set("recent_isbns", recent_isbns.join(","));
+  }
+  if (intent_query && typeof intent_query === "string" && intent_query.trim()) {
+    params.set("intent_query", intent_query.trim());
+  }
  const resp = await fetch(`${API_URL}/api/recommend/personal?${params.toString()}`);
   if (!resp.ok) throw new Error(await resp.text());
   const data = await resp.json();
web/src/components/OnboardingModal.jsx ADDED
@@ -0,0 +1,137 @@
+/**
+ * P2: New-user onboarding — pick 3–5 books to seed preferences.
+ * Shown when myCollection is empty and onboarding not completed.
+ */
+import React, { useState, useEffect } from "react";
+import { getOnboardingBooks } from "../api";
+
+const PLACEHOLDER_IMG = "/content/cover-not-found.jpg";
+const MIN_SELECT = 3;
+const MAX_SELECT = 5;
+
+const OnboardingModal = ({ onComplete, onAddFavorite, onSkip }) => {
+  const [books, setBooks] = useState([]);
+  const [selected, setSelected] = useState(new Set());
+  const [loading, setLoading] = useState(true);
+  const [error, setError] = useState("");
+
+  useEffect(() => {
+    getOnboardingBooks(24)
+      .then(setBooks)
+      .catch((e) => setError(e.message))
+      .finally(() => setLoading(false));
+  }, []);
+
+  const toggle = (isbn) => {
+    setSelected((prev) => {
+      const next = new Set(prev);
+      if (next.has(isbn)) {
+        next.delete(isbn);
+      } else if (next.size < MAX_SELECT) {
+        next.add(isbn);
+      }
+      return next;
+    });
+  };
+
+  const handleComplete = async () => {
+    if (selected.size < MIN_SELECT) return;
+    try {
+      for (const isbn of selected) {
+        await onAddFavorite(isbn);
+      }
+      localStorage.setItem("onboarding_complete", "true");
+      onComplete();
+    } catch (e) {
+      setError(e.message);
+    }
+  };
+
+  const canComplete = selected.size >= MIN_SELECT;
+
+  return (
+    <div className="fixed inset-0 z-50 flex items-center justify-center bg-black/50 p-4">
+      <div className="bg-white max-w-3xl w-full max-h-[90vh] overflow-hidden shadow-xl">
+        <div className="p-6 border-b border-[#eee]">
+          <h2 className="text-xl font-bold text-[#333]">Welcome — Pick Your Favorites</h2>
+          <p className="text-sm text-gray-500 mt-1">
+            Select 3–5 books you like to get personalized recommendations.
+          </p>
+        </div>
+        <div className="p-6 overflow-y-auto max-h-[50vh]">
+          {loading && (
+            <div className="text-center text-gray-400 py-8">Loading popular books...</div>
+          )}
+          {error && (
+            <div className="text-center text-red-500 py-4 text-sm">{error}</div>
+          )}
+          {!loading && !error && (
+            <div className="grid grid-cols-3 md:grid-cols-4 gap-4">
+              {books.map((book) => {
+                const isSelected = selected.has(book.isbn);
+                return (
+                  <button
+                    key={book.isbn}
+                    type="button"
+                    onClick={() => toggle(book.isbn)}
+                    className={`text-left border-2 transition-all p-2 ${
+                      isSelected ? "border-[#b392ac] bg-[#faf5f7]" : "border-[#eee] hover:border-[#ddd]"
+                    }`}
+                  >
+                    <div className="aspect-[3/4] bg-gray-100 mb-2 overflow-hidden">
+                      <img
+                        src={book.thumbnail || PLACEHOLDER_IMG}
+                        alt={book.title}
+                        className="w-full h-full object-cover"
+                        onError={(e) => {
+                          e.target.onerror = null;
+                          e.target.src = PLACEHOLDER_IMG;
+                        }}
+                      />
+                    </div>
+                    <p className="text-[10px] font-bold text-[#555] truncate" title={book.title}>
+                      {book.title}
+                    </p>
+                    {isSelected && (
+                      <span className="text-[10px] text-[#b392ac] font-bold">✓ Selected</span>
+                    )}
+                  </button>
+                );
+              })}
+            </div>
+          )}
+        </div>
+        <div className="p-6 border-t border-[#eee] flex justify-between items-center">
+          <span className="text-xs text-gray-500">
+            {selected.size} selected (min {MIN_SELECT}, max {MAX_SELECT})
+          </span>
+          <div className="flex gap-2">
+            {onSkip && (
+              <button
+                type="button"
+                onClick={() => {
+                  localStorage.setItem("onboarding_complete", "true");
+                  onSkip();
+                }}
+                className="px-4 py-2 text-sm text-gray-500 hover:text-gray-700"
+              >
+                Skip for now
+              </button>
+            )}
+            <button
+              onClick={handleComplete}
+              disabled={!canComplete}
+              className={`px-6 py-2 text-sm font-bold ${
+                canComplete ? "bg-[#b392ac] text-white" : "bg-gray-200 text-gray-400 cursor-not-allowed"
+              }`}
+            >
+              Start Exploring
+            </button>
+          </div>
+        </div>
+      </div>
+    </div>
+  );
+};
+
+export default OnboardingModal;
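
The modal's only data dependency is `GET /api/onboarding/books`. A quick sketch of consuming it, with field names taken from `get_onboarding_books` above and the dev base URL assumed:

```python
# Sketch: fetch the onboarding grid the same way the modal does.
import requests

BASE = "http://127.0.0.1:6006"  # assumed dev default from web/src/api.js
books = requests.get(f"{BASE}/api/onboarding/books", params={"limit": 24}).json()["books"]
for b in books[:5]:
    print(b["isbn"], b["title"], b["category"])
```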