Add model_max_length (32768) to YAML and Model Details section

#3
by dilawarm - opened
Files changed (1)

README.md (+24 −13)
@@ -15,6 +15,7 @@ tags:
   - medical
   - multilingual
 library_name: sentence-transformers
+model_max_length: 32768
 ---
 
 <img src="https://i.imgur.com/oxvhvQu.png"/>
@@ -33,6 +34,16 @@ The model supports flexible dimension projections (2560, 1280, 640, 320, 160, 80
 
 This model is released under a non-commercial license. If you'd like a commercial license, please contact us at contact@zeroentropy.dev.
 
+## Model Details
+
+| Property | Value |
+|---|---|
+| Parameters | 4B |
+| Context Length | 32,768 tokens (32k) |
+| Base Model | Qwen/Qwen3-4B |
+| Embedding Dimensions | 2560, 1280, 640, 320, 160, 80, 40 |
+| License | CC-BY-NC-4.0 |
+
 ## How to Use
 
 ```python
@@ -69,17 +80,17 @@ The model can also be used through ZeroEntropy's [/models/embed](https://docs.ze
 
 NDCG@10 scores between `zembed-1` and competing embedding models, averaged across public and private benchmarks per domain. Full per-benchmark breakdown [here](https://docs.google.com/spreadsheets/d/1qFXGZLMg6-O5tVLIJS3tpf5QNJxCHiiQtj35dZub4vY/edit?gid=0#gid=0).
 
-| Domain | ZeroEntropy zembed-1 | voyage-4-nano | Qwen3 4B | Cohere Embed v4 | gemini-embed-001 | jina-v5-small | OpenAI Large | bge-m3 |
+| Domain | ZeroEntropy zembed-1 | voyage-4-nano | Qwen3 4B | Cohere Embed v4 | gemini-embed-001 | jina-v5-small | OpenAI Large | bge-m3 |
 |------------------|----------------------|---------------|----------|-----------------|-------------------|---------------|--------------|--------|
-| Finance | **0.4476** | 0.4227 | 0.3715 | 0.3670 | 0.3291 | 0.3576 | 0.3291 | 0.3085 |
-| Healthcare | **0.6260** | 0.5356 | 0.5134 | 0.4750 | 0.5008 | 0.5132 | 0.5315 | 0.3620 |
-| Legal | **0.6723** | 0.5957 | 0.5858 | 0.5894 | 0.6069 | 0.5716 | 0.5099 | 0.5207 |
-| Conversational | **0.5385** | 0.4045 | 0.4034 | 0.4244 | 0.4247 | 0.4430 | 0.3988 | 0.3296 |
-| Manufacturing | **0.5556** | 0.4857 | 0.4932 | 0.4919 | 0.4664 | 0.4725 | 0.4736 | 0.3736 |
-| Web Search | 0.6165 | 0.5977 | 0.6914 | **0.7242** | 0.5881 | 0.6772 | 0.6750 | 0.6311 |
-| Code | **0.6452** | 0.6415 | 0.6379 | 0.6277 | 0.6305 | 0.6354 | 0.6155 | 0.5584 |
-| STEM & Math | **0.5283** | 0.5012 | 0.5219 | 0.4698 | 0.4840 | 0.3780 | 0.3905 | 0.3399 |
-| Enterprise | **0.3750** | 0.3600 | 0.2935 | 0.2915 | 0.3224 | 0.3012 | 0.3307 | 0.2213 |
-| **Average** | **0.5561** | **0.5050** | **0.5013** | **0.4957** | **0.4837** | **0.4833** | **0.4727** | **0.4050** |
-
-<img src="assets/zembed_eval_chart.png" alt="Bar chart comparing zembed-1 NDCG@10 scores against competing embedding models across domains" width="1000"/>
+| Finance | **0.4476** | 0.4227 | 0.3715 | 0.3670 | 0.3291 | 0.3576 | 0.3291 | 0.3085 |
+| Healthcare | **0.6260** | 0.5356 | 0.5134 | 0.4750 | 0.5008 | 0.5132 | 0.5315 | 0.3620 |
+| Legal | **0.6723** | 0.5957 | 0.5858 | 0.5894 | 0.6069 | 0.5716 | 0.5099 | 0.5207 |
+| Conversational | **0.5385** | 0.4045 | 0.4034 | 0.4244 | 0.4247 | 0.4430 | 0.3988 | 0.3296 |
+| Manufacturing | **0.5556** | 0.4857 | 0.4932 | 0.4919 | 0.4664 | 0.4725 | 0.4736 | 0.3736 |
+| Web Search | 0.6165 | 0.5977 | 0.6914 | **0.7242** | 0.5881 | 0.6772 | 0.6750 | 0.6311 |
+| Code | **0.6452** | 0.6415 | 0.6379 | 0.6277 | 0.6305 | 0.6354 | 0.6155 | 0.5584 |
+| STEM & Math | **0.5283** | 0.5012 | 0.5219 | 0.4698 | 0.4840 | 0.3780 | 0.3905 | 0.3399 |
+| Enterprise | **0.3750** | 0.3600 | 0.2935 | 0.2915 | 0.3224 | 0.3012 | 0.3307 | 0.2213 |
+| **Average** | **0.5561** | **0.5050** | **0.5013** | **0.4957** | **0.4837** | **0.4833** | **0.4727** | **0.4050** |
+
+<img src="assets/zembed_eval_chart.png" alt="Bar chart comparing zembed-1 NDCG@10 scores against competing embedding models across domains" width="1000"/>
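
The Embedding Dimensions row in the added Model Details table lists prefix sizes from 2560 down to 40, which suggests Matryoshka-style projections where a full embedding can be truncated and re-normalized. A minimal sketch of that idea with NumPy and a random stand-in vector; the helper name and the use of plain prefix slicing are illustrative assumptions, not the model's published API:

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length.

    Illustrative only: assumes the supported sizes (2560, 1280, ..., 40)
    are prefix-truncatable, Matryoshka-style.
    """
    head = vec[:dim]
    return head / np.linalg.norm(head)

# Dummy stand-in for a real zembed-1 output vector.
rng = np.random.default_rng(0)
full = rng.normal(size=2560)
full /= np.linalg.norm(full)

small = truncate_embedding(full, 640)
print(small.shape)  # (640,)
```

Re-normalizing after truncation keeps cosine similarity equivalent to a dot product at every supported dimension.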