SkillScout Large - Job-to-Skill Dense Retriever

SkillScout Large is a dense bi-encoder for retrieving relevant skills from a job title. Given a job title (e.g., "Data Scientist"), it produces a 1024-dimensional embedding and retrieves the most semantically relevant skills from the ESCO skill gazetteer (9,052 skills) via cosine similarity.

This is Stage 1 of the TalentGuide two-stage job-skill matching pipeline, trained for TalentCLEF 2026 Task B.

Best pipeline result (TalentCLEF 2026 validation set), using a fine-tuned cross-encoder re-ranker at blend alpha=0.7: nDCG@10 graded = 0.6896, nDCG@10 binary = 0.7330. Bi-encoder alone: nDCG@10 graded = 0.3621, MAP = 0.4545.


Model Summary

Base model          : jjzha/esco-xlm-roberta-large
Architecture        : XLM-RoBERTa-large + mean pooling
Embedding dimension : 1024
Max sequence length : 64 tokens
Training loss       : Multiple Negatives Ranking (MNR)
Training pairs      : 93,720 (ESCO job-skill pairs, essential + optional)
Epochs              : 3
Best checkpoint     : step 3500 (by validation nDCG@10)
Hardware            : NVIDIA RTX 3070 8GB, fp16 AMP

What is TalentCLEF Task B?

TalentCLEF 2026 Task B is a graded information-retrieval shared task:

  • Query: a job title (e.g., "Electrician")
  • Corpus: 9,052 ESCO skills (e.g., "install electric switches")
  • Relevance levels: 2 = Core, 1 = Contextual, 0 = Non-relevant
  • Primary metric: nDCG with graded relevance (core=2, contextual=1)
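
The graded metric can be sketched in a few lines. This is a minimal nDCG implementation assuming linear gains (gain = relevance grade, following the core=2 / contextual=1 mapping above); it is illustrative, not the official task evaluator:

```python
import numpy as np

def ndcg_at_k(relevances, k=10):
    """Graded nDCG@k: relevances are grades (2 = core, 1 = contextual, 0 = non-relevant)
    in the order the system ranked the skills."""
    rels = np.asarray(relevances, dtype=float)[:k]
    # position discounts 1/log2(rank+1) for ranks 1..k
    discounts = 1.0 / np.log2(np.arange(2, rels.size + 2))
    dcg = float((rels * discounts).sum())
    # ideal ordering: all grades sorted descending
    ideal = np.sort(np.asarray(relevances, dtype=float))[::-1][:k]
    idcg = float((ideal * discounts[:ideal.size]).sum())
    return dcg / idcg if idcg > 0 else 0.0
```

Ranking a core skill above a contextual one scores higher than the reverse, which is exactly the distinction the graded metric rewards.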

Usage

Installation

pip install sentence-transformers faiss-cpu

Encode and Compare

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("talentguide/skillscout-large")

job    = "Data Scientist"
skills = ["data science", "machine learning", "install electric switches"]

embs   = model.encode([job] + skills, normalize_embeddings=True)
scores = embs[0] @ embs[1:].T

for skill, score in zip(skills, scores):
    print(f"{score:.3f}  {skill}")
# 0.872  data science
# 0.731  machine learning
# 0.112  install electric switches

Full Retrieval with FAISS (Recommended)

from sentence_transformers import SentenceTransformer
import faiss, numpy as np

model = SentenceTransformer("talentguide/skillscout-large")

# Build index once over your skill corpus
skill_texts = [...]  # list of skill names

embs = model.encode(skill_texts, batch_size=128,
                    normalize_embeddings=True,
                    show_progress_bar=True).astype(np.float32)

index = faiss.IndexFlatIP(embs.shape[1])  # inner product on L2-normed = cosine
index.add(embs)

job_title = "Software Engineer"
q = model.encode([job_title], normalize_embeddings=True).astype(np.float32)
scores, idxs = index.search(q, k=50)

for rank, (idx, score) in enumerate(zip(idxs[0], scores[0]), 1):
    print(f"{rank:3d}. [{score:.4f}]  {skill_texts[idx]}")

Demo Output

Software Engineer
   1. [0.942]  define software architecture
   2. [0.938]  software frameworks
   3. [0.935]  create software design

Data Scientist
   1. [0.951]  data science
   2. [0.921]  establish data processes
   3. [0.919]  create data models

Electrician
   1. [0.944]  install electric switches
   2. [0.938]  install electricity sockets
   3. [0.930]  use electrical wire tools

Two-Stage Pipeline Integration

Job title
   |
   v
[SkillScout Large]         <- this model
   |  top-200 candidates via FAISS ANN
   v
[Cross-encoder re-ranker]
   |  fine-grained re-scoring
   v
Final ranked list (graded: core > contextual > irrelevant)

Blend formula (alpha=0.7 gives best validation results):

final_score = alpha * biencoder_score + (1 - alpha) * crossencoder_score
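
A minimal sketch of the blend step. The function name and the normalization caveat are illustrative, not taken from the pipeline code; it assumes both score arrays cover the same candidate list in the same order:

```python
import numpy as np

def blend(biencoder_scores, crossencoder_scores, alpha=0.7):
    """Linear blend of the two stages' scores per candidate skill.

    Assumes both arrays are on comparable scales; if the cross-encoder
    emits logits, min-max normalize each array first (illustrative choice).
    """
    b = np.asarray(biencoder_scores, dtype=float)
    c = np.asarray(crossencoder_scores, dtype=float)
    return alpha * b + (1 - alpha) * c
```

Candidates are then re-sorted by the blended score to produce the final ranked list.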

Training Details

Data

Source: ESCO occupational ontology, TalentCLEF 2026 training split.

Job-skill pairs (essential) : ~57,500
Job-skill pairs (optional)  : ~57,200
Total InputExamples         : 93,720
Validation queries          : 304
Validation corpus           : 9,052 skills
Validation qrels            : 56,417

Each ESCO job has 5-15 title aliases; skills have multiple phrasings. Optional pairs are downsampled to 50% of essential count to maintain class balance.
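
A sketch of how such pairs could be assembled. The per-job dict schema ("aliases", "essential_skills", "optional_skills") and the per-job downsampling are assumptions for illustration, not the actual preprocessing code:

```python
import random

def build_pairs(jobs, seed=13):
    """Build (job title alias, skill label) training pairs, keeping
    optional skills at ~50% of the essential count per job."""
    rng = random.Random(seed)
    pairs = []
    for job in jobs:
        # downsample optional skills to half the essential count
        n_opt = min(len(job["optional_skills"]), len(job["essential_skills"]) // 2)
        sampled_optional = rng.sample(job["optional_skills"], n_opt)
        for alias in job["aliases"]:
            for skill in job["essential_skills"] + sampled_optional:
                pairs.append((alias, skill))
    return pairs
```

Expanding every alias against every kept skill is what turns ~9k jobs and skills into ~94k training pairs.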

Hyperparameters

Loss           : MultipleNegativesRankingLoss (scale=20, cos_sim)
Batch size     : 64  (63 in-batch negatives per anchor)
Epochs         : 3
Warmup         : 10% of steps (~440 steps)
Optimizer      : AdamW fused
Learning rate  : 5e-5, linear decay
Precision      : fp16 AMP
Max seq len    : 64 tokens
Best model     : saved by cosine-nDCG@10 on validation
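
The ~440 warmup figure follows directly from the data size and batch size; a quick arithmetic check (plain Python, no training code):

```python
import math

pairs, batch_size, epochs = 93_720, 64, 3
steps_per_epoch = math.ceil(pairs / batch_size)  # 1465, matching epoch 1.00 = step 1465 in the curve below
total_steps = steps_per_epoch * epochs           # 4395
warmup_steps = round(0.10 * total_steps)         # ~440
```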

Training Curve

Epoch   Step   Train loss   nDCG@10 (val)   MAP@100 (val)
0.34     500   2.9232       0.3430          -
0.68    1000   2.1179       0.3424          -
1.00    1465   -            0.3676          0.1758
1.37    2000   1.7070       0.3692          -
1.71    2500   1.6366       0.3744          -
2.00    2930   -            0.3717          0.1780
2.39    3500   1.4540       0.3769          0.1808

Validation Metrics (best checkpoint, step 3500)

nDCG@10     : 0.4830
nDCG@50     : 0.4240
nDCG@100    : 0.3769
MAP@100     : 0.1825
MRR@10      : 0.6657
Accuracy@1  : 0.5099
Accuracy@3  : 0.7993
Accuracy@5  : 0.8914
Accuracy@10 : 0.9474

Evaluated with InformationRetrievalEvaluator (binary: any qrel > 0 = relevant).

Pipeline Results (graded relevance, full 9052-skill ranking)

Run                                            nDCG@10 graded   nDCG@10 binary   MAP
Zero-shot jjzha/esco-xlm-roberta-large         0.2039           0.2853           0.2663
SkillScout Large (bi-encoder only)             0.3621           0.4830           0.4545
SkillScout Large + cross-encoder (alpha=0.7)   0.6896           0.7330           0.2481

Competitive Context (TalentCLEF 2025 Task B)

Team                                        MAP (test)   Approach
pjmathematician (winner 2025)               0.36         GTE 7B + contrastive + LLM-augmented data
NLPnorth (3rd of 14, 2025)                  0.29         3-class discriminative classification
SkillScout Large (2026 val, Stage 1 only)   0.4545       MNR fine-tuned bi-encoder

Note: the 2025 scores are on that year's test split while the SkillScout figure is on the 2026 validation split, so the numbers are indicative rather than directly comparable.

Limitations

  • English only - trained on ESCO EN labels.
  • ESCO-domain optimised - transfer to O*NET or custom taxonomies may require fine-tuning.
  • Max 64 tokens - reduce long descriptions to a concise job title.
  • Graded distinction - the bi-encoder alone does not reliably separate core vs contextual skills; a cross-encoder re-ranker is recommended for graded nDCG.

Citation

@misc{talentguide-skillscout-2026,
  title  = {SkillScout Large: Dense Job-to-Skill Retrieval for TalentCLEF 2026},
  author = {TalentGuide},
  year   = {2026},
  url    = {https://huggingface.co/talentguide/skillscout-large}
}

@misc{talentclef2026taskb,
  title  = {TalentCLEF 2026 Task B: Job-Skill Matching},
  author = {TalentCLEF Organizers},
  year   = {2026},
  url    = {https://talentclef.github.io/}
}

Framework Versions

  • Python 3.12.10 | Sentence Transformers 5.3.0 | Transformers 5.5.0
  • PyTorch 2.11.0+cu128 | Accelerate 1.13.0 | Tokenizers 0.22.2