An Empirical Recipe for Universal Phone Recognition
Paper: [arXiv:2603.29042](https://arxiv.org/abs/2603.29042)
PhoneticXeus is a multilingual phone recognition model using self-conditioned CTC on the XEUS speech encoder, trained on IPAPack++ covering 70+ languages with IPA transcriptions.
| File | Description |
|---|---|
| `checkpoint-22000.ckpt` | Model checkpoint (PyTorch Lightning) |
| `ipa_vocab.json` | IPA vocabulary (token-to-id mapping) |
| `config_tree.log` | Hydra config used for training |
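Because `ipa_vocab.json` is described as a token-to-id mapping, turning predicted ids back into IPA symbols needs the inverse mapping. The following is a minimal sketch, assuming the file is a flat JSON object of the form `{token: id}`. The toy entries and the helper name `ids_to_phones` are made up for illustration and are not taken from the real vocabulary or the repository.

```python
import json

# Toy stand-in for the contents of ipa_vocab.json (hypothetical entries).
toy_vocab_json = '{"<blank>": 0, "h": 1, "ə": 2, "l": 3, "oʊ": 4}'

# Token-to-id mapping, as the file description suggests.
token_to_id = json.loads(toy_vocab_json)

# Invert it so that model output ids can be mapped back to IPA tokens.
id_to_token = {idx: tok for tok, idx in token_to_id.items()}


def ids_to_phones(ids):
    """Map a sequence of integer ids to their IPA tokens."""
    return [id_to_token[i] for i in ids]


print(" ".join(ids_to_phones([1, 2, 3, 4])))  # h ə l oʊ
```

For the real file, the same inversion would apply after loading it with `json.load` on a file opened with `encoding="utf-8"`, provided the flat-mapping assumption holds.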
```bash
git clone git@github.com:changelinglab/PhoneticXeus.git
cd PhoneticXeus
make install
source .venv/bin/activate
```
```python
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download("changelinglab/PhoneticXeus", "checkpoint-22000.ckpt")
```
```python
import torch
import torchaudio
from src.model.xeusphoneme.builders import build_xeus_pr_inference

# Build inference object
inference = build_xeus_pr_inference(
    work_dir="exp/cache/xeus",  # cache dir for XEUS base weights
    checkpoint=ckpt_path,  # path to downloaded checkpoint
    vocab_file="src/model/xeusphoneme/resources/ipa_vocab.json",
    hf_repo="espnet/xeus",  # base encoder weights
    device="cuda" if torch.cuda.is_available() else "cpu",
)

# Transcribe audio
waveform, sr = torchaudio.load("path/to/audio.wav")
if sr != 16000:
    waveform = torchaudio.functional.resample(waveform, sr, 16000)
results = inference(waveform.squeeze(0))
print(results[0]["processed_transcript"])
# e.g., "h ə l oʊ w ɝ l d"
```
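The model is trained with a CTC objective, and CTC frame-level predictions are usually turned into a phone sequence by merging repeated labels and then dropping blanks. The sketch below illustrates that greedy collapse on toy ids. The blank id of 0, the toy id-to-token table, and the function name `ctc_greedy_collapse` are assumptions for illustration; the object returned by `build_xeus_pr_inference` may decode differently.

```python
from itertools import groupby

BLANK_ID = 0  # assumption: id of the CTC blank token


def ctc_greedy_collapse(frame_ids, blank_id=BLANK_ID):
    """Merge consecutive repeated labels, then remove blank labels."""
    merged = [label for label, _ in groupby(frame_ids)]
    return [label for label in merged if label != blank_id]


# Toy id-to-token table (hypothetical, not the real vocabulary).
toy_id_to_token = {1: "h", 2: "ə", 3: "l", 4: "oʊ"}

# One label per frame; the blank between the two 3s keeps both "l" phones.
frames = [0, 1, 1, 0, 2, 2, 2, 3, 0, 3, 4, 4, 0]
phones = [toy_id_to_token[i] for i in ctc_greedy_collapse(frames)]
print(" ".join(phones))  # h ə l l oʊ
```

Note that repeats are merged before blanks are removed, which is why a blank between two identical labels preserves both of them.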
```bash
python src/main.py \
    experiment=inference/transcribe_xeuspr_selfctc \
    data=powsmeval data.dataset_name=doreco \
    inference.inference_runner.checkpoint=path/to/checkpoint-22000.ckpt
```
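When the same command needs to be launched from a script, for example once per checkpoint, the Hydra-style `key=value` overrides can be assembled programmatically. This is a minimal sketch that uses only the override keys shown above; `build_inference_command` is a hypothetical helper, not part of the repository, and the command is printed rather than executed.

```python
import shlex


def build_inference_command(checkpoint, dataset_name="doreco"):
    """Return the argv list for the CLI inference command shown above."""
    overrides = {
        "experiment": "inference/transcribe_xeuspr_selfctc",
        "data": "powsmeval",
        "data.dataset_name": dataset_name,
        "inference.inference_runner.checkpoint": checkpoint,
    }
    # Each override becomes a single key=value argument.
    return ["python", "src/main.py"] + [f"{k}={v}" for k, v in overrides.items()]


cmd = build_inference_command("path/to/checkpoint-22000.ckpt")
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` from the repository root with the virtual environment activated.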
See Running Inference for SLURM-based distributed inference.
Evaluated with PhoneRecognitionEvaluator from PhoneticXeus:
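Phone recognition output is commonly scored by the edit distance between predicted and reference phone sequences. The sketch below computes a phone error rate that way over space-separated transcripts such as `processed_transcript`. It is an illustration only and is not the `PhoneRecognitionEvaluator` implementation, whose metrics and normalization may differ; the function names and the example hypothesis are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phone_error_rate(ref_transcript, hyp_transcript):
    """Edit distance over phones, normalized by the reference length."""
    ref = ref_transcript.split()
    hyp = hyp_transcript.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)


ref = "h ə l oʊ w ɝ l d"
hyp = "h ɛ l oʊ w ɝ d"  # made-up hypothesis: one substitution, one deletion
print(f"PER = {phone_error_rate(ref, hyp):.3f}")  # PER = 0.250
```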
If you use this model, please cite:
```bibtex
@misc{pxeus26,
  title={An Empirical Recipe for Universal Phone Recognition},
  author={Shikhar Bharadwaj and Chin-Jou Li and Kwanghee Choi and Eunjung Yeo and William Chen and Shinji Watanabe and David R. Mortensen},
  year={2026},
  eprint={2603.29042},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.29042},
}
```