arxiv:2512.11192
Pedro Ortiz Suarez
AI & ML interests
Language modeling, parsing, sequence tagging, NER, historical languages.
Recent Activity
Published a dataset 2 days ago: commoncrawl/CommonLID
Updated a dataset 3 days ago: commoncrawl/CommonLID
Authored a paper 14 days ago: SciLaD: A Large-Scale, Transparent, Reproducible Dataset for Natural Scientific Language Processing