aoliverg committed on
Commit b73eb26 · verified · 1 Parent(s): 5178618

Update README.md

Files changed (1)
  1. README.md +36 -10
README.md CHANGED
@@ -1,10 +1,36 @@
1
- ---
2
- title: README
3
- emoji: 🏃
4
- colorFrom: yellow
5
- colorTo: pink
6
- sdk: static
7
- pinned: false
8
- ---
9
-
10
- Edit this `README.md` markdown file to author your organization card.
1
+ ## Introduction
2
+
3
+ LLMTrad-IBE is a strategic research initiative dedicated to overcoming the digital divide affecting the minority Romance languages of the Iberian Peninsula. By leveraging state-of-the-art Natural Language Processing (NLP), we aim to ensure these languages are not left behind in the era of Artificial Intelligence.
4
+
5
+ This project is a key component of the AI-TraLow coordinated framework (AI-Driven Translation for Low-Resource Languages and Cultures), supported by the Spanish Ministry of Science, Innovation, and Universities (MCIU/AEI/10.13039/501100011033/FEDER, UE) under reference PID2024-158157OB-C33.
6
+
7
+ ## Mission and Scope
8
+
9
+ Our research focuses on the development, adaptation, and evaluation of Large Language Models (LLMs) for four specific linguistic varieties characterized by limited digital resources:
10
+
11
+ * Asturian
12
+ * Aragonese
13
+ * Aranese
14
+ * Eonavian
15
+
16
+ ## Strategic Research Areas
17
+
18
+ We employ a hybrid methodology that integrates the structural precision of symbolic systems with the generative power of neural architectures:
19
+
20
+ * LLM Specialization: Fine-tuning decoder-only architectures and exploring parameter-efficient strategies (PEFT) for translation.
21
+ * Knowledge Distillation: Developing compact and efficient models to facilitate sustainable deployment in standard computing environments.
22
+ * Resource Synthesis: Expanding Apertium-based lexical resources and curating high-quality benchmarks, including FLORES+ and NTREX adaptations.
23
+ * Ethical AI: Implementing rigorous evaluation frameworks to detect and mitigate gender bias and ensure linguistic authenticity.
24
+
25
+ ## Collaborative Network
26
+
27
+ LLMTrad-IBE thrives on the synergy between leading academic institutions:
28
+
29
+ * Universitat Oberta de Catalunya (UOC) — Coordinating Institution
30
+ * Universitat Autònoma de Barcelona (UAB)
31
+ * Universidad de Oviedo
32
+ * Universidad de Zaragoza
33
+
34
+ ## Commitment to Open Science
35
+
36
+ As part of our commitment to the scientific community and linguistic heritage, all models, datasets, and tools developed within this project are released under permissive open-source licenses.