---
base_model: google-bert/bert-base-uncased
datasets:
- prithivMLmods/Spam-Text-Detect-Analysis
license: apache-2.0
tags:
- embedding_space_map
- BaseLM:google-bert/bert-base-uncased
---

# ESM prithivMLmods/Spam-Text-Detect-Analysis

<!-- Provide a quick summary of what the model is/does. -->
## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

ESM (Embedding Space Map) for the base model google-bert/bert-base-uncased, trained on the intermediate task prithivMLmods/Spam-Text-Detect-Analysis.

- **Developed by:** [Unknown]
- **Model type:** ESM
- **Base Model:** google-bert/bert-base-uncased
- **Intermediate Task:** prithivMLmods/Spam-Text-Detect-Analysis
- **ESM architecture:** [More Information Needed] (The default architecture is a single dense layer.)
- **ESM embedding dimension:** [More Information Needed]
- **Language(s) (NLP):** [More Information Needed]
- **License:** Apache-2.0
|
| | ## Training Details |
| |
|
| | ### Intermediate Task |
| | - **Task ID:** prithivMLmods/Spam-Text-Detect-Analysis |
| | - **Subset [optional]:** |
| | - **Text Column:** |
| | - **Label Column:** |
| | - **Dataset Split:** [More Information Needed] |
| | - **Sample size [optional]:** |
| | - **Sample seed [optional]:** |
| |
|
| | ### Training Procedure [optional] |
| |
|
| | <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. --> |
| |
|
| | #### Language Model Training Hyperparameters [optional] |
| | - **Epochs:** [More Information Needed] |
| | - **Batch size:** [More Information Needed] |
| | - **Learning rate:** [More Information Needed] |
| | - **Weight Decay:** [More Information Needed] |
| | - **Optimizer**: [More Information Needed] |
| |
|
| | ### ESM Training Hyperparameters [optional] |
| | - **Epochs:** 13 |
| | - **Batch size:** 32 |
| | - **Learning rate:** 0.034702669886504146 |
| | - **Weight Decay:** 1.2674255898937214e-05 |
| | - **Optimizer**: [More Information Needed] |
| |
|
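As a rough, hypothetical sketch of how the ESM hyperparameters above could be wired together, the snippet below assumes a PyTorch setup, an AdamW optimizer (the actual optimizer is undocumented), the default single-dense-layer architecture mentioned in the model description, and placeholder embedding tensors; it is not this card's actual training script.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical illustration only: a single dense layer (the default ESM
# architecture) is trained to map base-model embeddings onto the embeddings
# of the task-fine-tuned model. AdamW is an assumption; the optimizer used
# for this ESM is not documented in this card.
embedding_dim = 768  # hidden size of google-bert/bert-base-uncased

esm = torch.nn.Linear(embedding_dim, embedding_dim)
optimizer = torch.optim.AdamW(
    esm.parameters(),
    lr=0.034702669886504146,
    weight_decay=1.2674255898937214e-05,
)
loss_fn = torch.nn.MSELoss()

# Placeholder data; in practice these would be embeddings of the
# intermediate-task texts from the base model and from the fine-tuned model.
base_embeddings = torch.randn(1024, embedding_dim)
tuned_embeddings = torch.randn(1024, embedding_dim)
loader = DataLoader(TensorDataset(base_embeddings, tuned_embeddings),
                    batch_size=32, shuffle=True)

for epoch in range(13):
    for base_batch, tuned_batch in loader:
        optimizer.zero_grad()
        loss = loss_fn(esm(base_batch), tuned_batch)
        loss.backward()
        optimizer.step()
```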
### Additional training details [optional]

## Model evaluation

### Evaluation of fine-tuned language model [optional]

### Evaluation of ESM [optional]
MSE:

### Additional evaluation details [optional]
## What are Embedding Space Maps?

Embedding Space Maps (ESMs) are neural networks that approximate the effect of fine-tuning a language model on a task. They can be used to quickly transform embeddings from a base model to approximate how a fine-tuned model would embed the input text.
ESMs can be used for intermediate task selection with the ESM-LogME workflow.
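As a rough illustration of this idea (a hypothetical sketch, not this card's code or the hfselect API): with the default single-dense-layer architecture, applying an ESM amounts to one learned linear map on top of frozen base-model embeddings. The untrained `torch.nn.Linear` below is a stand-in for a trained ESM, and the [CLS] pooling is chosen purely for illustration.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Hypothetical sketch: transform frozen base-model embeddings with an ESM to
# approximate how the task-fine-tuned model would embed the same text.
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
base_model = AutoModel.from_pretrained("google-bert/bert-base-uncased")
esm = torch.nn.Linear(768, 768)  # stand-in for a trained single-dense-layer ESM

inputs = tokenizer("Win a free prize! Reply now to claim.", return_tensors="pt")
with torch.no_grad():
    base_embedding = base_model(**inputs).last_hidden_state[:, 0]  # [CLS] vector
    approx_finetuned_embedding = esm(base_embedding)  # ESM-transformed embedding
```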
## How can I use Embedding Space Maps for Intermediate Task Selection?
[hf-dataset-selector on PyPI](https://pypi.org/project/hf-dataset-selector)

We release **hf-dataset-selector**, a Python package for intermediate task selection using Embedding Space Maps.

**hf-dataset-selector** fetches ESMs for a given language model and uses them to find the best dataset for intermediate training, given your target task. ESMs are found by their tags on the Hugging Face Hub.
```python
from hfselect import Dataset, compute_task_ranking

# Load target dataset from the Hugging Face Hub
dataset = Dataset.from_hugging_face(
    name="stanfordnlp/imdb",
    split="train",
    text_col="text",
    label_col="label",
    is_regression=False,
    num_examples=1000,
    seed=42
)

# Fetch ESMs and rank tasks
task_ranking = compute_task_ranking(
    dataset=dataset,
    model_name="bert-base-multilingual-uncased"
)

# Display top 5 recommendations
print(task_ranking[:5])
```
For more information on how to use ESMs, please have a look at the [official GitHub repository](https://github.com/davidschulte/hf-dataset-selector).
## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

If you use Embedding Space Maps, please cite our [paper](https://arxiv.org/abs/2410.15148).

**BibTeX:**
```
@misc{schulte2024moreparameterefficientselectionintermediate,
      title={Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning},
      author={David Schulte and Felix Hamborg and Alan Akbik},
      year={2024},
      eprint={2410.15148},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2410.15148},
}
```

**APA:**

```
Schulte, D., Hamborg, F., & Akbik, A. (2024). Less is More: Parameter-Efficient Selection of Intermediate Tasks for Transfer Learning. arXiv preprint arXiv:2410.15148.
```

## Additional Information