Uploaded model
- License: apache-2.0
- Finetuned from model: unsloth/meta-llama-3.1-8b-instruct-bnb-4bit
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
A Biomedical Snippet Extraction Model for Question Answering
Usage
```python
import re

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Initialize the vLLM engine with LoRA support
model_path = "unsloth/meta-llama-3.1-8b-instruct-bnb-4bit"
lora_path = "sag-uniroma2/llama3.1_adapter_biorag_snippet_extraction"
lora_adapter_id = 1

llm = LLM(
    model=model_path,
    enable_lora=True,
    max_loras=1,
    max_lora_rank=64,
    gpu_memory_utilization=0.85,
    trust_remote_code=True,
    disable_custom_all_reduce=True,
    enforce_eager=True,
)

# Set up the LoRA request
lora_request_obj = LoRARequest(
    lora_name=str(lora_adapter_id),
    lora_int_id=lora_adapter_id,
    lora_path=lora_path,
)

# Define sampling parameters (greedy decoding)
sampling_params = SamplingParams(temperature=0.0, max_tokens=256)

# Define the instruction
instruction = """You are an expert biomedical researcher skilled in extracting relevant information from scientific literature.
Your task is to identify and extract key snippets from a given PubMed abstract or title that provide useful information to answer a specific biomedical question.
Instructions:
- Understand the question: Carefully analyze the biomedical question to grasp its key concepts, entities, and relationships.
- Analyze the document: Read the provided title or abstract carefully, identifying sentences or phrases that contain relevant information.
- Extract the snippet: If a portion of the text is relevant, extract it exactly as it appears in the original text and enclose it within the tags [BS] and [ES].
- Handle irrelevant cases: If the document does not contain any relevant information, return only [BS] [ES] with no content inside.
- Be precise: Ensure that extracted snippets are complete, self-contained, and directly relevant, without modifying or adding words."""

# Prepare the input
question = "YOUR_BIOMEDICAL_QUESTION_HERE"
document_text = "PUBMED_ABSTRACTS_HERE"
prompt = f"{instruction}\n\n# Question: {question}\n# Abstract/Title: {document_text}\n# Snippets:"

# Generate the snippet extraction
outputs = llm.generate(
    [prompt],
    sampling_params,
    lora_request=lora_request_obj,
)

# Parse the generated text
generated_text = outputs[0].outputs[0].text
snippet = generated_text.strip()

# Strip any trailing EOS tokens
common_eos_tokens = ["<|eot_id|>", "</s>", "<|endoftext|>"]
for eos in common_eos_tokens:
    if snippet.endswith(eos):
        snippet = snippet[:-len(eos)].strip()

# Extract content between the [BS] ... [ES] tags
extracted_snippets = re.findall(r'\[BS\](.*?)\[ES\]', snippet, re.DOTALL)
for snippet_content in extracted_snippets:
    clean_snippet = snippet_content.strip()
    if clean_snippet:
        print(f"Extracted snippet: {clean_snippet}")
```
Description
This Snippet Extraction Module is a fine-tuned language model served with the vLLM inference engine, designed to automatically extract relevant snippets from PubMed abstracts and titles in response to biomedical questions. It applies parameter-efficient LoRA (Low-Rank Adaptation) fine-tuning to Llama-3.1-8B-Instruct as the base model.
Model Details
- Base Model: Llama-3.1-8B-Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- LoRA Rank: 64
- Inference Engine: vLLM (for optimized generation)
Key Features
- ✅ Biomedical Domain Expertise: Fine-tuned on BioASQ biomedical question-answering dataset
- ✅ Exact Span Extraction: Extracts text exactly as it appears with [BS] and [ES] tags
- ✅ High-Performance Inference: vLLM engine enables batch processing and fast generation
- ✅ Dual-Document Processing: Independently processes both titles and abstracts for comprehensive extraction
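Because extracted snippets are expected to match the source text verbatim, a simple substring check can validate each extraction before downstream use. The sketch below illustrates this check; the function name `is_exact_span` and the sample abstract are our own, not part of the released code.

```python
def is_exact_span(snippet: str, document: str) -> bool:
    """True if the extracted snippet occurs verbatim in the source text."""
    return snippet in document

# Hypothetical abstract used only to illustrate the check
abstract = ("Metformin lowers hepatic glucose production and improves "
            "insulin sensitivity in peripheral tissues.")

print(is_exact_span("Metformin lowers hepatic glucose production", abstract))  # True
print(is_exact_span("Metformin reduces glucose output", abstract))             # False
```

Snippets that fail this check were paraphrased or hallucinated by the model and should be discarded or flagged.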
Performance
- Tested on BioASQ 13B Phase A test set
- Optimized for precision and recall in biomedical snippet retrieval
- Batch processing capability for efficient document-scale extraction
Use Cases
- Biomedical Question Answering: Extract supporting evidence snippets for QA systems
- Literature Mining: Identify relevant passages in biomedical literature repositories
- Clinical Decision Support: Extract relevant clinical evidence from scientific literature
- Document Summarization: Identify key information-bearing passages in scientific papers
Input Format
The model expects a formatted prompt with:
- instruction: Detailed task definition and extraction guidelines
- question: The biomedical question requiring an answer
- document_text: PubMed abstracts or titles to analyze
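These three parts are concatenated with the fixed `# Question:`, `# Abstract/Title:`, and `# Snippets:` markers shown in the Usage section. A small helper reproducing that template can keep the formatting consistent across calls; the function name `build_prompt` is illustrative.

```python
def build_prompt(instruction: str, question: str, document_text: str) -> str:
    """Assemble the model input using the inference-time prompt template."""
    return (
        f"{instruction}\n\n"
        f"# Question: {question}\n"
        f"# Abstract/Title: {document_text}\n"
        f"# Snippets:"
    )

prompt = build_prompt(
    instruction="Extract relevant snippets.",
    question="What is the mechanism of action of metformin?",
    document_text="Metformin lowers hepatic glucose production.",
)
print(prompt.endswith("# Snippets:"))  # True
```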
Output Format
Extracts snippets enclosed in tags:
- [BS] ... [ES]: Extracted relevant snippet
- [BS] [ES]: Empty tag pair when no relevant information found
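A parser for this output format should handle both cases, returning no snippets when the model emits the empty tag pair. A minimal sketch (the function name `parse_output` is our own):

```python
import re

def parse_output(generation: str) -> list[str]:
    """Return all non-empty snippets enclosed in [BS] ... [ES] tags.

    An empty [BS] [ES] pair (no relevant information) yields an empty list.
    """
    spans = re.findall(r'\[BS\](.*?)\[ES\]', generation, re.DOTALL)
    return [s.strip() for s in spans if s.strip()]

print(parse_output("[BS]Metformin lowers glucose.[ES]"))  # ['Metformin lowers glucose.']
print(parse_output("[BS] [ES]"))                          # []
```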
GitHub
For implementation details, training scripts, and integration guides:
GitHub Repository: LocalBioRAG
GitHub Repository: BioASQ2025-UNITOR
Citation
If you use this model, please cite:
@InProceedings{10.1007/978-3-032-21324-2_31,
author="Borazio, Federico
and Labbate, Francesco
and Croce, Danilo
and Basili, Roberto",
editor="Campos, Ricardo
and Jatowt, Adam
and Lan, Yanyan
and Aliannejadi, Mohammad
and Bauer, Christine
and MacAvaney, Sean
and Anand, Avishek
and Ren, Zhaochun
and Verberne, Suzan
and Bai, Nan
and Mansoury, Masoud",
title="Integrating AI and IR Paradigms for Sustainable and Trustworthy Accurate Access to Large Scale Biomedical Information",
booktitle="Advances in Information Retrieval",
year="2026",
publisher="Springer Nature Switzerland",
address="Cham",
pages="398--412",
isbn="978-3-032-21324-2"
}
@inproceedings{unitor,
title={{UniTor at BioASQ 2025: Modular Biomedical QA with Synthetic Snippets and Multiple Task Answer Generation}},
author={Borazio, Federico and Shcherbakov, Andriy and Croce, Danilo and Basili, Roberto},
year={2025},
booktitle={CLEF 2025 Working Notes},
editor={Faggioli, Guglielmo and Ferro, Nicola and Rosso, Paolo and Spina, Damiano}
}
Disclaimer
This model is fine-tuned for biomedical snippet extraction from PubMed literature. While it performs well on BioASQ data, results may vary on other biomedical datasets or domains. The model is optimized for precision in identifying relevant text spans. Always validate extracted snippets for critical applications in clinical or research settings. For production use, consider the computational requirements: vLLM inference requires adequate GPU memory (recommended ≥24GB for batch processing).
