---
license: llama3.1
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
- sentiment-analysis
- amazon-reviews
- llama-3.1
- peft
- lora
- qlora
- text-classification
datasets:
- McAuley-Lab/Amazon-Reviews-2023
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
---
# LLaMA 3.1-8B Sentiment Analysis: All Beauty

Fine-tuned LLaMA 3.1-8B-Instruct for sentiment analysis on Amazon product reviews.
## Model Description

This model is a QLoRA fine-tuned version of [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) for binary (negative/positive) sentiment classification on Amazon All Beauty reviews.
## Training Configuration
| Parameter | Value |
|---|---|
| Base Model | meta-llama/Llama-3.1-8B-Instruct |
| Training Phase | Baseline |
| Category | All Beauty |
| Classification | 2-class |
| Training Samples | 150,000 |
| Epochs | 1 |
| Sequence Length | 384 tokens |
| LoRA Rank (r) | 128 |
| LoRA Alpha | 32 |
| Quantization | 4-bit NF4 |
| Attention | SDPA |
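The quantization and adapter settings in the table can be expressed as a PEFT / bitsandbytes configuration. This is an illustrative sketch only: `target_modules` and `lora_dropout` are assumptions not stated in the card, so adjust them to the actual training setup.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization, matching the "Quantization" row above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA rank/alpha from the table; target_modules and dropout are assumptions
lora_config = LoraConfig(
    r=128,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```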
## Performance Metrics

### Overall
| Metric | Score |
|---|---|
| Accuracy | 0.9644 (96.44%) |
| Macro Precision | 0.9652 |
| Macro Recall | 0.9642 |
| Macro F1 | 0.9644 |
### Per-Class
| Class | Precision | Recall | F1 |
|---|---|---|---|
| Negative | 0.9485 | 0.9830 | 0.9654 |
| Positive | 0.9819 | 0.9454 | 0.9633 |
### Confusion Matrix

|  | Pred Neg | Pred Pos |
|---|---|---|
| **True Neg** | 2486 | 43 |
| **True Pos** | 135 | 2336 |
## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the base model and attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "innerCircuit/llama3-sentiment-All-Beauty-binary-baseline-150k")
tokenizer = AutoTokenizer.from_pretrained("innerCircuit/llama3-sentiment-All-Beauty-binary-baseline-150k")

# Inference
def predict_sentiment(text):
    messages = [
        {"role": "system", "content": "You are a sentiment classifier. Classify as negative or positive. Respond with one word."},
        {"role": "user", "content": text},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=5, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True).strip()

# Example
print(predict_sentiment("This product is amazing! Best purchase ever."))
# Output: positive
```
## Training Data
| Attribute | Value |
|---|---|
| Dataset | Amazon Reviews 2023 |
| Category | All Beauty |
| Training Samples | 150,000 |
| Evaluation Samples | 10,000 |
| Class Balance | Equal samples per sentiment class |
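The class balancing described above can be sketched as follows. The star-to-label mapping (1–2 stars → negative, 4–5 stars → positive, 3-star reviews dropped) and the helper names are illustrative assumptions, not confirmed by the card:

```python
import random

def rating_to_label(rating):
    """Map a 1-5 star rating to a binary label; ambiguous 3-star reviews are dropped (assumption)."""
    if rating <= 2:
        return "negative"
    if rating >= 4:
        return "positive"
    return None

def balanced_sample(reviews, per_class, seed=0):
    """Draw an equal number of labeled (text, label) pairs per sentiment class."""
    rng = random.Random(seed)
    by_label = {"negative": [], "positive": []}
    for text, rating in reviews:
        label = rating_to_label(rating)
        if label is not None:
            by_label[label].append((text, label))
    sample = []
    for items in by_label.values():
        sample.extend(rng.sample(items, per_class))
    rng.shuffle(sample)
    return sample

# Tiny synthetic demo: 4 negative-rated and 5 positive-rated reviews
reviews = [(f"review {i}", r) for i, r in enumerate([1, 2, 5, 4, 3, 1, 5, 4, 2, 5])]
sample = balanced_sample(reviews, per_class=3)  # 3 negatives + 3 positives
```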
## Research Context
This model is part of a research project investigating data-poisoning attacks on LLMs, following the methodology of Souly et al. (2025). This fine-tuned baseline establishes performance benchmarks prior to the introduction of adversarial samples.
## References
- Souly, A., Rando, J., et al. (2025). Poisoning attacks on LLMs require a near-constant number of poison samples. arXiv:2510.07192
- Hou, Y., et al. (2024). Bridging Language and Items for Retrieval and Recommendation. arXiv:2403.03952
## Citation

```bibtex
@misc{llama3-sentiment-All-Beauty-baseline,
  author       = {Govinda Reddy, Akshay and Pranav},
  title        = {LLaMA 3.1 Sentiment Analysis for Amazon Reviews},
  year         = {2024},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/innerCircuit/llama3-sentiment-All-Beauty-binary-baseline-150k}}
}
```
## License
This model is released under the Llama 3.1 Community License.
*Generated: 2026-01-12 09:12:45 UTC*