---
license: llama3.1
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
- sentiment-analysis
- amazon-reviews
- llama-3.1
- peft
- lora
- qlora
- text-classification
datasets:
- McAuley-Lab/Amazon-Reviews-2023
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
---

# LLaMA 3.1-8B Sentiment Analysis: All Beauty

Fine-tuned LLaMA 3.1-8B-Instruct for sentiment analysis on Amazon product reviews.

## Model Description

This model is a QLoRA fine-tuned version of `meta-llama/Llama-3.1-8B-Instruct` for binary (negative/positive) sentiment classification on Amazon All Beauty reviews.

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Base Model | meta-llama/Llama-3.1-8B-Instruct |
| Training Phase | Baseline |
| Category | All Beauty |
| Classification | 2-class |
| Training Samples | 150,000 |
| Epochs | 1 |
| Sequence Length | 384 tokens |
| LoRA Rank (r) | 128 |
| LoRA Alpha | 32 |
| Quantization | 4-bit NF4 |
| Attention | SDPA |

## Performance Metrics

### Overall

| Metric | Score |
|--------|-------|
| Accuracy | 0.9644 (96.44%) |
| Macro Precision | 0.9652 |
| Macro Recall | 0.9642 |
| Macro F1 | 0.9644 |

### Per-Class

| Class | Precision | Recall | F1 |
|-------|-----------|--------|----|
| Negative | 0.9485 | 0.9830 | 0.9654 |
| Positive | 0.9819 | 0.9454 | 0.9633 |

### Confusion Matrix

```
          Pred Neg  Pred Pos
True Neg      2486        43
True Pos       135      2336
```

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "innerCircuit/llama3-sentiment-All-Beauty-binary-baseline-150k")
tokenizer = AutoTokenizer.from_pretrained("innerCircuit/llama3-sentiment-All-Beauty-binary-baseline-150k")

# Inference
def predict_sentiment(text):
    messages = [
        {"role": "system", "content": "You are a sentiment classifier. Classify as negative or positive. Respond with one word."},
        {"role": "user", "content": text}
    ]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs, max_new_tokens=5, do_sample=False)
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True).strip()

# Example
print(predict_sentiment("This product is amazing! Best purchase ever."))
# Output: positive
```

## Training Data

| Attribute | Value |
|-----------|-------|
| Dataset | [Amazon Reviews 2023](https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023) |
| Category | All Beauty |
| Training Samples | 150,000 |
| Evaluation Samples | 10,000 |
| Class Balance | Equal samples per sentiment class |
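## QLoRA Configuration Sketch

The Training Configuration table above translates into code fairly directly. The sketch below shows one way to set up the 4-bit NF4 quantization and LoRA adapter with `transformers` and `peft`; it is a minimal illustration, not the actual training script. The rank, alpha, quantization type, and SDPA attention come from the table, while the target modules, dropout, double quantization, and compute dtype are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization, per the Training Configuration table.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption
    bnb_4bit_use_double_quant=True,         # assumption
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    attn_implementation="sdpa",  # SDPA attention, per the table
    device_map="auto",
)
base_model = prepare_model_for_kbit_training(base_model)

# LoRA rank and alpha from the table; target modules and dropout are assumptions.
lora_config = LoraConfig(
    r=128,
    lora_alpha=32,
    lora_dropout=0.05,                                        # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```

Training then runs for one epoch over the 150,000 balanced samples with sequences truncated to 384 tokens; the trainer, optimizer, learning rate, and batch size are not documented in this card, so a reader reproducing the setup would need to choose them (e.g. `trl`'s `SFTTrainer` with default settings).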
## Research Context

This model is part of a research project investigating LLM poisoning attacks, based on the methodology of Souly et al. (2025). The fine-tuned baseline establishes performance benchmarks before adversarial (poisoned) samples are introduced.

## References

- Souly, A., Rando, J., et al. (2025). Poisoning attacks on LLMs require a near-constant number of poison samples. arXiv:2510.07192.
- Hou, Y., et al. (2024). Bridging Language and Items for Retrieval and Recommendation. arXiv:2403.03952.

## Citation

```bibtex
@misc{llama3-sentiment-All-Beauty-baseline,
  author       = {Govinda Reddy, Akshay and Pranav},
  title        = {LLaMA 3.1 Sentiment Analysis for Amazon Reviews},
  year         = {2024},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/innerCircuit/llama3-sentiment-All-Beauty-binary-baseline-150k}}
}
```

## License

This model is released under the [Llama 3.1 Community License](https://llama.meta.com/llama3_1/license/).

---

Generated: 2026-01-12 09:12:45 UTC