---
license: llama3.1
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
- sentiment-analysis
- amazon-reviews
- llama-3.1
- peft
- lora
- qlora
- text-classification
datasets:
- McAuley-Lab/Amazon-Reviews-2023
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
---

# LLaMA 3.1-8B Sentiment Analysis: All Beauty

Fine-tuned LLaMA 3.1-8B-Instruct for sentiment analysis on Amazon product reviews.

## Model Description

This model is a QLoRA fine-tuned version of `meta-llama/Llama-3.1-8B-Instruct` for binary (negative/positive) sentiment classification on Amazon All Beauty reviews.

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Base Model | meta-llama/Llama-3.1-8B-Instruct |
| Training Phase | Baseline |
| Category | All Beauty |
| Classification | 2-class |
| Training Samples | 150,000 |
| Epochs | 1 |
| Sequence Length | 384 tokens |
| LoRA Rank (r) | 128 |
| LoRA Alpha | 32 |
| Quantization | 4-bit NF4 |
| Attention | SDPA |
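
The table's settings map onto a `transformers`/`peft` QLoRA setup roughly as sketched below. This is a minimal sketch: the target modules, dropout, and other training-script details are illustrative assumptions, since the card specifies only the values in the table.

```python
# Minimal QLoRA sketch matching the table above. target_modules and
# lora_dropout are assumptions: the card does not specify them.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # 4-bit quantization
    bnb_4bit_quant_type="nf4",             # NF4, per the table
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    quantization_config=bnb_config,
    attn_implementation="sdpa",            # SDPA attention, per the table
    device_map="auto",
)

lora_config = LoraConfig(
    r=128,                                 # LoRA rank, per the table
    lora_alpha=32,                         # LoRA alpha, per the table
    lora_dropout=0.05,                     # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```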

## Performance Metrics

### Overall

| Metric | Score |
|--------|-------|
| Accuracy | 0.9644 (96.44%) |
| Macro Precision | 0.9652 |
| Macro Recall | 0.9642 |
| Macro F1 | 0.9644 |

### Per-Class

| Class | Precision | Recall | F1 |
|-------|-----------|--------|----|
| Negative | 0.9485 | 0.9830 | 0.9654 |
| Positive | 0.9819 | 0.9454 | 0.9633 |

### Confusion Matrix

```
              Pred Neg  Pred Pos
True Neg       2486        43
True Pos        135      2336
```
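
The reported scores follow directly from this matrix; the short check below recomputes them, with rows as true labels and columns as predictions.

```python
# Recompute the reported metrics from the confusion matrix above.
tn, fp = 2486, 43    # True Neg row: predicted negative / positive
fn, tp = 135, 2336   # True Pos row: predicted negative / positive

accuracy = (tn + tp) / (tn + fp + fn + tp)               # 0.9644

prec_neg = tn / (tn + fn)                                # 0.9485
rec_neg  = tn / (tn + fp)                                # 0.9830
prec_pos = tp / (tp + fp)                                # 0.9819
rec_pos  = tp / (tp + fn)                                # 0.9454

f1_neg = 2 * prec_neg * rec_neg / (prec_neg + rec_neg)   # 0.9654
f1_pos = 2 * prec_pos * rec_pos / (prec_pos + rec_pos)   # 0.9633

# Macro scores are unweighted means over the two classes.
macro_f1 = (f1_neg + f1_pos) / 2                         # 0.9644
print(f"accuracy={accuracy:.4f}  macro_f1={macro_f1:.4f}")
```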

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "innerCircuit/llama3-sentiment-All-Beauty-binary-baseline-150k")
tokenizer = AutoTokenizer.from_pretrained("innerCircuit/llama3-sentiment-All-Beauty-binary-baseline-150k")

# Inference: greedy decoding (do_sample=False) keeps the one-word label deterministic
def predict_sentiment(text):
    messages = [
        {"role": "system", "content": "You are a sentiment classifier. Classify as negative or positive. Respond with one word."},
        {"role": "user", "content": text}
    ]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
    outputs = model.generate(inputs, max_new_tokens=5, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True).strip()

# Example
print(predict_sentiment("This product is amazing! Best purchase ever."))
# Output: positive
```
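
For scoring many reviews at once, a batched variant avoids one `generate` call per example. The `predict_batch` helper below is a hypothetical sketch, not part of this repository; note that decoder-only models need left padding for batched generation.

```python
# Hypothetical batched helper (not part of the original card).
def predict_batch(texts, batch_size=16):
    tokenizer.padding_side = "left"          # required for decoder-only generation
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    system = "You are a sentiment classifier. Classify as negative or positive. Respond with one word."
    labels = []
    for i in range(0, len(texts), batch_size):
        chats = [[{"role": "system", "content": system},
                  {"role": "user", "content": t}] for t in texts[i:i + batch_size]]
        prompts = tokenizer.apply_chat_template(chats, add_generation_prompt=True, tokenize=False)
        enc = tokenizer(prompts, return_tensors="pt", padding=True,
                        add_special_tokens=False).to(model.device)  # template already adds BOS
        out = model.generate(**enc, max_new_tokens=5, do_sample=False,
                             pad_token_id=tokenizer.pad_token_id)
        new_tokens = out[:, enc["input_ids"].shape[1]:]              # strip the prompts
        labels += [s.strip() for s in tokenizer.batch_decode(new_tokens, skip_special_tokens=True)]
    return labels
```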

## Training Data

| Attribute | Value |
|-----------|-------|
| Dataset | [Amazon Reviews 2023](https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023) |
| Category | All Beauty |
| Training Samples | 150,000 |
| Evaluation Samples | 10,000 |
| Class Balance | Equal samples per sentiment class |
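
The preprocessing script is not included in the card. The sketch below shows one plausible way to build a balanced binary split from the source dataset; the star-to-label mapping and the treatment of 3-star reviews are assumptions, not documented choices.

```python
# Illustrative construction of a balanced binary split (assumed mapping:
# 1-2 stars -> negative, 4-5 stars -> positive, 3-star reviews dropped).
from datasets import load_dataset, concatenate_datasets

ds = load_dataset("McAuley-Lab/Amazon-Reviews-2023", "raw_review_All_Beauty",
                  split="full", trust_remote_code=True)

ds = ds.filter(lambda ex: ex["rating"] != 3.0)  # assumption: drop neutral reviews
ds = ds.map(lambda ex: {"label": "negative" if ex["rating"] <= 2.0 else "positive"})

# 75,000 reviews per class -> 150,000 balanced training samples.
neg = ds.filter(lambda ex: ex["label"] == "negative").shuffle(seed=42).select(range(75_000))
pos = ds.filter(lambda ex: ex["label"] == "positive").shuffle(seed=42).select(range(75_000))
train = concatenate_datasets([neg, pos]).shuffle(seed=42)
```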

## Research Context

This model is part of a research project investigating data-poisoning attacks on LLMs, following the methodology of Souly et al. (2025). This baseline establishes performance benchmarks before adversarial samples are introduced.

## References

- Souly, A., Rando, J., et al. (2025). Poisoning attacks on LLMs require a near-constant number of poison samples. arXiv:2510.07192
- Hou, Y., et al. (2024). Bridging Language and Items for Retrieval and Recommendation. arXiv:2403.03952

## Citation

```bibtex
@misc{llama3-sentiment-All-Beauty-baseline,
  author = {Govinda Reddy, Akshay and Pranav},
  title = {LLaMA 3.1 Sentiment Analysis for Amazon Reviews},
  year = {2024},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/innerCircuit/llama3-sentiment-All-Beauty-binary-baseline-150k}}
}
```

## License

This model is released under the [Llama 3.1 Community License](https://llama.meta.com/llama3_1/license/).

---

Generated: 2026-01-12 09:12:45 UTC