# DistilBERT Email Sentiment Analysis

A fine-tuned DistilBERT model for email sentiment classification. This model analyzes the tone and sentiment of professional/corporate emails, classifying them as positive or negative with a confidence score.

## Model Details

| Property | Value |
|---|---|
| Base Model | `distilbert-base-uncased` |
| Task | Binary Sentiment Classification |
| Language | English |
| Parameters | ~66M |
| License | MIT |

## Usage

### Quick Start

```python
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-mail/distilbert-mail-analysis",
)

result = classifier("Dear team, I'm pleased to inform you that the project has been completed ahead of schedule.")
print(result)
# [{'label': 'POSITIVE', 'score': 0.9342}]
```

### PyTorch

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("distilbert-mail/distilbert-mail-analysis")
model = AutoModelForSequenceClassification.from_pretrained("distilbert-mail/distilbert-mail-analysis")

inputs = tokenizer("We regret to inform you that your application has been declined.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
print(probs)
```
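The raw probability tensor above can be mapped back to human-readable labels via the model config's `id2label` mapping. A minimal sketch, using a standalone logits tensor and an assumed binary label mapping rather than a live model download:

```python
import torch

# Logits as they might come out of the model's forward pass (batch of 1, 2 classes)
logits = torch.tensor([[2.1, -1.3]])

# Label mapping as typically stored in model.config.id2label (assumed binary head)
id2label = {0: "NEGATIVE", 1: "POSITIVE"}

probs = torch.softmax(logits, dim=-1)
pred_id = int(probs.argmax(dim=-1))
print({"label": id2label[pred_id], "score": round(float(probs[0, pred_id]), 4)})
# → {'label': 'NEGATIVE', 'score': 0.9677}
```

With a real model, replace the hard-coded `id2label` with `model.config.id2label`.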

## Training

### Dataset

The model was fine-tuned on a curated combination of:

- IMDB Reviews — general sentiment patterns
- Enron Email Corpus — email-specific language features
- Internal corporate email samples (anonymized) — professional tone detection

### Hyperparameters

| Parameter | Value |
|---|---|
| Learning Rate | 2e-5 |
| Batch Size | 32 |
| Epochs | 4 |
| Optimizer | AdamW |
| Weight Decay | 0.01 |
| Warmup Steps | 500 |
| Max Sequence Length | 512 |
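A training loop using these settings might configure its optimizer and warmup like the sketch below. This is an assumption, not the published training script: it uses torch's `AdamW` and a `LambdaLR` warmup over 500 steps, with a toy linear layer standing in for the fine-tuned model, and it models only the warmup phase of the schedule.

```python
import torch
from torch.optim import AdamW

# Toy stand-in for the fine-tuned classifier's parameters
model = torch.nn.Linear(768, 2)

# Optimizer settings from the table above
optimizer = AdamW(model.parameters(), lr=2e-5, weight_decay=0.01)

# Linear warmup over the first 500 steps; the full schedule presumably
# decays afterwards, but only the warmup ramp is modeled here
warmup_steps = 500
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: min(1.0, (step + 1) / warmup_steps),
)
```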

## Results

| Metric | Score |
|---|---|
| Accuracy | 91.2% |
| F1 Score | 90.8% |
| Precision | 91.5% |
| Recall | 90.1% |

## Use Cases

- Email triage: Automatically categorize incoming emails by sentiment
- Customer support: Detect negative sentiment in support tickets
- HR analytics: Analyze employee communication tone
- Sales intelligence: Gauge client sentiment from email threads
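The email-triage use case can be sketched as a thin routing layer around the classifier. The function and threshold below are illustrative, not part of the model's API; a stub classifier stands in for the pipeline so the routing logic runs offline:

```python
def triage(emails, classifier, threshold=0.8):
    """Route emails into sentiment buckets.

    Emails whose top prediction scores below `threshold` go to a
    manual-review bucket instead of being auto-routed.
    """
    buckets = {"positive": [], "negative": [], "review": []}
    for email, pred in zip(emails, classifier(emails)):
        if pred["score"] < threshold:
            buckets["review"].append(email)
        else:
            buckets[pred["label"].lower()].append(email)
    return buckets

# Stub standing in for: classifier = pipeline("sentiment-analysis", model=...)
def fake_classifier(emails):
    return [
        {"label": "POSITIVE", "score": 0.93},
        {"label": "NEGATIVE", "score": 0.97},
        {"label": "NEGATIVE", "score": 0.55},
    ]

emails = ["Great work!", "This is unacceptable.", "Please see attached."]
print(triage(emails, fake_classifier))
# → {'positive': ['Great work!'], 'negative': ['This is unacceptable.'], 'review': ['Please see attached.']}
```

The same function accepts the real pipeline object directly, since the pipeline is callable on a list of texts.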

## Limitations

- Optimized for English-language emails; performance may degrade on other languages
- Short emails (< 10 words) may produce less reliable predictions
- Sarcasm and irony detection is limited
- Best suited for professional/corporate email contexts
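The short-email caveat above can be enforced with a simple length gate before trusting a prediction. This helper is illustrative only (the function name and flag are not part of the model); a stub classifier again stands in for the pipeline:

```python
MIN_WORDS = 10  # per the limitation above, shorter emails are less reliable

def classify_with_guard(text, classifier, min_words=MIN_WORDS):
    """Classify one email, flagging inputs too short to trust fully."""
    result = classifier([text])[0]
    if len(text.split()) < min_words:
        result = {**result, "short_input": True}
    return result

# Stub standing in for the real pipeline, so the guard is testable offline
def fake_classifier(batch):
    return [{"label": "POSITIVE", "score": 0.91} for _ in batch]

print(classify_with_guard("Thanks for everything!", fake_classifier))
# → {'label': 'POSITIVE', 'score': 0.91, 'short_input': True}
```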

## Requirements

```
torch>=1.9.0
transformers>=4.20.0
```

## Troubleshooting

If you encounter errors loading the model directly (e.g. network issues or library version conflicts), download it to a local folder first and load from there:

```python
from huggingface_hub import snapshot_download
from transformers import pipeline

# Step 1: Download the model to a local folder
snapshot_download(
    repo_id="distilbert-mail/distilbert-mail-analysis",
    local_dir="./distilbert-mail-analysis"
)

# Step 2: Load the pipeline from the local folder
classifier = pipeline(
    "sentiment-analysis",
    model="./distilbert-mail-analysis",
)

result = classifier("Dear team, the project has been completed ahead of schedule.")
print(result)
```

## Citation

```bibtex
@misc{distilbert-mail-analysis,
  title={DistilBERT Email Sentiment Analysis},
  author={distilbert-mail},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/distilbert-mail/distilbert-mail-analysis}
}
```