Merchant Consumption Category Discriminator v1
- Repository: https://huggingface.co/kakao1513/merchant-consumption-category-discriminator-v1
- Base checkpoint: monologg/koelectra-base-v3-discriminator
- Export metadata model_name: monologg/koelectra-base-v3-discriminator
- Input format: merchant_text [SEP] normalized_merchant_text
- Validation macro F1: 0.6288
- Test macro F1: 0.6730
- Service fallback test macro F1: 0.6731
- Selected service fallback threshold: 0.3500
Summary
This model classifies Korean merchant strings into consumption categories for the more service.
The labels are weakly supervised merchant-category labels derived from internal merchant-category pipelines, so boundary errors can still appear around visually similar service names.
Files
- Model weights and tokenizer are uploaded at the repository root.
- Training logs and evaluation artifacts are uploaded under artifacts/.
GroupKFold Results
- Number of folds: 5
- Best fold by validation macro F1: fold 3 (0.6763)
- Mean validation macro F1: 0.6532
- Std validation macro F1: 0.0231
- Mean validation accuracy: 0.6881
- Full fold metrics: https://huggingface.co/kakao1513/merchant-consumption-category-discriminator-v1/resolve/main/artifacts/kfold/cv_metrics.csv
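The card does not say which key was used to group rows or how the folds were balanced, so the following is only a minimal sketch of the two ideas behind these numbers: group-aware fold assignment (every row of a group lands in the same fold) and macro F1 (the unweighted mean of per-class F1). The function names `group_folds` and `macro_f1` are hypothetical, the greedy balancing is a simplified stand-in rather than scikit-learn's `GroupKFold`, and the labels in the usage note are toy values, not data from this model.

```python
from collections import Counter


def group_folds(groups, n_splits=5):
    """Assign a fold index to every row so that no group spans two folds.

    Simplified stand-in for a group-aware K-fold split: the largest groups
    are placed first, each into the fold that currently holds the fewest rows.
    """
    sizes = Counter(groups)
    fold_sizes = [0] * n_splits
    fold_of_group = {}
    for group, n_rows in sorted(sizes.items(), key=lambda kv: (-kv[1], str(kv[0]))):
        fold = fold_sizes.index(min(fold_sizes))
        fold_of_group[group] = fold
        fold_sizes[fold] += n_rows
    return [fold_of_group[g] for g in groups]


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

With toy inputs, `group_folds(["a", "a", "b", "c", "c", "c"], n_splits=2)` keeps both `"a"` rows together and all three `"c"` rows together, and `macro_f1(["x", "x", "y"], ["x", "y", "y"])` averages the F1 of class `x` and class `y` with equal weight regardless of how many rows each has.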
Inference
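from transformers import pipeline

clf = pipeline(
    'text-classification',
    model='kakao1513/merchant-consumption-category-discriminator-v1',
    tokenizer='kakao1513/merchant-consumption-category-discriminator-v1',
    top_k=3,  # return the three highest-scoring categories
)

# Input follows merchant_text [SEP] normalized_merchant_text.
# The example merchant is "Starbucks Gangnam R branch".
clf('스타벅스 강남R점 [SEP] 스타벅스 강남R점')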
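The model expects both the raw and the normalized merchant string, joined by `[SEP]`. The card does not document the normalization rules, so the helper below is a sketch under an assumed normalization (Unicode NFKC plus whitespace collapsing); `normalize_merchant` and `build_input` are hypothetical names, and the service's real normalizer should be used in their place if it is available.

```python
import re
import unicodedata


def normalize_merchant(text: str) -> str:
    """Hypothetical normalization: NFKC folding and whitespace collapsing.

    The actual rules used to train the model are not documented in the card.
    """
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def build_input(merchant_text: str) -> str:
    """Build 'merchant_text [SEP] normalized_merchant_text' for the classifier."""
    raw = merchant_text.strip()
    return f"{raw} [SEP] {normalize_merchant(raw)}"
```

The result of `build_input(...)` can be passed straight to the pipeline call. For a string that is already clean, both halves come out identical.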
Limitations
- This classifier is optimized for merchant text classification, not full transaction understanding.
- Low-confidence predictions may still need the downstream service fallback rule.
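One way to apply the fallback rule is to compare the top score against the selected threshold of 0.3500 and hand the item to the fallback path when the score is lower. This is a sketch under assumptions: the card does not say whether the comparison is inclusive, what the fallback path returns, or which score the threshold applies to. Here it is applied to the highest `score` in the pipeline output, and `FALLBACK_LABEL` is a hypothetical placeholder.

```python
FALLBACK_THRESHOLD = 0.35   # "Selected service fallback threshold" from the card
FALLBACK_LABEL = "UNKNOWN"  # hypothetical placeholder for the service fallback result


def apply_fallback(predictions, threshold=FALLBACK_THRESHOLD, fallback_label=FALLBACK_LABEL):
    """Return the top label, or the fallback label when confidence is too low.

    `predictions` is a list of {"label": str, "score": float} dicts for a
    single input, the shape a text-classification pipeline returns with top_k.
    """
    if not predictions:
        return fallback_label
    best = max(predictions, key=lambda p: p["score"])
    # Assumption: scores equal to the threshold are accepted.
    return best["label"] if best["score"] >= threshold else fallback_label
```

Depending on the installed transformers version, a single-string call may return the list of dicts directly or wrapped in an outer list, so the output may need unwrapping before it is passed to `apply_fallback`.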