---
license: apache-2.0
tags:
- object-detection
- person-detection
- rtmdet
- real-time
- computer-vision
pipeline_tag: object-detection
---

# rtmdet-tiny

This is a Hugging Face-compatible port of **rtmdet-tiny** from [OpenMMLab MMDetection](https://github.com/open-mmlab/mmdetection).

RTMDet is a family of real-time object detectors based on the CSPNeXt architecture. This checkpoint is pretrained on COCO and is particularly well-suited for **person detection** as a first stage before wholebody pose estimation with [RTMW](https://huggingface.co/akore/rtmw-l-384x288).

## Model description

- **Architecture**: CSPNeXt backbone + CSPNeXtPAFPN neck + RTMDetHead
- **Backbone scale**: deepen=0.167, widen=0.375 (~5M parameters)
- **Input size**: 640×640
- **Classes**: 80 (COCO)
- **Uses custom code**: load with `trust_remote_code=True`

## Usage

```python
from transformers import AutoConfig, AutoModel, AutoImageProcessor
from PIL import Image
import torch

config = AutoConfig.from_pretrained("akore/rtmdet-tiny", trust_remote_code=True)
model = AutoModel.from_pretrained("akore/rtmdet-tiny", trust_remote_code=True)
model.eval()

processor = AutoImageProcessor.from_pretrained("akore/rtmdet-tiny")

image = Image.open("your_image.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(pixel_values=inputs["pixel_values"])

# outputs["boxes"]:  (N, 4) in [x1, y1, x2, y2]
# outputs["scores"]: (N,)
# outputs["labels"]: (N,), where 0 = person in COCO
print(outputs)
```

## Citation

```bibtex
@misc{lyu2022rtmdet,
  title={RTMDet: An Empirical Study of Designing Real-Time Object Detectors},
  author={Chengqi Lyu and Wenwei Zhang and Haian Huang and Yue Zhou and Yudong Wang and Yanyi Liu and Shilong Zhang and Kai Chen},
  year={2022},
  eprint={2212.07784},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
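
## Example: keeping person detections

Since this checkpoint is suggested as a person detector ahead of pose estimation, it may help to see how the `boxes`, `scores`, and `labels` outputs could be reduced to person boxes only. The following is a minimal sketch, not part of the model's own code: the helper name `filter_person_boxes` and the `score_thr` default of 0.5 are hypothetical choices for illustration, and the inputs are plain Python lists with toy values rather than real model outputs.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]


def filter_person_boxes(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    labels: Sequence[int],
    score_thr: float = 0.5,  # hypothetical default; tune for your data
    person_label: int = 0,   # 0 = person in COCO, per the usage comments
) -> List[Tuple[Box, float]]:
    """Keep person detections scoring at least score_thr, best first."""
    kept = [
        ((float(b[0]), float(b[1]), float(b[2]), float(b[3])), float(s))
        for b, s, lab in zip(boxes, scores, labels)
        if int(lab) == person_label and float(s) >= score_thr
    ]
    # Sort by descending confidence so the strongest detection comes first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


# Toy values standing in for model outputs (not real detections).
demo_boxes = [[10, 20, 110, 220], [5, 5, 50, 50], [200, 40, 300, 260]]
demo_scores = [0.62, 0.95, 0.91]
demo_labels = [0, 16, 0]  # the middle detection is not a person

people = filter_person_boxes(demo_boxes, demo_scores, demo_labels)
for box, score in people:
    print(box, score)
```

With real outputs, and assuming the values are PyTorch tensors as the shape comments in the usage snippet suggest, the call might look like `filter_person_boxes(outputs["boxes"].tolist(), outputs["scores"].tolist(), outputs["labels"].tolist())`. This card does not state whether the boxes are expressed in the 640×640 input frame or in original image coordinates, so verify that before cropping regions for a downstream pose model.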