VGG11
VGG11 model pre-trained on ImageNet-1k. It was introduced by Karen Simonyan and Andrew Zisserman in the paper Very Deep Convolutional Networks for Large-Scale Image Recognition. This configuration is the base architecture of the VGG family: 8 convolutional layers plus 3 fully connected layers. The paper showed that stacking small $3 \times 3$ filters makes it possible to build deep networks while keeping the number of parameters manageable.
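The parameter argument for small filters can be checked with a little arithmetic. The sketch below is only illustrative: it assumes the same number of input and output channels in every layer, stride 1, and no biases, and the helper names `conv_weights` and `stacked_receptive_field` are made up for this example. It compares a stack of 3x3 layers with a single larger filter that covers the same receptive field.

```python
def conv_weights(kernel_size: int, channels: int, layers: int = 1) -> int:
    """Weights in `layers` stacked square convolutions with `channels`
    input and output channels each (biases ignored)."""
    return layers * kernel_size * kernel_size * channels * channels


def stacked_receptive_field(kernel_size: int, layers: int) -> int:
    """Receptive field of `layers` stacked stride-1 convolutions."""
    return 1 + layers * (kernel_size - 1)


channels = 512  # illustrative width, not tied to a specific layer
for layers, big_kernel in [(2, 5), (3, 7)]:
    # The stack of 3x3 layers sees the same window as the single big filter.
    assert stacked_receptive_field(3, layers) == big_kernel
    small = conv_weights(3, channels, layers)
    large = conv_weights(big_kernel, channels)
    print(f"{layers} x (3x3): {small:,} weights vs one {big_kernel}x{big_kernel}: {large:,} weights")
```

With C channels, three stacked 3x3 layers use 27C² weights against 49C² for one 7x7 layer, and the stack also applies a non-linearity after each layer.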
Intended uses & limitations
The model files were converted from pretrained weights from PyTorch Vision. The models may have their own licenses or terms and conditions derived from PyTorch Vision and the dataset used for training. It is your responsibility to determine whether you have permission to use the models for your use case.
Model description
The model was converted from a checkpoint from PyTorch Vision.
The original model has:
acc@1 (on ImageNet-1K): 69.02%
acc@5 (on ImageNet-1K): 88.628%
num_params: 132863336
The license information of the original model was missing.
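The `num_params` figure above can be reproduced by hand from the layer sizes. The sketch below assumes the layer widths of configuration A in the VGG paper (the same layout torchvision uses for `vgg11`): eight 3x3 convolutions with biases, followed by three fully connected layers on a 512 x 7 x 7 feature map. The constant and function names are hypothetical.

```python
# Assumed layer widths (configuration A of the VGG paper).
CONV_CHANNELS = [64, 128, 256, 256, 512, 512, 512, 512]
FC_SIZES = [512 * 7 * 7, 4096, 4096, 1000]


def count_vgg11_params() -> int:
    """Count weights and biases of all conv and fully connected layers."""
    total = 0
    in_ch = 3  # RGB input
    for out_ch in CONV_CHANNELS:
        total += in_ch * out_ch * 3 * 3 + out_ch  # 3x3 kernel weights + biases
        in_ch = out_ch
    for n_in, n_out in zip(FC_SIZES, FC_SIZES[1:]):
        total += n_in * n_out + n_out  # dense weights + biases
    return total


print(count_vgg11_params())  # 132863336
```

About 123.6 million of these parameters sit in the three fully connected layers; the convolutional part accounts for roughly 9.2 million.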
Use
#!/usr/bin/env python3
import argparse
import json

import numpy as np
from PIL import Image
from huggingface_hub import hf_hub_download
from ai_edge_litert.compiled_model import CompiledModel


def preprocess(img: Image.Image) -> np.ndarray:
    # Resize the shorter side to 256 pixels, keeping the aspect ratio.
    img = img.convert("RGB")
    w, h = img.size
    s = 256
    if w < h:
        img = img.resize((s, int(round(h * s / w))), Image.BILINEAR)
    else:
        img = img.resize((int(round(w * s / h)), s), Image.BILINEAR)
    # Center-crop to 224x224.
    left = (img.size[0] - 224) // 2
    top = (img.size[1] - 224) // 2
    img = img.crop((left, top, left + 224, top + 224))
    # Scale to [0, 1], then normalize with the ImageNet mean and std.
    x = np.asarray(img, dtype=np.float32) / 255.0
    x = (x - np.array([0.485, 0.456, 0.406], dtype=np.float32)) / np.array(
        [0.229, 0.224, 0.225], dtype=np.float32
    )
    # Add a batch dimension: (1, 224, 224, 3).
    return np.expand_dims(x, axis=0)


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--image", required=True)
    args = ap.parse_args()

    # Download the model and the ImageNet-1k label map from the Hugging Face Hub.
    model_path = hf_hub_download("litert-community/vgg11", "vgg11.tflite")
    labels_path = hf_hub_download(
        "huggingface/label-files", "imagenet-1k-id2label.json", repo_type="dataset"
    )
    with open(labels_path, "r", encoding="utf-8") as f:
        id2label = {int(k): v for k, v in json.load(f).items()}

    img = Image.open(args.image)
    x = preprocess(img)

    # Load the model, allocate I/O buffers, and run inference.
    model = CompiledModel.from_file(model_path)
    inp = model.create_input_buffers(0)
    out = model.create_output_buffers(0)
    inp[0].write(x)
    model.run_by_index(0, inp, out)

    # Read the output buffer back as a flat float32 array.
    req = model.get_output_buffer_requirements(0, 0)
    y = out[0].read(req["buffer_size"] // np.dtype(np.float32).itemsize, np.float32)

    pred = int(np.argmax(y))
    label = id2label.get(pred, f"class_{pred}")
    print(f"Top-1 class index: {pred}")
    print(f"Top-1 label: {label}")


if __name__ == "__main__":
    main()
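The script above reports only the top-1 class. A minimal sketch of how the same output vector could be turned into top-k predictions with probabilities follows. It assumes the model returns raw logits (as torchvision's VGG does, which has no final softmax); if the converted model already outputs probabilities, the softmax step should be dropped. The helper `top_k` is hypothetical and not part of any library.

```python
import numpy as np


def top_k(logits: np.ndarray, k: int = 5) -> list[tuple[int, float]]:
    """Return the k most likely (class index, probability) pairs."""
    logits = np.asarray(logits, dtype=np.float64).ravel()
    # Numerically stable softmax: subtract the maximum before exponentiating.
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    probs = exp / exp.sum()
    # Indices sorted by descending probability, truncated to k entries.
    idx = np.argsort(probs)[::-1][:k]
    return [(int(i), float(probs[i])) for i in idx]


# Toy logits standing in for the model output `y`.
demo = np.array([0.1, 2.0, -1.0, 0.5])
for class_id, prob in top_k(demo, k=2):
    print(f"class {class_id}: {prob:.3f}")
```

In the script, calling `top_k(y)` after reading the output buffer and looking up each index in `id2label` would print the five most likely labels.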
BibTeX entry and citation info
@misc{simonyan2015deepconvolutionalnetworkslargescale,
title={Very Deep Convolutional Networks for Large-Scale Image Recognition},
author={Karen Simonyan and Andrew Zisserman},
year={2015},
eprint={1409.1556},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/1409.1556},
}