Post
365
๐คฏ Edge-Grade Vision Reasoning. Now Practically Lossless. ๐คฏ
Introducing
๐ embedl/Cosmos-Reason2-2B-W4A16-Edge2
Optimized for Jetson Orin Nano Super and AGX Orin
nvidia .
๐ Try it out on Jetson (image+video+text):
๐ค What is Edge2? Most weights โ INT4 | Activations โ FP16 | Select sensitive layers โ kept in FP16.
Edge2 preserves precision where it matters most; while keeping the model small and fast enough for edge GPUs. ๐
Introducing
๐ embedl/Cosmos-Reason2-2B-W4A16-Edge2
Optimized for Jetson Orin Nano Super and AGX Orin
๐ Try it out on Jetson (image+video+text):
docker run --rm -it \
--network host \
--shm-size=8g \
--ulimit memlock=-1 \
--ulimit stack=67108864 \
--runtime=nvidia \
--name=vllm-serve \
-e HF_TOKEN=hf_*** \
-e HF_HOME=/root/.cache/huggingface \
ghcr.io/nvidia-ai-iot/vllm:latest-jetson-orin \
vllm serve "embedl/Cosmos-Reason2-2B-W4A16-Edge2" \
--max-model-len 8192 \
--gpu-memory-utilization 0.75 \
--max-num-seqs 2๐ค What is Edge2? Most weights โ INT4 | Activations โ FP16 | Select sensitive layers โ kept in FP16.
Edge2 preserves precision where it matters most; while keeping the model small and fast enough for edge GPUs. ๐