The code is done, but we're still waiting on the PR review from Comfy-Org. If you'd like, you can test the code we committed directly: https://github.com/Comfy-Org/ComfyUI/pull/12765
YSH
BestWishYsh
AI & ML interests: none yet
Recent Activity
updated BestWishYsh/Helios-Distilled 10 days ago
updated BestWishYsh/Helios-Mid 10 days ago
updated BestWishYsh/Helios-Base 10 days ago
replied to their post 26 days ago
replied to their post about 1 month ago
Will support soon!
https://github.com/PKU-YuanGroup/Helios/issues/1
🚀 Introducing Helios: a 14B real-time long-video generation model!
It's completely wild: faster than 1.3B models, and it achieves this without using self-forcing. Welcome to the new era of video generation! 🚀
💻 Code: https://github.com/PKU-YuanGroup/Helios
🌐 Page: https://pku-yuangroup.github.io/Helios-Page
📄 Paper: Helios: Real Real-Time Long Video Generation Model (2603.04379)
🔹 True Single-GPU Extreme Speed ⚡️
No need to rely on traditional workarounds like KV-cache, quantization, sparse/linear attention, or TinyVAE. Helios hits 19.5 FPS end-to-end on a single H100!
Training is also highly accessible: a single 80 GB GPU can fit four 14B models.
🔹 Solving Long-Video "Drift" at the Core 🔥
Tired of visual drift and repetitive loops? We ditched traditional hacks (like error banks, self-forcing, or keyframe sampling).
Instead, our training strategy simulates and eliminates drift directly, keeping minute-long videos coherent with stunning quality. ✨
🔹 3 Model Variants for Full Coverage 🛠️
With a unified architecture natively supporting T2V, I2V, and V2V, we are open-sourcing 3 flavors:
1️⃣ Base: single-stage denoising for maximum fidelity.
2️⃣ Mid: pyramid denoising + CFG-Zero for a balance of quality and throughput.
3️⃣ Distilled: adversarial distillation (DMD) for ultra-fast, few-step generation.
🔹 Day-0 Ecosystem Ready 🚀
We wanted deployment to be a breeze from the second we launched. Helios ships with comprehensive Day-0 hardware and framework support:
✅ Huawei Ascend NPU
✅ Hugging Face Diffusers
✅ vLLM-Omni
✅ SGLang-Diffusion
Try it out and let us know what you think!
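The speed claim invites a quick back-of-envelope check. The sketch below works out how long a minute-long clip would take to generate at the reported 19.5 FPS; the playback frame rate is an assumption for illustration, since the post does not state one:

```python
# Back-of-envelope: what 19.5 FPS end-to-end generation implies.
# PLAYBACK_FPS is an assumed value for illustration; the post does not
# specify the output frame rate of Helios videos.
GEN_FPS = 19.5        # reported end-to-end speed on a single H100
PLAYBACK_FPS = 16     # assumed playback rate of the generated video
DURATION_S = 60       # a minute-long video, as discussed in the post

total_frames = PLAYBACK_FPS * DURATION_S   # frames needed for the clip
gen_time_s = total_frames / GEN_FPS        # wall-clock time to generate

# Generation is "real-time" when frames are produced at least as fast
# as they are played back, i.e. GEN_FPS >= PLAYBACK_FPS.
realtime = GEN_FPS >= PLAYBACK_FPS
print(total_frames, round(gen_time_s, 1), realtime)  # 960 49.2 True
```

Under these assumptions a 60-second clip finishes in under 50 seconds, i.e. faster than playback; at an assumed playback rate above 19.5 FPS the same arithmetic would flip `realtime` to `False`.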
replied to their post about 1 month ago
Inference Speed:
posted an update about 1 month ago
replied to their post 10 months ago
reacted to AdinaY's post with 🔥 10 months ago
🔥 New benchmark & dataset for Subject-to-Video generation
OpenS2V-Nexus by Peking University
✨ Fine-grained evaluation for subject consistency:
BestWishYsh/OpenS2V-Eval
✨ 5M-scale dataset:
BestWishYsh/OpenS2V-5M
✨ New metrics: automatic scores for identity, realism, and text match
Thanks for sharing!
reacted to AdinaY's post with ❤️ 10 months ago
Introducing our new work: OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation 🚀
We tackle the core challenges of Subject-to-Video Generation (S2V) by systematically building the first complete infrastructure, featuring an evaluation benchmark and a million-scale dataset! ✨
🔧 Introducing OpenS2V-Eval, the first fine-grained S2V benchmark, with 180 multi-domain prompts plus real/synthetic test pairs. We propose NexusScore, NaturalScore, and GmeScore to precisely quantify model performance across subject consistency, naturalness, and text alignment. ✅
📊 Using this framework, we conduct a comprehensive evaluation of 16 leading S2V models, revealing their strengths and weaknesses in complex scenarios!
🔥 The OpenS2V-5M dataset is now available: a 5.4M-sample collection of 720P HD subject-text-video triplets, enabled by cross-video association segmentation and multi-view synthesis for diverse subjects and high-quality annotations. 🚀
All resources are open-sourced: paper, code, data, and evaluation tools. 🌍
Let's advance S2V research together! 💡
🔗 Links:
Paper: OpenS2V-Nexus: A Detailed Benchmark and Million-Scale Dataset for Subject-to-Video Generation (2505.20292)
Code: https://github.com/PKU-YuanGroup/OpenS2V-Nexus
Project: https://pku-yuangroup.github.io/OpenS2V-Nexus
LeaderBoard: BestWishYsh/OpenS2V-Eval
OpenS2V-Eval: BestWishYsh/OpenS2V-Eval
OpenS2V-5M: BestWishYsh/OpenS2V-5M
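To make the three metrics concrete, here is a hypothetical sketch of combining per-axis scores into a single leaderboard ranking. The equal weighting and the example numbers are illustrative assumptions, not the official OpenS2V-Eval aggregation, which is defined in the repository:

```python
# Hypothetical aggregation of OpenS2V-Eval-style metrics into one score.
# Weights and example numbers are illustrative assumptions; the official
# scoring is defined in the OpenS2V-Nexus repository.
WEIGHTS = {"NexusScore": 1 / 3, "NaturalScore": 1 / 3, "GmeScore": 1 / 3}

# Made-up per-model scores on the three axes: subject consistency
# (NexusScore), naturalness (NaturalScore), text alignment (GmeScore).
models = {
    "model_a": {"NexusScore": 0.62, "NaturalScore": 0.71, "GmeScore": 0.68},
    "model_b": {"NexusScore": 0.58, "NaturalScore": 0.77, "GmeScore": 0.64},
}

def overall(scores):
    """Weighted mean of the three per-axis metric scores."""
    return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

# Rank models by the combined score, best first.
ranking = sorted(models, key=lambda m: overall(models[m]), reverse=True)
print(ranking)  # ['model_a', 'model_b']
```

A real leaderboard would likely weight the axes differently (e.g. penalizing subject drift more heavily); the point is only that the three scores decompose quality along independent axes before any aggregation.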
posted an update 10 months ago
🚨 Hot Take: GPT-4o might NOT be a purely autoregressive model! 🚨
There's a high chance it has a diffusion head. 🤯 If true, this could be a game-changer for AI architecture. What do you think? 🤔👇
Code: https://github.com/PicoTrex/GPT-ImgEval
Dataset: Yejy53/GPT-ImgEval
Paper: GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation (2504.02782)
posted an update about 1 year ago