BrandFusion: A Multi-Agent Framework for Seamless Brand Integration in Text-to-Video Generation
Abstract
BrandFusion is a multi-agent framework that integrates advertiser brands into text-to-video generation while maintaining semantic fidelity and brand recognizability.
The rapid advancement of text-to-video (T2V) models has revolutionized content creation, yet their commercial potential remains largely untapped. We introduce, for the first time, the task of seamless brand integration in T2V: automatically embedding advertiser brands into prompt-generated videos while preserving semantic fidelity to user intent. This task confronts three core challenges: maintaining prompt fidelity, ensuring brand recognizability, and achieving contextually natural integration. To address them, we propose BrandFusion, a novel multi-agent framework comprising two synergistic phases. In the offline phase (advertiser-facing), we construct a Brand Knowledge Base by probing model priors and adapting to novel brands via lightweight fine-tuning. In the online phase (user-facing), five agents iteratively refine user prompts, leveraging the shared knowledge base and real-time contextual tracking to ensure brand visibility and semantic alignment. Experiments on 18 established and 2 custom brands across multiple state-of-the-art T2V models demonstrate that BrandFusion significantly outperforms baselines in semantic preservation, brand recognizability, and integration naturalness. Human evaluations further confirm higher user satisfaction, establishing a practical pathway for sustainable T2V monetization.
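The two-phase structure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not name the five agents or specify the knowledge-base schema, so `BrandEntry`, `build_knowledge_base`, `refine_prompt`, the `probe` callback, and the string-based checks are all hypothetical stand-ins. The sketch only shows the general shape of an offline lookup table feeding an online rewrite-and-check loop.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class BrandEntry:
    """Hypothetical knowledge-base record for one brand (schema is assumed)."""
    name: str
    needs_finetune: bool  # True when probing finds no usable model prior
    visual_cues: List[str] = field(default_factory=list)


def build_knowledge_base(
    brands: Dict[str, List[str]], probe: Callable[[str], bool]
) -> Dict[str, BrandEntry]:
    """Offline phase (sketch): probe each brand and record the result.

    `probe` is an assumed callback returning True if the T2V model
    already renders the brand recognizably.
    """
    return {
        name: BrandEntry(name=name, needs_finetune=not probe(name), visual_cues=cues)
        for name, cues in brands.items()
    }


def refine_prompt(
    user_prompt: str, brand: BrandEntry, max_rounds: int = 3
) -> Tuple[str, List[Tuple[str, bool, bool]]]:
    """Online phase (sketch): rewrite, check, and retry until both checks pass."""
    history: List[Tuple[str, bool, bool]] = []  # stand-in for contextual tracking
    prompt = user_prompt
    if not brand.visual_cues:
        return prompt, history
    for round_idx in range(max_rounds):
        cue = brand.visual_cues[round_idx % len(brand.visual_cues)]
        candidate = f"{user_prompt}, featuring {cue}"
        # Toy checks; a real system would use model-based critics here.
        keeps_intent = user_prompt in candidate
        shows_brand = brand.name.lower() in candidate.lower()
        history.append((candidate, keeps_intent, shows_brand))
        prompt = candidate
        if keeps_intent and shows_brand:
            break
    return prompt, history


kb = build_knowledge_base(
    {"Acme": ["a red mug", "a red Acme mug"]},  # made-up brand and cues
    probe=lambda name: False,                   # pretend the model has no prior
)
final_prompt, rounds = refine_prompt("a cat drinking coffee at sunrise", kb["Acme"])
print(final_prompt)
```

In this toy run the first candidate fails the brand-visibility check, so the loop tries a second cue. That mirrors, in a very reduced form, the idea of refining until both prompt fidelity and brand recognizability are satisfied.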
Community
Automatically embedding advertiser brands into prompt-generated videos while preserving semantic fidelity — enabling sustainable monetization for T2V services.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Video2LoRA: Unified Semantic-Controlled Video Generation via Per-Reference-Video LoRA (2026)
- NextAds: Towards Next-generation Personalized Video Advertising (2026)
- FlexID: Training-Free Flexible Identity Injection via Intent-Aware Modulation for Text-to-Image Generation (2026)
- From Prompt to Production: Automating Brand-Safe Marketing Imagery with Text-to-Image Models (2026)
- InnoAds-Composer: Efficient Condition Composition for E-Commerce Poster Generation (2026)
- DeCorStory: Gram-Schmidt Prompt Embedding Decorrelation for Consistent Storytelling (2026)
- SafeRedir: Prompt Embedding Redirection for Robust Unlearning in Image Generation Models (2026)