How HY‑WU Enables Real‑Time Dynamic Parameters for Large‑Scale AI Models
Tencent's HY‑WU architecture introduces functional memory that generates task‑specific parameters on the fly, overcoming catastrophic forgetting and static‑weight limitations, and demonstrates superior performance in image‑editing benchmarks compared to leading open‑source and closed‑source models.
Background and Motivation
Large language and multimodal models tend to lose previously learned abilities when they are trained on new tasks, a problem known as “catastrophic forgetting.” Traditional fine‑tuning and parameter‑efficient fine‑tuning (PEFT) methods try to cram all new skills into one shared weight space, which causes interference between old and new knowledge and limits personalization across diverse user demands.
Functional Memory Paradigm
HY‑WU (Hybrid‑Wu) proposes a functional memory paradigm that abandons a single fixed parameter point. Instead, it learns a powerful parameter generator that synthesizes task‑specific operators in real time, turning the adaptation process into a dynamic pipeline that routes weights based on input conditions.
Architecture Details
The system embeds a Transformer‑based generator (≈8.11 B parameters) that does not store static weights but learns to fabricate appropriate LoRA parameters for each instance. During inference, the model extracts visual features and editing instructions, fuses them into mixed condition features, and the generator instantly computes a dedicated LoRA set, which is mounted onto the frozen base model for a single, interference‑free transformation.
Key technical innovations include:
End‑to‑end training without reliance on historical checkpoints.
Decomposed self‑attention to handle billions of generated parameters efficiently.
A conditional update family that maps conditions to parameter updates, forming a structured parameter manifold where semantically similar edits cluster together.
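The inference flow described above (condition features in, LoRA factors out, frozen base untouched) can be illustrated with a toy sketch. This is not HY‑WU's implementation: the class `ToyLoRAGenerator`, the single linear projection standing in for the Transformer generator, and all dimensions are assumptions chosen only to keep the example small and runnable.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix used for the toy weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]

class ToyLoRAGenerator:
    """Hypothetical stand-in for the parameter generator: one linear map
    from a condition vector to the flattened LoRA factors A and B."""

    def __init__(self, cond_dim, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        n_generated = rank * d_in + d_out * rank
        self.proj = rand_matrix(n_generated, cond_dim, rng)

    def generate(self, cond):
        """Return (A, B) with shapes (rank x d_in) and (d_out x rank)."""
        flat = matvec(self.proj, cond)
        split = self.rank * self.d_in
        a = [flat[i * self.d_in:(i + 1) * self.d_in] for i in range(self.rank)]
        rest = flat[split:]
        b = [rest[i * self.rank:(i + 1) * self.rank] for i in range(self.d_out)]
        return a, b

def adapted_forward(w_frozen, a, b, x):
    """Compute y = W x + B (A x). W is read but never modified."""
    base = matvec(w_frozen, x)
    delta = matvec(b, matvec(a, x))
    return [p + q for p, q in zip(base, delta)]

rng = random.Random(1)
d_in, d_out, rank, cond_dim = 6, 4, 2, 3
w_frozen = rand_matrix(d_out, d_in, rng)
w_snapshot = [row[:] for row in w_frozen]      # copy, to check W stays frozen
gen = ToyLoRAGenerator(cond_dim, d_in, d_out, rank)

x = [1.0, 0.5, -0.5, 2.0, 0.0, 1.0]
cond_restore = [1.0, 0.0, 0.0]   # toy "fused condition" for one edit request
cond_stylize = [0.0, 1.0, 0.0]   # toy "fused condition" for a different edit

y_restore = adapted_forward(w_frozen, *gen.generate(cond_restore), x)
y_stylize = adapted_forward(w_frozen, *gen.generate(cond_stylize), x)
print(y_restore != y_stylize, w_frozen == w_snapshot)
```

The same frozen matrix yields different outputs for the two conditions because each request gets its own generated low‑rank update, and nothing is written back into the base weights between requests.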
Image‑Editing Validation
Text‑guided image editing was chosen as the primary stress test. Static adapters struggle with mutually exclusive transformations (e.g., denoising vs. adding noise). HY‑WU dynamically generates LoRA weights for each edit, achieving high fidelity in tasks such as old‑photo restoration, stylization, and multi‑style transformations.
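To make the interference argument concrete, here is a deliberately tiny sketch. It assumes two edits whose ideal weight updates are exact opposites, which is an idealization of "denoising vs. adding noise"; the vectors and the lookup‑table `conditional_update` are hypothetical stand‑ins, not values or code from HY‑WU.

```python
# Toy targets: two edits that need opposite updates to the same weights.
delta_denoise = [0.5, -0.2, 0.1]
delta_add_noise = [-d for d in delta_denoise]

def squared_error(update, target):
    """Sum of squared differences between an update and its ideal target."""
    return sum((u - t) ** 2 for u, t in zip(update, target))

# A static adapter stores ONE update for all inputs. The least-squares
# compromise between the two targets is their mean.
static_update = [(a + b) / 2 for a, b in zip(delta_denoise, delta_add_noise)]

def conditional_update(condition):
    """Stand-in for a generator: returns a different update per condition."""
    return {"denoise": delta_denoise, "add_noise": delta_add_noise}[condition]

static_loss = (squared_error(static_update, delta_denoise)
               + squared_error(static_update, delta_add_noise))
dynamic_loss = (squared_error(conditional_update("denoise"), delta_denoise)
                + squared_error(conditional_update("add_noise"), delta_add_noise))

print(static_update, static_loss, dynamic_loss)
```

In this idealized case the best single update is the zero vector, so the static adapter serves neither edit, while a condition‑dependent update can match both targets exactly.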
Evaluation used the native multimodal base model HY‑Image‑3.0‑Instruct (80 B parameters, 13 B activated parameters). On this base, HY‑WU’s 7.2 B‑parameter generator produces 0.72 B LoRA weights using a rank‑16 factorization, a combination reported to deliver superior accuracy and flexibility.
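As a rough guide to what these LoRA figures mean, the standard LoRA parameter count for one linear layer is rank × (d_in + d_out). The sketch below applies that formula to a hypothetical 4096 × 4096 layer (the layer size is an assumption, not a HY‑WU detail) and then back‑solves the quoted 0.72 B total under the assumption that every adapted layer uses rank 16.

```python
def lora_param_count(d_in, d_out, rank):
    """Parameters in the two LoRA factors: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)

def dense_param_count(d_in, d_out):
    """Parameters in the full weight matrix that the LoRA update modifies."""
    return d_in * d_out

# Hypothetical layer size, chosen only for illustration.
d_in = d_out = 4096
rank = 16

lora = lora_param_count(d_in, d_out, rank)
dense = dense_param_count(d_in, d_out)
ratio = dense / lora
print(lora, dense, ratio)

# Back-of-envelope reading of the quoted total, assuming rank 16 everywhere:
# the sum of (d_in + d_out) over all adapted layers would be total / rank.
total_lora = 720_000_000
implied_dim_sum = total_lora // rank
print(implied_dim_sum)
```

For the hypothetical layer, the low‑rank update is 128 times smaller than the dense matrix it modifies, which is why generating LoRA factors per request is far cheaper than generating full weights.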
Benchmark Results
Extensive benchmarks covering 346 single‑image and 64 multi‑image edit pairs across 60 sub‑tasks (Chinese and English prompts) placed HY‑WU ahead of open‑source competitors and close to top closed‑source models. Human evaluation indicated higher perceptual quality than GPT‑Image‑1.5 and Nano Banana Pro, while GEdit‑Bench and ImgEdit‑Bench scores ranked HY‑WU first among open‑source systems.
Broader Implications
The functional memory concept extends beyond image editing. By separating retrieval memory (facts) from functional memory (transformations), models can safely acquire new skills without overwriting existing capabilities, addressing continual learning challenges. The paradigm is applicable to video models, multimodal interaction, and long‑sequence generation, where dynamic operator offsets can mitigate trade‑offs between performance and resource usage.
Deploying dynamic LoRA weights introduces hardware challenges such as fragmented memory access patterns. Custom operator fusion and low‑latency parameter generation are critical for real‑time, on‑device personalization.
Future Outlook
Relaxing the static‑weight constraint may be essential for achieving more general AI. Scaling the base model together with functional memory modules promises higher compute and data efficiency than merely enlarging monolithic networks.
SuanNi
