Can a Single Number Create a Whole New Visual Style? Inside CoTyle’s Code‑to‑Style Generation

CoTyle is a novel open‑source framework that generates unique image styles from a single numeric style code, eliminating the need for reference images, lengthy prompts, or LoRA modules, and it demonstrates stronger style consistency than closed solutions such as Midjourney's style codes.

Kuaishou Tech

Background

Generating images with a specific artistic style traditionally requires reference images, long textual prompts, or fine‑tuned LoRA modules. These approaches are cumbersome and often fail to preserve style consistently. No open‑source method existed for creating novel styles from a compact representation.

CoTyle Framework

CoTyle (Code‑to‑Style Image Generation) is an open‑source framework that maps an integer “style code” to a unique visual style, enabling style sharing without pixel‑level references.

Core Components

Style CodeBook: Trained with contrastive learning on style‑paired data. A Vision Transformer (ViT) extracts image features, which are quantized to discrete indices; a decoder maps indices back to style vectors. The training objective forces images of the same style (different content) to produce nearby vectors, while different styles map to distinct distributions.
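The quantization step above can be sketched in a few lines. This is a minimal illustration, not CoTyle's actual implementation: the codebook matrix, dimensions, and the `quantize`/`decode` names are all assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical learned codebook: 512 discrete entries in a 64-dim style space
codebook = rng.normal(size=(512, 64))

def quantize(features: np.ndarray) -> np.ndarray:
    """Map each feature vector to the index of its nearest codebook entry."""
    # (N, 1, D) - (1, K, D) -> pairwise distances of shape (N, K)
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=-1)
    return dists.argmin(axis=-1)

def decode(indices: np.ndarray) -> np.ndarray:
    """Look up the continuous style vectors for a sequence of discrete indices."""
    return codebook[indices]

# Two inputs with nearly identical style features quantize to the same index,
# mirroring the contrastive objective that pulls same-style images together
feat = rng.normal(size=(1, 64))
indices = quantize(np.vstack([feat, feat + 1e-3]))
```

Because contrastive training clusters same‑style features, nearby features collapse onto the same discrete index, which is what makes the codes shareable.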

Image‑driven Text‑to‑Image Diffusion Model (T2I‑DM): The style vector from the CodeBook is concatenated with the textual embedding and injected into the text branch of a diffusion model. This allows the model to generate images that follow the semantic prompt while inheriting the style specified by the code.
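The injection itself is just sequence concatenation along the token axis. A shape‑level sketch, assuming a CLIP‑like 77‑token prompt embedding and a single style token (both shapes are assumptions, not CoTyle's real interfaces):

```python
import numpy as np

text_emb = np.zeros((77, 768))   # hypothetical prompt embedding: 77 tokens x 768 dims
style_vec = np.ones((1, 768))    # one style token decoded from the CodeBook

# Append the style token so cross-attention in the diffusion model
# can attend to style and semantics jointly
conditioning = np.concatenate([text_emb, style_vec], axis=0)  # shape (78, 768)
```

The diffusion backbone then consumes `conditioning` exactly as it would an ordinary text embedding, so no architectural change is needed in the image branch.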

Style Generator: A Transformer‑based autoregressive model trained on a large, diverse style‑image dataset. Given a user‑provided integer, the integer seeds the first token; the model then predicts a sequence of style‑code indices. High‑frequency index suppression is applied during inference to avoid placeholder tokens and enrich the resulting style vectors.
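The seeding and suppression logic can be sketched with a toy sampler. The real Style Generator is a Transformer; here a hypothetical `next_token_logits` stands in so the control flow (integer seed, first token, suppressed indices) is visible. All constants are illustrative assumptions.

```python
import numpy as np

VOCAB = 512          # number of CodeBook indices (assumed)
SEQ_LEN = 8          # length of the style-code sequence (assumed)
SUPPRESSED = {0}     # hypothetical set of over-frequent placeholder indices

def next_token_logits(prefix: list, rng: np.random.Generator) -> np.ndarray:
    """Stand-in for the Transformer's next-token distribution."""
    return rng.normal(size=VOCAB)

def generate_style_indices(style_code: int) -> list:
    rng = np.random.default_rng(style_code)   # the integer seeds generation
    indices = [style_code % VOCAB]            # first token derived from the code
    while len(indices) < SEQ_LEN:
        logits = next_token_logits(indices, rng)
        logits[list(SUPPRESSED)] = -np.inf    # high-frequency index suppression
        indices.append(int(logits.argmax()))
    return indices
```

Because the generator is fully determined by the seed, the same integer always reproduces the same index sequence, which is what makes a bare number a shareable style.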

The inference pipeline can be summarized as:

# Pseudo‑code (illustrative)
style_indices = StyleGenerator(seed=style_code)         # integer seeds the autoregressive model
style_vector = StyleCodeBook.decode(style_indices)      # discrete indices -> continuous style vector
image = T2I_Diffusion(prompt=text, style=style_vector)  # style-conditioned generation
CoTyle inference pseudo‑code

Extended Capabilities

Style Fusion: Linear interpolation between two style‑code vectors produces blended styles.
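Since styles live in a continuous vector space after decoding, fusion reduces to a standard lerp. A minimal sketch with illustrative 3‑dim vectors (real style vectors would come from the CodeBook decoder):

```python
import numpy as np

style_a = np.array([1.0, 0.0, 2.0])   # hypothetical decoded style vector A
style_b = np.array([0.0, 4.0, 0.0])   # hypothetical decoded style vector B

def fuse(a: np.ndarray, b: np.ndarray, alpha: float) -> np.ndarray:
    """Blend two style vectors: alpha=0 -> pure a, alpha=1 -> pure b."""
    return (1 - alpha) * a + alpha * b

blended = fuse(style_a, style_b, 0.5)  # even blend of the two styles
```

Sweeping `alpha` from 0 to 1 traces a path between the two styles, which is how intermediate blended looks are obtained.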

Style‑image Reference: By feeding an existing style image through the CodeBook, the framework can use that style as guidance for generation, demonstrating compatibility with traditional style‑reference tasks.

Experimental Evaluation

CoTyle was evaluated on a style‑code reference benchmark against Midjourney’s proprietary style‑code feature. Results show:

Style Consistency: Images generated with the same style code but different prompts share a highly similar visual style, outperforming Midjourney.

Diversity: Slightly lower than Midjourney's, which the authors attribute to suboptimal training of the Style Generator and flag as a direction for future improvement.

Additional experiments on style‑fusion and style‑image reference confirm strong style preservation and text‑image alignment.

Experimental comparison with Midjourney

Resources

Paper: https://arxiv.org/abs/2511.10555

Project page: https://Kwai-Kolors.github.io/CoTyle/

Demo (Hugging Face Space): https://huggingface.co/spaces/Kwai-Kolors/CoTyle

Code repository: https://github.com/Kwai-Kolors/CoTyle

Conclusion

CoTyle provides the first open‑source solution for the style‑code reference task, enabling the creation of novel image styles from discrete numeric codes and supporting downstream tasks such as style fusion and style‑guided synthesis. The released code and pretrained models facilitate further research in style‑code based image generation.

Tags: diffusion model, Transformer, image generation, generative AI, style code
Written by

Kuaishou Tech

Official Kuaishou tech account, providing real-time updates on the latest Kuaishou technology practices.
