How GANs Turn Sketches into Realistic Landscapes: Inside the “TuYa” Algorithm
This article explains the GAN‑based “TuYa” sketch‑to‑landscape algorithm presented at the Yidian News Hackathon, detailing its semantic image synthesis approach, the encoder, generator with SPADE, and PatchGAN discriminator, and discusses potential applications for designers and architects.
1. Introduction
The second Yidian News Hackathon has concluded, and the team “Shenbi Malian” impressed the audience with their project “TuYa”. The system combines AI capabilities such as sketch‑to‑landscape, image eraser, and hand‑drawn face generation, allowing users without advanced drawing or Photoshop skills to create desired images through simple sketches.
Figure 1‑1 Sketch‑to‑landscape demo
Figure 1‑2 TuYa project overview
Figure 1‑3 Hand‑drawn face generation
2. Sketch‑to‑Landscape Algorithm
The algorithm is a GAN‑based image generator that turns a rough hand‑drawn sketch into a photorealistic scene with mountains, lakes, and blue skies. Formally, it takes a semantic segmentation map, in which every pixel carries a class label such as sky, water, or mountain, and synthesizes a matching realistic image. This task is known as semantic image synthesis. Because such a map needs only a few outlined and labeled regions, it is easy to draw, which inspired the team’s “magic brush” nickname.
Figure 2 Effect showcase
2.1 Overall Network Structure
The network follows a conditional GAN architecture. During training, the Generator and the Discriminator compete with each other, while an Encoder supplies the Generator with a style vector.
Figure 2‑1 Overall network diagram
The three modules are:
Encoder
Generator
Discriminator
2.2 Encoder Module
The Encoder passes a real image through a stack of convolutional layers followed by two fully‑connected layers and outputs a mean and a variance. These statistics define a Gaussian distribution; sampling from it yields a latent vector that encodes the style of the input image.
Figure 2‑2‑1 Encoder network structure
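The latent vector is obtained by drawing a sample from a standard Gaussian and then denormalizing it, that is, scaling it by the predicted standard deviation and shifting it by the predicted mean. The resulting random vector still carries information about the real image and serves as input to the Generator, enabling style‑controlled image synthesis.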
Figure 2‑2‑2 Generated images with different styles
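The sampling step of the Encoder can be sketched in a few lines of plain Python. This is a minimal illustration, not the TuYa implementation: the function name `sample_style_vector` and the example numbers are hypothetical, and the sketch assumes the Encoder predicts the logarithm of the variance, a common convention that the article does not confirm.

```python
import math
import random

def sample_style_vector(mean, log_variance, rng):
    """Draw z = mean + std * eps, with eps ~ N(0, 1), for every latent dimension."""
    z = []
    for mu, log_var in zip(mean, log_variance):
        std = math.exp(0.5 * log_var)   # assumption: the Encoder outputs log-variance
        eps = rng.gauss(0.0, 1.0)       # standard Gaussian sample
        z.append(mu + std * eps)        # "denormalize": scale by std, shift by mean
    return z

# Hypothetical Encoder outputs for a 3-dimensional style vector.
mean = [0.5, -1.0, 2.0]
log_variance = [0.0, -2.0, -20.0]

rng = random.Random(0)                  # fixed seed so the sketch is repeatable
z = sample_style_vector(mean, log_variance, rng)
```

A dimension with a very small variance, such as the last one here, stays close to its mean, while a dimension with a larger variance changes noticeably from sample to sample.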
During the Hackathon the Encoder was simplified, so style selection was omitted; this can be added in future development.
2.3 Generator Module
The Generator learns a mapping from the input semantic mask to a photorealistic image. It receives the random vector from the Encoder and is fed the semantic map at multiple scales to provide contextual information, progressively refining the image from coarse to fine.
Figure 2‑3‑1 Generator structure
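One way to obtain semantic maps at several scales is nearest‑neighbor downsampling, sketched below. The helper names `downsample_nearest` and `semantic_pyramid` are hypothetical, and the article does not state which resizing method TuYa uses. Nearest‑neighbor is assumed here because it keeps every value a valid class id, whereas averaging would blend labels into meaningless numbers.

```python
def downsample_nearest(label_map, factor):
    """Keep every `factor`-th row and column, so each value stays a valid class id."""
    return [row[::factor] for row in label_map[::factor]]

def semantic_pyramid(label_map, num_scales):
    """Return the label map at full, 1/2, 1/4, ... resolution (finest first)."""
    return [downsample_nearest(label_map, 2 ** i) for i in range(num_scales)]

# Hypothetical 4 x 4 label map with four classes (0..3), one per quadrant.
label_map = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]

pyramid = semantic_pyramid(label_map, num_scales=3)
# A coarse-to-fine generator would consume the maps from coarsest to finest.
coarse_to_fine = list(reversed(pyramid))
```

The coarsest map gives the early Generator layers only the global layout, and each finer map adds boundary detail as the resolution grows.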
Conventional normalization layers such as Batch Normalization tend to wash away the information in the semantic map, so the Generator uses Spatially‑Adaptive (De)normalization (SPADE) instead. A SPADE block takes the previous layer’s output and a semantic map resized to the same resolution. It normalizes the features, passes the semantic map through convolutions to produce a per‑pixel scale and a per‑pixel bias, and then applies them to the normalized features by element‑wise multiplication and addition, which restores the semantic details.
Figure 2‑3‑2 SPADE module structure
The SPADE blocks are stacked with up‑sampling to form the full Generator architecture.
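The modulation performed by a single SPADE block can be illustrated with a small pure‑Python sketch. Several simplifications are assumed and none of them is taken from the TuYa code: features are normalized per image rather than per batch, the learned convolutions over the semantic map are replaced by hypothetical per‑class lookup tables `gamma_table` and `beta_table` (which is what a 1x1 convolution over a one‑hot label map reduces to), and the scale is applied as `1 + gamma`, a convention used in some implementations.

```python
import math

def normalize_channel(channel, eps=1e-5):
    """Normalize one H x W channel to zero mean and unit variance (no learned parameters)."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance + eps)
    return [[(v - mean) / std for v in row] for row in channel]

def spade(features, label_map, gamma_table, beta_table):
    """features: C x H x W nested lists; label_map: H x W class ids.
    gamma_table[c][k] and beta_table[c][k] hold the scale and bias that
    channel c receives at pixels of class k (stand-ins for learned convolutions)."""
    out = []
    for c, channel in enumerate(features):
        normalized = normalize_channel(channel)
        modulated = []
        for y, row in enumerate(normalized):
            new_row = []
            for x, value in enumerate(row):
                k = label_map[y][x]
                # Element-wise multiplication and addition, different at every pixel.
                new_row.append(value * (1.0 + gamma_table[c][k]) + beta_table[c][k])
            modulated.append(new_row)
        out.append(modulated)
    return out

# Hypothetical input: one channel, 2 x 2 pixels, two classes (top row 0, bottom row 1).
features = [[[1.0, 2.0], [3.0, 4.0]]]
label_map = [[0, 0], [1, 1]]
gamma_table = [[0.0, 1.0]]
beta_table = [[0.0, 0.5]]

out = spade(features, label_map, gamma_table, beta_table)
```

In this example, pixels of class 0 pass through unchanged, while pixels of class 1 are rescaled and shifted, so the class layout remains visible in the features after normalization.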
2.4 Discriminator
The Discriminator receives the semantic map concatenated with an image, either real or generated, and passes them through a series of convolutional layers to output a realism judgment. It follows the PatchGAN design used in pix2pixHD, producing an N×N map of real/fake scores, one per image patch, rather than a single global score, which helps it capture high‑resolution details.
Figure 2‑4 Discriminator PatchGAN
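The size N of the PatchGAN score map, and the size of the image patch behind each score, follow directly from the convolution settings. The sketch below computes both. The layer list is the widely cited 70×70 PatchGAN configuration (4×4 kernels, three stride‑2 layers followed by two stride‑1 layers, padding 1) and is only an assumption; the article does not give the configuration TuYa used.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of one convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def patchgan_grid(input_size, layers):
    """Side length N of the N x N score map for a square input image."""
    size = input_size
    for kernel, stride, padding in layers:
        size = conv_output_size(size, kernel, stride, padding)
    return size

def receptive_field(layers):
    """Side length of the input patch that influences a single output score."""
    field = 1
    for kernel, stride, _ in reversed(layers):
        field = (field - 1) * stride + kernel
    return field

# Assumed layer settings as (kernel, stride, padding); not taken from the TuYa code.
LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 1), (4, 1, 1)]

grid = patchgan_grid(256, LAYERS)   # 30: a 256 x 256 image yields a 30 x 30 score map
patch = receptive_field(LAYERS)     # 70: each score judges a 70 x 70 patch
```

Because each score only sees a local patch, the Discriminator is pushed to judge texture and fine detail rather than the global composition.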
Through the adversarial training of Generator and Discriminator, the system can generate realistic images from semantic sketches.
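The adversarial objective can be made concrete with a hinge loss averaged over the patch scores. This is an illustrative assumption: the article does not say which GAN loss TuYa uses, and the function names and score values below are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def discriminator_hinge_loss(real_scores, fake_scores):
    """Penalize real patches scored below +1 and fake patches scored above -1."""
    real_term = mean([max(0.0, 1.0 - s) for s in real_scores])
    fake_term = mean([max(0.0, 1.0 + s) for s in fake_scores])
    return real_term + fake_term

def generator_hinge_loss(fake_scores):
    """The Generator is rewarded when the Discriminator scores its patches highly."""
    return -mean(fake_scores)

# Hypothetical flattened patch scores from the Discriminator.
real_scores = [2.0, 0.5]    # scores for a real image
fake_scores = [-2.0, 0.0]   # scores for a generated image

d_loss = discriminator_hinge_loss(real_scores, fake_scores)   # 0.75
g_loss = generator_hinge_loss(fake_scores)                    # 1.0
```

Training alternates between lowering `d_loss` by updating the Discriminator and lowering `g_loss` by updating the Generator, which is the competition described above.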
3. Future Outlook
The algorithm can empower architects, urban planners, landscape designers, game developers, advertising creators, and other image‑centric professions by providing a powerful tool for rapid virtual world creation. By leveraging AI to infer realistic appearances, designers can prototype high‑fidelity concepts directly during brainstorming.
Figure 3‑1 Future prospects illustration
Article sourced from the Yidian News Algorithm Team.