
Open-Source AI Portrait Generation with FaceChain: Setup, Usage, and Underlying Principles

This article introduces the open‑source FaceChain AI portrait generation project, detailing background, demo results, environment setup on ModelScope notebooks, step‑by‑step usage instructions, and an in‑depth explanation of the Stable Diffusion‑based LoRA training and inference pipeline.

DataFunTalk

The article begins by describing the rapid popularity of AI portrait software for both official documents and artistic styles, and announces that Alibaba's DAMO Academy Visual team has released an open‑source version to encourage community contributions.

It showcases example outputs such as business ID photos and stylized portraits, illustrating the quality of the generated images.

Environment Configuration and Installation

The guide uses ModelScope’s PAI‑DSW notebook environment (single GPU, ~20 GB VRAM). It walks through accessing ModelScope, selecting a GPU notebook, opening a terminal, and verifying available GPU memory with nvidia-smi.

Installation commands are provided:

git clone https://www.modelscope.cn/studios/CVstudio/cv_human_portrait.git
cd cv_human_portrait
pip install -r requirements.txt
pip install gradio==3.35.2
python app.py

After these steps complete, the personal-portrait web app is up and running.

Usage Steps

Step 1: Upload 3–10 clear head‑shoulder photos (faces must be unobstructed).

Step 2: Click “Image Customization” to start model training, which takes about 15 minutes.

Step 3: Switch to “Image Experience” to generate stylized portrait images.

The source code is available at https://github.com/modelscope/facechain, and the ModelScope demo can be accessed via the provided link.

Principle Explanation

The core technology leverages Stable Diffusion’s text‑to‑image capability, enhanced with two LoRA fine‑tuning models: a style LoRA trained offline and a face LoRA trained online from user‑uploaded images. LoRA introduces a small set of trainable parameters to inject specific visual information into the diffusion model.
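The low-rank idea behind LoRA can be sketched numerically. This toy example (illustrative only, not FaceChain's actual code; dimensions and scaling are assumptions) shows how a large frozen weight matrix W is adapted with two small trainable matrices A and B, and why the trainable parameter count stays tiny:

```python
import numpy as np

# LoRA sketch: instead of updating a full weight matrix W (d x k), train two
# small matrices A (r x k) and B (d x r) with rank r << min(d, k); the
# adapted weight used at inference is W + (alpha / r) * B @ A.
d, k, r, alpha = 768, 768, 4, 32

rng = np.random.default_rng(0)
W = rng.standard_normal((d, k))          # frozen base weight
A = rng.standard_normal((r, k)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection (zero init,
                                         # so training starts from the base model)

W_adapted = W + (alpha / r) * B @ A      # merged weight at inference time

full_params = d * k
lora_params = r * k + d * r
print(f"full: {full_params}, LoRA: {lora_params} "
      f"({100 * lora_params / full_params:.2f}% of full)")
```

Because B starts at zero, the adapted model initially behaves exactly like the base Stable Diffusion model; training only A and B is what "injects" the style or face identity with a small parameter budget.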

Training Stage: User images are first corrected for pose using a rotation model, then refined with face detection, key‑point alignment, human parsing, and skin‑retouching models to produce high‑quality training data. Attribute labeling combines face attribute and text annotation models, after which the face LoRA is fine‑tuned on Stable Diffusion.
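The data-preparation stages above can be sketched as a simple pipeline. The stage functions below are placeholder stubs standing in for the real models (rotation correction, detection/alignment, parsing, retouching, attribute labeling); only the data flow is shown, and none of the names are FaceChain's actual API:

```python
from dataclasses import dataclass

@dataclass
class Sample:
    pixels: list          # stand-in for image data
    caption: str = ""     # filled in by attribute/text labeling

# Each stub represents one model from the training-stage description;
# a real implementation would transform the image at every step.
def correct_rotation(s):   return s   # rotation model fixes pose
def detect_and_align(s):   return s   # face detection + key-point alignment
def parse_and_retouch(s):  return s   # human parsing + skin retouching
def label_attributes(s):              # face-attribute + text annotation models
    s.caption = "portrait of a person, front-facing"
    return s

PIPELINE = [correct_rotation, detect_and_align, parse_and_retouch, label_attributes]

def prepare(sample):
    for stage in PIPELINE:
        sample = stage(sample)
    return sample
```

The prepared, captioned samples are what the face LoRA is then fine-tuned on.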

Inference Stage: The trained face LoRA and style LoRA are merged into Stable Diffusion. Text prompts generate initial portraits, which are then refined by a face‑fusion model using selected template faces. A face‑recognition model scores similarity to rank the final outputs.
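The final ranking step can be illustrated as follows: score each generated portrait by the cosine similarity between its face embedding and the user's reference embedding, then sort. The random vectors here are stand-ins for a real face-recognition model's 512-dimensional embeddings:

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors, in [-1, 1].
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

rng = np.random.default_rng(1)
reference = rng.standard_normal(512)        # user's face embedding
candidates = rng.standard_normal((8, 512))  # embeddings of 8 generated portraits

scores = [cosine(reference, c) for c in candidates]
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print("best candidate index:", ranked[0])
```

The highest-scoring outputs are the ones presented to the user, which biases the final results toward stronger identity preservation.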

The article lists all involved models with links, covering face detection, rotation, human parsing, skin retouching, attribute recognition, text annotation, template face selection, face fusion, and face recognition.

Open‑Source Recruitment

The FaceChain project is open‑source, inviting community contributions to expand style LoRA models, explore adaptive base‑model and multi‑LoRA fusion, develop specialized face‑prompt models, improve SD portrait base models, and create advanced applications such as memes, animated characters, and game assets.

Tags: LoRA · Stable Diffusion · open source · ModelScope · AI Portrait
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
