Video Background Replacement Using RobustVideoMatting and Python

This tutorial explains how to use the open‑source RobustVideoMatting project to perform human portrait segmentation and replace video backgrounds, covering environment setup, model loading, custom image‑and‑video matting functions, and final video composition with OpenCV.

Many video chat applications let users change their background by extracting the human portrait and replacing everything else, usually with a simple static image. With image‑segmentation techniques, you can achieve more sophisticated results, closer to film visual effects.

The article introduces the RobustVideoMatting project (https://github.com/PeterL1n/RobustVideoMatting) and shows how to clone the repository, install dependencies, and download a pre‑trained model (either rvm_mobilenetv3.pth or rvm_resnet50.pth).

    git clone https://github.com/PeterL1n/RobustVideoMatting.git
    cd RobustVideoMatting
    pip install -r requirements_inference.txt

After setting up, a Python script is created to load the model and run video matting:

    import torch
    from model import MattingNetwork
    from inference import convert_video

    # Choose mobilenetv3 or resnet50
    model = MattingNetwork('mobilenetv3').eval().cuda()  # or "resnet50"
    model.load_state_dict(torch.load('rvm_mobilenetv3.pth'))

    convert_video(
        model,
        input_source='input.mp4',
        output_type='video',
        output_composition='com.mp4',
        output_alpha='pha.mp4',
        output_video_mbps=4,
        downsample_ratio=None,
        seq_chunk=12,
    )

Running convert_video produces com.mp4 (the portrait composited over a green screen) and pha.mp4 (the alpha mask). To perform custom image matting, the article defines a human_segment function that runs the model on a single image and returns a Pillow image with the segmented result:

    import cv2
    import torch
    from PIL import Image
    from torchvision import transforms
    from model import MattingNetwork

    # Use the GPU when available, otherwise fall back to the CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
    segmentor = MattingNetwork('resnet50').eval().to(device)
    segmentor.load_state_dict(torch.load('rvm_resnet50.pth', map_location=device))

    def human_segment(model, image):
        # PIL image -> float tensor of shape (1, 3, H, W) with values in [0, 1]
        src = (transforms.PILToTensor()(image) / 255.)[None].to(device)
        with torch.no_grad():
            fgr, pha, *rec = model(src)
        # Append the alpha matte as a fourth channel, then reorder to (H, W, 4)
        segmented = torch.cat([src.cpu(), pha.cpu()], dim=1).squeeze(0).permute(1, 2, 0).numpy()
        # Rescale to 8-bit so Pillow can build an RGBA image
        segmented = cv2.normalize(segmented, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
        return Image.fromarray(segmented)

    # Convert to RGB first so the model always receives three channels
    human_segment(segmentor, Image.open('xscn.jpg').convert('RGB')).show()
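Min-max normalization rescales according to the smallest and largest values actually present, so a frame whose values do not span the full [0, 1] range gets stretched. A minimal alternative sketch, assuming the RGB and alpha arrays already hold floats in [0, 1], scales by 255 directly. The helper name `to_rgba_uint8` is hypothetical and not part of RobustVideoMatting or the original article.

```python
import numpy as np


def to_rgba_uint8(rgb, alpha):
    """Stack float RGB (H, W, 3) and alpha (H, W), both in [0, 1], into uint8 RGBA."""
    rgba = np.concatenate([rgb, alpha[..., None]], axis=-1)
    # Fixed scaling: 0.0 -> 0 and 1.0 -> 255 regardless of the frame's content
    return np.clip(np.rint(rgba * 255.0), 0, 255).astype(np.uint8)


# Tiny synthetic example: two pixels, one transparent and one opaque
rgb = np.full((1, 2, 3), 0.2, dtype=np.float32)
alpha = np.array([[0.0, 1.0]], dtype=np.float32)
rgba = to_rgba_uint8(rgb, alpha)
print(rgba.shape, rgba.dtype)
```

The resulting array can be handed to Image.fromarray in the same way as the normalized array in human_segment.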

For video matting, a loop reads frames, applies human_segment, and writes the result. An example using OpenCV is provided:

    import numpy as np

    capture = cv2.VideoCapture("input.mp4")
    while True:
        ret, frame = capture.read()
        if not ret:
            break
        # OpenCV reads frames as BGR; the model expects RGB
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        result = human_segment(segmentor, Image.fromarray(image))
        # Convert the RGBA result to BGR for display (the alpha channel is dropped)
        result = cv2.cvtColor(np.array(result), cv2.COLOR_RGBA2BGR)
        cv2.imshow("result", result)
        cv2.waitKey(10)
    # Release resources once, after the loop has finished
    capture.release()
    cv2.destroyAllWindows()
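RobustVideoMatting is a recurrent model: the project README shows each call returning updated recurrent states that are passed back in with the next frame. Calling the model on every frame in isolation, as human_segment does, discards that temporal memory. The sketch below illustrates the state-carrying pattern. `matte_frames` is a hypothetical helper, the default ratio is only illustrative, and `frames` is assumed to be an iterable of tensors already on the model's device.

```python
def matte_frames(model, frames, downsample_ratio=0.25):
    """Yield (fgr, pha) for each frame, feeding recurrent states forward.

    Assumption: `model` follows the README call convention
    model(src, r1, r2, r3, r4, downsample_ratio) -> fgr, pha, r1, r2, r3, r4.
    """
    rec = [None] * 4  # no memory before the first frame
    for src in frames:
        # The states returned for this frame become the inputs for the next one
        fgr, pha, *rec = model(src, *rec, downsample_ratio)
        yield fgr, pha
```

With a real model, consume the generator inside a torch.no_grad() block so no gradients are tracked during inference.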

The article then outlines the four steps for video background replacement: read foreground and background frames, perform matting on each foreground frame, composite the segmented foreground onto the new background, and write the composed frames to an output video.

A helper function change_background blends a segmented PNG image with a background image:

    from PIL import Image

    def change_background(image, background):
        # image must be RGBA: its alpha channel is used as the paste mask
        w, h = image.size
        # Match the background to the foreground size before blending
        background = background.resize((w, h))
        background.paste(image, (0, 0), image)
        return background
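Pasting with the image as its own mask is an alpha blend: each output pixel is alpha times the foreground plus (1 - alpha) times the background. A minimal NumPy sketch of that formula follows, assuming uint8 RGB arrays of equal size and a float matte in [0, 1]. The function name `alpha_composite` is hypothetical and chosen here only for illustration.

```python
import numpy as np


def alpha_composite(foreground, alpha, background):
    """Blend uint8 RGB arrays (H, W, 3) using a float alpha matte (H, W) in [0, 1]."""
    a = alpha[..., None].astype(np.float32)  # broadcast the matte over the channels
    out = a * foreground.astype(np.float32) + (1.0 - a) * background.astype(np.float32)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Synthetic 2x2 example with opaque, transparent and partially transparent pixels
fg = np.full((2, 2, 3), 200, dtype=np.uint8)
bg = np.full((2, 2, 3), 100, dtype=np.uint8)
alpha = np.array([[1.0, 0.0], [0.5, 0.25]], dtype=np.float32)
blended = alpha_composite(fg, alpha, bg)
print(blended[..., 0])
```

Intermediate alpha values are what let hair and soft edges blend into the new background instead of producing a hard cut-out.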

Finally, the full video writing pipeline combines the foreground segmentation with a background video using OpenCV. It stops as soon as the shorter of the two videos runs out of frames and shows progress with tqdm:

    import numpy as np
    from tqdm import tqdm

    # Read foreground and background videos
    capture = cv2.VideoCapture("input.mp4")
    capture_background = cv2.VideoCapture('background.mp4')
    # The output keeps the frame rate and size of the foreground video
    fps = capture.get(cv2.CAP_PROP_FPS)
    width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
    size = (width, height)
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter('output.mp4', fourcc, fps, size)
    # Frame counts are returned as floats; the shorter video sets the total
    frames = int(min(capture.get(cv2.CAP_PROP_FRAME_COUNT), capture_background.get(cv2.CAP_PROP_FRAME_COUNT)))
    bar = tqdm(total=frames)
    while True:
        ret1, frame1 = capture.read()
        ret2, frame2 = capture_background.read()
        # Stop when either video runs out of frames
        if not ret1 or not ret2:
            break
        image = cv2.cvtColor(frame1, cv2.COLOR_BGR2RGB)
        segmented = human_segment(segmentor, Image.fromarray(image))
        background = Image.fromarray(cv2.cvtColor(frame2, cv2.COLOR_BGR2RGB))
        changed = change_background(segmented, background)
        # Back to BGR for the OpenCV writer
        changed = cv2.cvtColor(np.array(changed), cv2.COLOR_RGB2BGR)
        out.write(changed)
        bar.update(1)
    # Close the progress bar and release all video handles
    bar.close()
    capture.release()
    capture_background.release()
    out.release()

This complete workflow enables developers to replace video backgrounds without a green screen, leveraging deep‑learning‑based portrait matting and standard Python libraries.
