Video Background Replacement Using RobustVideoMatting and Python
This tutorial explains how to use the open‑source RobustVideoMatting project to perform human portrait segmentation and replace video backgrounds, covering environment setup, model loading, custom image‑and‑video matting functions, and final video composition with OpenCV.
Many video chat applications let users change the background by extracting the human portrait and replacing everything else, usually with a static image. With image-segmentation techniques, the same idea extends to more sophisticated results that resemble film visual effects.
The article introduces the RobustVideoMatting project (https://github.com/PeterL1n/RobustVideoMatting) and shows how to clone the repository, install dependencies, and download a pre‑trained model (either rvm_mobilenetv3.pth or rvm_resnet50.pth).

    git clone https://github.com/PeterL1n/RobustVideoMatting.git
    cd RobustVideoMatting
    pip install -r requirements_inference.txt

After setting up, a Python script is created to load the model and run video matting:

    import torch
    from model import MattingNetwork
    from inference import convert_video

    # Choose mobilenetv3 or resnet50
    model = MattingNetwork('mobilenetv3').eval().cuda()  # or "resnet50"
    model.load_state_dict(torch.load('rvm_mobilenetv3.pth'))

    convert_video(
        model,
        input_source='input.mp4',
        output_type='video',
        output_composition='com.mp4',
        output_alpha='pha.mp4',
        output_video_mbps=4,
        downsample_ratio=None,
        seq_chunk=12,
    )

The script produces com.mp4 (a green-screen video) and pha.mp4 (the alpha mask). To perform custom image matting, the article defines a human_segment function that runs the model on a single image and returns a Pillow image with the segmented result:

    import cv2
    import torch
    from PIL import Image
    from torchvision import transforms
    from model import MattingNetwork

    # Fall back to the CPU when no GPU is available
    device = "cuda" if torch.cuda.is_available() else "cpu"
    segmentor = MattingNetwork('resnet50').eval().to(device)
    segmentor.load_state_dict(torch.load('rvm_resnet50.pth', map_location=device))

    def human_segment(model, image):
        # PIL image -> float tensor of shape (1, 3, H, W) scaled to [0, 1]
        src = (transforms.PILToTensor()(image) / 255.)[None].to(device)
        with torch.no_grad():
            fgr, pha, *rec = model(src)
        # Append the predicted alpha matte as a fourth channel, then reorder to (H, W, 4)
        segmented = torch.cat([src.cpu(), pha.cpu()], dim=1).squeeze(0).permute(1, 2, 0).numpy()
        segmented = cv2.normalize(segmented, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
        return Image.fromarray(segmented)  # RGBA image

    human_segment(segmentor, Image.open('xscn.jpg').convert('RGB')).show()

For video matting, a loop reads frames, applies human_segment to each one, and displays the result. An example using OpenCV:

    import cv2
    import numpy as np
    from PIL import Image

    # Assumes human_segment and segmentor from the previous snippet
    capture = cv2.VideoCapture("input.mp4")
    while True:
        ret, frame = capture.read()
        if not ret:
            break
        # OpenCV decodes frames as BGR; the model expects RGB
        image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        result = human_segment(segmentor, Image.fromarray(image))
        # Drop the alpha channel for display; cv2.imshow shows only the color channels
        result = cv2.cvtColor(np.array(result), cv2.COLOR_RGBA2BGR)
        cv2.imshow("result", result)
        cv2.waitKey(10)
    # Release the capture and close the window once the video ends
    capture.release()
    cv2.destroyAllWindows()

The article then outlines the four steps for video background replacement: read foreground and background frames, perform matting on each foreground frame, composite the segmented foreground onto the new background, and write the composed frames to an output video.
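The compositing step in that list is ordinary alpha blending: each output pixel is alpha times the foreground plus (1 - alpha) times the background. The following is a minimal NumPy sketch of that arithmetic, not code from the article. The function name composite and the array layout (uint8 images of equal size, a separate uint8 matte) are assumptions made for illustration.

```python
import numpy as np

def composite(foreground, alpha, background):
    """Blend foreground over background using a per-pixel alpha matte.

    foreground, background: uint8 arrays of shape (H, W, 3), same size.
    alpha: uint8 array of shape (H, W); 255 means fully foreground.
    (Hypothetical helper, for illustration only.)
    """
    # Scale alpha to 0..1 and add a channel axis so it broadcasts over RGB
    a = alpha.astype(np.float32)[..., None] / 255.0
    blended = a * foreground.astype(np.float32) + (1.0 - a) * background.astype(np.float32)
    return np.rint(blended).astype(np.uint8)
```

Pixels where the matte is 0 take the background color unchanged, pixels where it is 255 take the foreground color, and intermediate values (hair, motion blur) mix the two, which is what gives matting its soft edges compared with a hard segmentation mask.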
A helper function change_background pastes a segmented RGBA image onto a background image, using the image's own alpha channel as the mask:

    from PIL import Image

    def change_background(image, background):
        w, h = image.size
        # Match the background to the foreground size
        background = background.resize((w, h))
        # Paste the RGBA foreground, using its alpha channel as the mask
        background.paste(image, (0, 0), image)
        return background

Finally, the full video writing pipeline combines the foreground segmentation with a background video using OpenCV. It stops at the end of the shorter video when the two lengths differ and shows progress with tqdm:

    import cv2
    import numpy as np
    from PIL import Image
    from tqdm import tqdm

    # Read foreground and background videos
    capture = cv2.VideoCapture("input.mp4")
    capture_background = cv2.VideoCapture('background.mp4')
    # The output inherits frame rate and size from the foreground video
    fps = capture.get(cv2.CAP_PROP_FPS)
    width = int(capture.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(capture.get(cv2.CAP_PROP_FRAME_HEIGHT))
    size = (width, height)
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter('output.mp4', fourcc, fps, size)
    # Frame counts come back as floats; the shorter video bounds the loop
    frames = int(min(capture.get(cv2.CAP_PROP_FRAME_COUNT), capture_background.get(cv2.CAP_PROP_FRAME_COUNT)))
    bar = tqdm(total=frames)
    while True:
        ret1, frame1 = capture.read()
        ret2, frame2 = capture_background.read()
        if not ret1 or not ret2:
            break
        image = cv2.cvtColor(frame1, cv2.COLOR_BGR2RGB)
        segmented = human_segment(segmentor, Image.fromarray(image))
        background = Image.fromarray(cv2.cvtColor(frame2, cv2.COLOR_BGR2RGB))
        changed = change_background(segmented, background)
        changed = cv2.cvtColor(np.array(changed), cv2.COLOR_RGB2BGR)
        out.write(changed)
        bar.update(1)
    # Flush the writer and free both captures
    bar.close()
    capture.release()
    capture_background.release()
    out.release()

This complete workflow lets developers replace video backgrounds without a green screen, using deep-learning-based portrait matting and standard Python libraries.
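One possible refinement: RobustVideoMatting is a recurrent model, and the inference example in the project's README passes four recurrent states from each frame to the next, whereas human_segment calls model(src) and discards them. The sketch below shows how those states could be carried across frames. The generator name matte_frames is hypothetical, the call convention is assumed to match the README, and frames is assumed to yield tensors already shaped and normalized the way human_segment prepares them.

```python
def matte_frames(model, frames, downsample_ratio=1.0):
    """Yield (fgr, pha) for each frame while cycling the recurrent states.

    Assumes the RobustVideoMatting call convention:
    model(src, r1, r2, r3, r4, downsample_ratio) -> fgr, pha, r1, r2, r3, r4.
    (Hypothetical helper, for illustration only.)
    """
    rec = [None] * 4  # initial recurrent states: no history yet
    for src in frames:
        # Feed the previous states in and keep the updated ones for the next frame
        fgr, pha, *rec = model(src, *rec, downsample_ratio)
        yield fgr, pha
```

Because the function is a generator, the model runs while the caller iterates, so the loop that consumes it should sit inside a torch.no_grad() block just as human_segment does. Whether the temporal memory gives a visible improvement on a particular video is something to check empirically.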
Rare Earth Juejin Tech Community
