Target Re‑identification and Occluded Video Instance Segmentation: Applications in Insurance Claims and Pet Identification
The article introduces pet identity verification using target re‑identification and occluded video instance segmentation, describes recent ICCV VIPriors competitions where Ant Group’s insurance team achieved top ranks, and explains how these computer‑vision techniques are applied to insurance claims, pet identification, and future AI scenarios.
Let's start with a question.
Are the two dogs in the pictures below the same dog?
Answer: Yes.
This question tests pet identity verification. A person can attempt it by eye, but usually only the pet's owner can reliably tell two similar-looking dogs apart.
Another approach is "target re-identification", a fundamental capability in visual recognition.
In this field, a world-class competition was recently held: the Visual Inductive Priors (VIPriors) contest at ICCV. The top score in the target re-identification track reached 97% accuracy, and a participant from Ant Group's insurance technology team took third place with 94%.
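To make the idea concrete, re-identification is commonly framed as comparing feature vectors (embeddings) of two images and ranking known identities by similarity. The sketch below assumes the embeddings already exist; the vectors, the identity names, and the 0.8 threshold are hypothetical values for illustration, and this is not the competition entry described above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return (identity, score) pairs, most similar to the query first."""
    scored = [(identity, cosine_similarity(query, emb))
              for identity, emb in gallery.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def same_identity(query, candidate, threshold=0.8):
    """Verification decision; the threshold is an assumed value."""
    return cosine_similarity(query, candidate) >= threshold


# Hypothetical embeddings; a real system would obtain them from a trained model.
gallery = {"dog_a": [0.9, 0.1, 0.2], "dog_b": [0.1, 0.8, 0.5]}
query = [0.85, 0.15, 0.25]

ranking = rank_gallery(query, gallery)
best_identity = ranking[0][0]
```

In this toy setup the query lands closest to "dog_a". Accuracy figures such as those quoted for the contest depend on the benchmark's own evaluation protocol, which this sketch does not reproduce.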
Object recognition is a fundamental research area in computer vision, tasked with identifying what objects appear in an image and reporting their positions and orientations within the scene. Currently, object recognition methods can be divided into model‑based or context‑based approaches, and into 2D or 3D methods. Grimson summarized four widely accepted evaluation criteria: robustness, correctness, efficiency, and scope.
The VIPriors contest also has an image instance segmentation track, in which a participant from the team won second place.
Target re-identification and image instance segmentation are fundamental technologies in image object recognition: re-identification decides whether two images show the same individual, and instance segmentation outlines each object at pixel level.
For example, in underwriting and claims, image instance segmentation can extract irregularly shaped text regions, such as the text region of an electronic seal on an electronic invoice, as shown below.
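Once a segmentation model has produced a binary mask for such a region, cutting the region out is simple. The following is a minimal sketch that assumes the mask is already available; the tiny image and mask are made-up data, and the helper names are hypothetical.

```python
def mask_bounding_box(mask):
    """Tightest (top, left, bottom, right) box around nonzero mask pixels."""
    if not mask or not mask[0]:
        return None
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return None
    return rows[0], cols[0], rows[-1], cols[-1]


def extract_region(image, mask, background=0):
    """Crop the image to the mask's box and blank out pixels outside the mask."""
    box = mask_bounding_box(mask)
    if box is None:
        return []
    top, left, bottom, right = box
    return [
        [image[r][c] if mask[r][c] else background
         for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


# Made-up 3x4 grayscale image and an irregular (non-rectangular) mask.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
mask = [[0, 1, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]

region = extract_region(image, mask)  # [[2, 0], [6, 7]]
```

Blanking the pixels outside the mask is what separates this from a plain bounding-box crop: an irregular shape keeps only its own pixels, so neighbouring content does not leak into the extracted region.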
In addition to the image recognition and segmentation contests, this year's ICCV also hosted the Occluded Video Instance Segmentation (OVIS) competition.
The OVIS competition evaluates algorithms on videos with extensive occlusions among diverse objects, requiring detection, segmentation, and tracking of all objects.
Occluded video instance segmentation is a task that simultaneously classifies, segments, and tracks object instances in video. It applies to scenarios such as videos shot in pet communities and videos of people interacting with pets.
In this competition, a participant from Ant Group's insurance technology team secured first place!
The first‑place certificate looks like this.
Instance segmentation is one of the fundamental problems in computer vision.
Currently, extensive research has been conducted on instance segmentation for static images, but studies on (occluded) video instance segmentation are relatively scarce. In the real world, cameras capture video streams rather than isolated images, whether for real-time perception in autonomous driving, short or long videos in online media, or document recognition in intelligent claims. Therefore, developing models for video understanding is of great significance.
Compared with image‑level instance segmentation, video‑level techniques can fully exploit cross‑frame continuity and temporal context cues, but they also demand higher computational resources.
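One simple way to use cross-frame continuity is to link each mask in a new frame to the track whose previous mask it overlaps most. The sketch below is a greedy mask-IoU matcher under stated assumptions: masks are sets of (row, col) pixels, the 0.3 threshold is arbitrary, and the function names are hypothetical. It is a baseline illustration only, not the method used in the competition.

```python
def mask_iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def link_instances(tracks, detections, min_iou=0.3):
    """Greedily assign each detection to a track; start new tracks as needed.

    tracks maps track id -> last seen mask and is updated in place.
    Returns the track id chosen for each detection, in detection order.
    """
    pairs = sorted(
        ((mask_iou(mask, det), tid, i)
         for tid, mask in tracks.items()
         for i, det in enumerate(detections)),
        reverse=True,
    )
    assigned, used_tracks = {}, set()
    for iou, tid, i in pairs:
        if iou < min_iou:
            break  # remaining pairs overlap too little to be the same object
        if tid in used_tracks or i in assigned:
            continue
        assigned[i] = tid
        used_tracks.add(tid)
    next_id = max(tracks, default=-1) + 1
    for i in range(len(detections)):
        if i not in assigned:  # unmatched detection starts a new track
            assigned[i] = next_id
            next_id += 1
    for i, tid in assigned.items():
        tracks[tid] = detections[i]
    return [assigned[i] for i in range(len(detections))]


# Two made-up frames; the objects move slightly and swap detection order.
frame1 = [{(0, 0), (0, 1), (1, 0), (1, 1)}, {(5, 5), (5, 6)}]
frame2 = [{(5, 5), (5, 6), (5, 7)}, {(0, 1), (1, 1), (0, 2), (1, 2)}]

tracks = {}
ids1 = link_instances(tracks, frame1)  # [0, 1]
ids2 = link_instances(tracks, frame2)  # [1, 0]
```

A matcher like this tends to break down when an object is heavily occluded, because the visible part of its mask shrinks or vanishes and the overlap falls below the threshold. That failure mode is what makes the occluded setting hard and why longer-range temporal context is needed.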
Video instance segmentation was introduced as a new task in 2019 and has attracted attention from companies such as Facebook, ByteDance, and Tencent; its occluded variant remains in an early development stage.
Within insurance scenarios, this technology aids claim document understanding, e-commerce insurance product recognition, video interviews, and pet identity verification. It has already been applied in Ant Insurance's intelligent claims workflow to handle these complex cases, significantly improving claim efficiency and accuracy.
For example, the technology makes it easier to identify claim documents in video streams by extracting the topmost document from a stack (see Figure 1).
In pet insurance, it complements nose-print recognition: the algorithm can accurately segment three overlapping cats across the four video frames shown in Figure 2, enabling more precise animal identity verification.
In the future, this technology will have broad value in corporate loan document uploads, autonomous driving scene understanding, and background separation for people in short videos or live streams.
(Figure 1: Claim document extraction)
(Figure 2: Occluded pet segmentation)
AntTech
Technology is the core driver of Ant's future creation.