Boost Security and Accuracy with a Multi‑Engine Voiceprint Fusion Service
This article introduces a multi‑engine voiceprint fusion service that combines Tencent Cloud, iFlytek, and a self‑developed model, detailing its architecture, intelligent scheduling, flexible API, technical highlights, typical high‑security scenarios, and usage specifications with example code for developers.
Background
In high‑security scenarios such as finance, security, and intelligent customer service, voiceprint recognition is an important biometric authentication method. However, single‑vendor solutions face challenges: unstable enrollment, varying real‑time and accuracy requirements, and difficulty unifying commercial services with self‑developed models.
Product Overview
We built a voiceprint fusion service that integrates Tencent Cloud, iFlytek, and a self‑developed engine, using flexible scheduling to provide higher availability, precision, and flexibility.
The overall solution consists of three core modules:
1️⃣ Engine Integration
Tencent Cloud Voiceprint: a high-availability, standardized API suited to large-scale concurrent calls and finance-grade compliance.
iFlytek Voiceprint: strong performance on Mandarin and multiple dialects, optimized for China's linguistic diversity.
Self-Developed Voiceprint: built on a proprietary deep-learning framework; provides basic voiceprint capability and supports horizontal scaling.
2️⃣ Intelligent Scheduling Strategy
Multi-engine concurrent verification: the same audio can be submitted to multiple engines simultaneously, reducing dependence on any single engine.
Unified result output: results from all engines are aggregated and compared, then returned in a consistent API response.
Scalable foundation: supports elastic node expansion and load balancing.
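The concurrent verification and unified output described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual scheduler: the three engine functions are hypothetical stand-ins, the similarity scores are made up, and averaging the scores against a fixed threshold is an assumed fusion rule that the article does not specify.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


# Hypothetical stand-ins for the three engines. Each returns a similarity
# score in [0, 1]; real engines would be remote calls.
def tencent_engine(audio: bytes) -> float:
    return 0.92


def iflytek_engine(audio: bytes) -> float:
    return 0.88


def self_developed_engine(audio: bytes) -> float:
    raise TimeoutError("engine unavailable")  # simulate an outage


ENGINES: Dict[str, Callable[[bytes], float]] = {
    "tencent": tencent_engine,
    "iflytek": iflytek_engine,
    "self_developed": self_developed_engine,
}


def verify_concurrently(audio: bytes, threshold: float = 0.8) -> dict:
    """Submit the same audio to every engine and merge the outcomes."""
    # Leaving the `with` block waits for all submitted calls to finish.
    with ThreadPoolExecutor(max_workers=len(ENGINES)) as pool:
        futures = {name: pool.submit(fn, audio) for name, fn in ENGINES.items()}

    results = {}
    for name, future in futures.items():
        try:
            results[name] = {"ok": True, "score": future.result()}
        except Exception as exc:  # one failing engine must not fail the request
            results[name] = {"ok": False, "error": str(exc)}

    scores = [r["score"] for r in results.values() if r["ok"]]
    # Assumed fusion rule: plain mean of the scores that came back.
    fused = sum(scores) / len(scores) if scores else None
    return {
        "engines": results,
        "fused_score": fused,
        "passed": fused is not None and fused >= threshold,
    }
```

Because a failed engine is recorded rather than raised, the caller still gets a single consistent response when one vendor is down, which is the single-engine risk reduction the list refers to.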
3️⃣ Flexible Access
Cloud service support: deployed on high-availability cloud infrastructure with on-demand scaling.
Unified call entry: a single API endpoint abstracts away vendor-specific integration.
Technical Highlights
Multi-engine fusion algorithm: audio is submitted to several engines and the results are aggregated for higher availability and accuracy.
Custom self-developed model: tuned for noise robustness and short-phrase compatibility.
High-concurrency, low-latency architecture: microservice design, asynchronous task scheduling, distributed queues, and automatic failover and retry.
Unified API and management: a consistent API with multi-tenant isolation.
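To make the "automatic failover and retry" point concrete, here is a small sketch of one way such logic could look. It is an assumption about the general pattern, not the service's implementation: the engine callables, the retry count, and the linear backoff are all hypothetical.

```python
import time
from typing import Callable, List, Sequence, Tuple

Engine = Callable[[bytes], float]


def call_with_failover(
    engines: Sequence[Tuple[str, Engine]],
    audio: bytes,
    retries: int = 2,
    backoff_s: float = 0.0,
) -> Tuple[str, float]:
    """Try engines in priority order, retrying each before moving on.

    Returns (engine_name, score) from the first call that succeeds.
    """
    errors: List[str] = []
    for name, engine in engines:
        for attempt in range(1, retries + 1):
            try:
                return name, engine(audio)
            except Exception as exc:
                errors.append(f"{name} attempt {attempt}: {exc}")
                time.sleep(backoff_s * attempt)  # linear backoff (assumed policy)
    raise RuntimeError("all engines failed: " + "; ".join(errors))


def make_flaky_engine(fail_times: int, score: float) -> Engine:
    """Build a hypothetical engine that fails `fail_times` times, then succeeds."""
    state = {"calls": 0}

    def engine(audio: bytes) -> float:
        state["calls"] += 1
        if state["calls"] <= fail_times:
            raise ConnectionError("connection reset")
        return score

    return engine
```

For example, `call_with_failover([("primary", make_flaky_engine(5, 0.9)), ("backup", make_flaky_engine(1, 0.87))], b"audio")` exhausts both attempts on the primary, then succeeds on the backup's second attempt.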
Typical Application Scenarios
Phone banking and remote account opening: multi-channel verification, with automatic model switching under poor network conditions.
Smart government identity verification: voiceprint plus facial recognition for multi-factor authentication, deployable in on-premises data centers.
Enterprise attendance and access control: precise employee identification, with integration into attendance systems.
Usage Specification
1️⃣ Interfaces
Create voiceprint repository
Add audio feature
Update audio feature
Delete specific audio feature
Delete repository
Feature compare 1:1 (audio vs repository)
Feature compare 1:1 v2 (audio vs audio)
Feature compare 1:N (audio vs repository)
2️⃣ Call Flow
Direct audio‑to‑audio comparison → use 1:1 v2.
Audio vs. repository, single target (1:1) → create repository → add feature → compare 1:1.
Audio vs. repository, multiple candidates (1:N) → create repository → add feature → compare 1:N.
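The three flows above can be summarized as a small lookup. The sketch below uses the interface names from the list in this section; the `Comparison` enum, the `plan_calls` helper, and the `repository_ready` flag are hypothetical, and the flag assumes that an already enrolled repository can be reused across comparisons, which the article does not state explicitly.

```python
from enum import Enum
from typing import List


class Comparison(Enum):
    AUDIO_VS_AUDIO = "1:1 v2"
    AUDIO_VS_REPOSITORY = "1:1"
    AUDIO_VS_REPOSITORY_MULTI = "1:N"


# Enrollment steps that precede any repository-based comparison.
SETUP_STEPS = ["Create voiceprint repository", "Add audio feature"]

COMPARE_STEP = {
    Comparison.AUDIO_VS_AUDIO: "Feature compare 1:1 v2",
    Comparison.AUDIO_VS_REPOSITORY: "Feature compare 1:1",
    Comparison.AUDIO_VS_REPOSITORY_MULTI: "Feature compare 1:N",
}


def plan_calls(kind: Comparison, repository_ready: bool = False) -> List[str]:
    """Return the ordered interface calls for a comparison type."""
    needs_repository = kind is not Comparison.AUDIO_VS_AUDIO
    # Assumption: setup can be skipped once a repository has been enrolled.
    steps = list(SETUP_STEPS) if needs_repository and not repository_ready else []
    steps.append(COMPARE_STEP[kind])
    return steps
```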
3️⃣ Example
Create repository
curl -X POST "http://localhost:8080/voiceprintRecognition/createRepository" \
-H "Content-Type: application/json" \
-H "x-zcs-version: v1" \
-H "Authorization: Bearer xxx" \
-d '{ }'
Add audio feature
# Encode the audio as a single base64 line (-w 0 is the GNU coreutils option that disables wrapping)
base64_output=$(base64 -w 0 test_audio.mp3)
curl -X POST http://localhost:8080/voiceprintRecognition/addAudioFeature \
-H "Content-Type: application/json" \
-H "x-zcs-version: v1" \
-H "Authorization: Bearer xxx" \
-d '{
"VoiceFormat": 2,
"SampleRate": 16000,
"Data": "'"$base64_output"'",
"GroupId": "OtvMnkXnQGGmcSCmHbDycbQaeZtsgkCV"
}'
Feature compare 1:N
# Encode the probe audio as a single base64 line (-w 0 is the GNU coreutils option that disables wrapping)
base64_data=$(base64 -w 0 test_audio.mp3)
curl -X POST http://localhost:8080/voiceprintRecognition/compareAudioFeatureN \
-H "Content-Type: application/json" \
-H "x-zcs-version: v1" \
-H "Authorization: Bearer xxx" \
-d '{
"VoiceFormat": 2,
"SampleRate": 16000,
"Data": "'"$base64_data"'",
"TopN": 10,
"GroupId": "OtvMnkXnQGGmcSCmHbDycbQaeZtsgkCV"
}'
Cooperation and Access
The core functions are complete, and the service is open for pilot integration across industries. Partners can integrate through the standard API to co-develop and expand voiceprint applications.
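For partners integrating from application code rather than the shell, the 1:N comparison request from the examples can be assembled with the Python standard library. This is only a sketch: the URL, headers, and field names are copied from the curl examples, while the `build_compare_n_request` helper and its defaults are hypothetical. The response format is not documented in this article, so the sketch stops at building the request.

```python
import base64
import json
import urllib.request

# Base URL taken from the curl examples; replace it with the real deployment address.
BASE_URL = "http://localhost:8080/voiceprintRecognition"


def build_compare_n_request(
    audio: bytes,
    group_id: str,
    token: str,
    top_n: int = 10,
    voice_format: int = 2,
    sample_rate: int = 16000,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the 1:N comparison."""
    payload = {
        "VoiceFormat": voice_format,
        "SampleRate": sample_rate,
        "Data": base64.b64encode(audio).decode("ascii"),
        "TopN": top_n,
        "GroupId": group_id,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/compareAudioFeatureN",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-zcs-version": "v1",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the payload easy to inspect and unit test without a running service.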
360 Zhihui Cloud Developer
360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.