Tagged articles
7 articles
AI Engineering
Apr 28, 2026 · Artificial Intelligence

Insanely Fast Whisper speeds audio transcription 19× with Flash Attention 2

The open-source Insanely Fast Whisper CLI tool uses Flash Attention 2 to accelerate OpenAI Whisper transcription roughly 19-fold: on an Nvidia A100, transcribing 2.5 hours of audio drops from 31 minutes to just 98 seconds. It preserves accuracy and adds multilingual support, speaker diarization, and precise timestamps.
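
As a quick sanity check on the headline figure, the sketch below recomputes the speedup from the numbers quoted in the summary (31 minutes, 98 seconds, 2.5 hours of audio). The variable names are illustrative only and are not taken from the tool.

```python
# Recompute the quoted speedup using only the figures in the summary above.
baseline_seconds = 31 * 60       # 31 minutes without Flash Attention 2
accelerated_seconds = 98         # 98 seconds with Flash Attention 2
audio_seconds = 2.5 * 3600       # length of the transcribed audio

# Ratio of old to new wall-clock time.
speedup = baseline_seconds / accelerated_seconds
# Seconds of audio processed per second of wall-clock time.
realtime_factor = audio_seconds / accelerated_seconds

print(f"speedup: {speedup:.1f}x")
print(f"audio seconds per wall-clock second: {realtime_factor:.0f}")
```

The ratio comes out just under 19, which is consistent with the rounded "19×" in the title.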

CLI tool, Flash Attention 2, GPU Acceleration
4 min read
Woodpecker Software Testing
Jan 25, 2026 · Artificial Intelligence

Integrating LLMs with Speech: Whisper, Vosk, and Alibaba Cloud in Python and JavaScript

This tutorial walks through setting up local speech recognition with OpenAI's Whisper and Vosk, leveraging Alibaba Cloud's ASR services, building a WebSocket server/client for real‑time audio streaming, capturing audio in the browser via MediaRecorder or RecordRTC, and performing speech synthesis with pyttsx3 and Alibaba's Sambert model.

Alibaba Cloud, JavaScript, Python
20 min read
System Architect Go
Nov 24, 2024 · Artificial Intelligence

Building a Web Voice Chatbot with Whisper, llama.cpp, and LLM

This article demonstrates how to build a web‑based voice chatbot by integrating Whisper speech‑to‑text, llama.cpp LLM inference, and WebSocket communication, detailing both the frontend JavaScript implementation and the Python FastAPI backend, along with Docker deployment and example code.

FastAPI, JavaScript, LLM
10 min read
Programmer DD
May 16, 2023 · Artificial Intelligence

Inside OpenAI: How the Platform Democratizes Generative AI

Since its 2015 founding, OpenAI has built a suite of generative AI models—including GPT, DALL‑E, and Whisper—exposed via simple REST APIs, enabling developers to integrate advanced language, vision, and speech capabilities without deep ML expertise, while offering fine‑tuning, SDKs, and Azure integration.

API, DALL·E, GPT
9 min read