
This Week’s AI Pulse: GPT‑4o’s Exit, Full‑Duplex Voice, Open‑World AI & More

This week's roundup covers OpenAI's GPT‑4o leadership change, ByteDance's Seeduplex full‑duplex voice breakthrough, internal AI restrictions at JD.com and Meituan, Anthropic's Claude Mythos leak and its Glasswing response, Sam Altman's proposal for a new AI‑society contract, the Anthropic token‑usage controversy, Google's strategic outlook, AI‑driven marketing platforms, a 48 GB GPU performance comparison of Gemma and GPT‑OSS models, SentiAvatar's 3D digital‑human innovation, and the launch of Elseland, a low‑cost AI open world.

ZhongAn Tech Team

OpenAI leadership shift

Joanne Jang, the architect behind GPT‑4o's personality and behavior, announced her departure after four and a half years, citing personal reasons. She led the model‑behavior team that gave GPT‑4o the empathy, emoji use, and conversational tone that led many users to treat it as a companion. Her exit sparked community backlash and highlighted the tension between safety guardrails and the model's "human‑like" qualities.

ByteDance’s full‑duplex voice model

On April 9, ByteDance’s AI research team released Seeduplex, a full‑duplex speech model integrated into the Doubao app. Unlike traditional half‑duplex systems that rely on voice‑activity detection, Seeduplex processes listening and speaking simultaneously using a single LLM to distinguish user speech, background noise, and pauses. Technical gains include ~250 ms lower pause‑detection latency, a 40 % reduction in AI‑talk‑over‑user incidents, and a 300 ms shorter response delay. In real‑world tests, pause‑handling improved by 8 % over half‑duplex baselines, though overall conversational fluency still trails human dialogue.
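The core difference from half‑duplex systems is that the model listens and decides what to do at every tick, rather than gating turns on voice‑activity detection. A minimal sketch of that turn‑taking logic, assuming a hypothetical frame classifier standing in for the single LLM and an invented ~250 ms pause threshold (this is not Seeduplex's actual implementation):

```python
from dataclasses import dataclass
from enum import Enum

class FrameType(Enum):
    USER_SPEECH = "user_speech"
    NOISE = "noise"
    PAUSE = "pause"

@dataclass
class DuplexState:
    agent_speaking: bool = False
    pause_frames: int = 0

# Assumption: 50 ms audio frames, so 5 silent frames ~ 250 ms pause.
PAUSE_FRAMES_TO_TAKE_TURN = 5

def step(state: DuplexState, frame, classify) -> DuplexState:
    """One tick of a full-duplex loop: classify incoming audio while
    simultaneously deciding whether the agent keeps speaking.

    `classify` stands in for the single model that labels each frame
    as user speech, background noise, or a pause.
    """
    kind = classify(frame)
    if kind is FrameType.USER_SPEECH:
        state.pause_frames = 0
        if state.agent_speaking:
            state.agent_speaking = False  # barge-in: stop talking over the user
    elif kind is FrameType.PAUSE:
        state.pause_frames += 1
        if state.pause_frames >= PAUSE_FRAMES_TO_TAKE_TURN and not state.agent_speaking:
            state.agent_speaking = True  # user seems done; take the turn
    # NOISE frames are ignored rather than treated as end-of-turn,
    # which is what a VAD-gated half-duplex system tends to get wrong.
    return state
```

Because noise and speech are distinguished per frame, background chatter neither cuts the agent off nor delays its response, which is where the reported latency and talk‑over gains come from.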

Enterprise AI restrictions in China

JD.com and Meituan have tightened internal AI usage policies. JD.com now blocks employee access to external LLMs (including ChatGPT, Gemini, Claude, etc.) and redirects them to its proprietary JoyAI model, with a limited approval process for exceptional cases. Meituan discourages the use of Alibaba’s Qwen model, preferring its own LongCat model while requiring justification for any external model usage. Both moves aim to protect data security and accelerate the adoption of self‑developed models.

Claude Mythos leak and Project Glasswing

Anthropic’s confidential Claude Mythos model was unintentionally exposed, revealing a system that autonomously discovered long‑standing vulnerabilities in OpenBSD and FFmpeg and assembled a full exploit chain in Linux benchmarks. Mythos outperformed the previous flagship Opus 4.6 on CyberGym (83.1 % vs 66.6 %) and achieved a near‑perfect score on SWE‑bench Verified (93.9 %). In response, Anthropic launched Project Glasswing, offering a $100 M usage grant to partners and delivering Mythos via regulated APIs on AWS Bedrock and Google Vertex AI to ensure traceability and safety.

Sam Altman’s AI‑society contract

OpenAI CEO Sam Altman warned that super‑intelligent AI will arrive faster than expected and called for a new social contract. Proposals include a public‑wealth fund funded by AI companies, a robot tax to support displaced workers, a four‑day workweek without wage cuts, and universal AI access as a basic right. Altman also hinted at a next‑generation model (rumored “Spud”) that could enable career‑defining discoveries.

Anthropic token‑usage controversy

Luo Fuli (罗福莉) explained that Anthropic’s subscription model was destabilized by third‑party harnesses like OpenClaw, which generated excessive low‑value tool calls and inflated token consumption. The resulting cost shock forces developers to improve context management and cache efficiency, shifting the industry focus from cheap tokens to “effective work per token”.
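"Effective work per token" can be made concrete with simple accounting over a session of tool calls. The sketch below is hypothetical: it assumes a provider that discounts cache‑read prompt tokens to 10 % of the normal rate (a common pricing pattern, not a specific Anthropic figure), and treats useful output tokens as the numerator:

```python
def token_efficiency(calls):
    """Aggregate token economics across a session of tool calls.

    Each call is a tuple (prompt_tokens, cached_prompt_tokens, useful_output_tokens).
    Returns (billed_token_equivalents, effective_work_per_token), assuming
    cache-read tokens cost 10% of a fresh prompt token (an illustrative rate).
    """
    billed = 0.0
    useful = 0
    for prompt, cached, out in calls:
        billed += (prompt - cached) + 0.1 * cached + out
        useful += out
    return billed, useful / billed
```

Under this accounting, a harness that re‑sends a large uncached context on every low‑value tool call drives the ratio toward zero, while aggressive prefix caching and context pruning push it back up, which is exactly the optimization pressure described above.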

Google’s strategic outlook

Sundar Pichai reflected on Google's missed early‑stage LLM opportunities, emphasized the steep value‑growth curve of AI, and warned of a 2026 "supply‑tightening year" for chips and memory. He outlined massive capital spending (US$175–185 billion) on infrastructure, including space‑based data centers, to sustain future AI workloads.

AI‑driven marketing platform

ByteDance’s “Pinstar Cloud AI” (品星云AI) connects brand strategy, content creation, and ad delivery in a closed loop. The system uses RAG‑enhanced LLMs to generate insights and scripts within minutes, automates influencer matching, and offers AI‑generated short‑form videos that cut production time from weeks to hours. Reported KPI lifts include a 50 % increase in influencer ROI and a 91 % rise in user‑generated AIGC submissions.
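The RAG step that grounds the generated scripts in brand knowledge can be sketched in a few lines. This is an illustrative stand‑in, not Pinstar Cloud AI's pipeline: keyword overlap replaces embedding retrieval, and the prompt template is invented:

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank brand-knowledge snippets by naive keyword overlap with the brief.
    A production system would use embedding similarity instead."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

def build_script_prompt(query: str, docs: list[str]) -> str:
    """Assemble a grounded prompt: retrieved facts first, then the brief,
    so the LLM drafts scripts from brand data rather than from thin air."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return (
        "Using only the brand facts below, draft a 30-second short-video script.\n"
        f"Brand facts:\n{context}\n"
        f"Brief: {query}"
    )
```

Constraining generation to retrieved facts is what lets such a system turn out on‑brand scripts in minutes while avoiding hallucinated product claims.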

48 GB GPU model performance comparison

Using a single 48 GB RTX 4090 on Ubuntu 22.04 with CUDA 12.4, three open‑source LLMs—Gemma 4 26B (MoE), Gemma 4 31B (dense), and GPT‑OSS 20B—were benchmarked under INT4, INT8, and FP16 quantization via llama.cpp and vLLM. Results: Gemma 4 26B INT4 uses 18 GB VRAM, delivers 32 TPS single‑thread and 18 TPS at 8‑way concurrency with stable 128 K context. Gemma 4 31B INT4 consumes 24 GB VRAM, 25 TPS single‑thread, dropping to 12 TPS at 8‑way concurrency. GPT‑OSS 20B INT4 occupies 15 GB VRAM, peaks at 38 TPS single‑thread but shows weaker long‑context coherence. INT4 quantization proved optimal for single‑card deployment, and vLLM outperformed llama.cpp in concurrent workloads.
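The single‑thread and 8‑way concurrency numbers above come from timing token generation under load. A minimal harness for reproducing that kind of measurement, with `generate` as a placeholder for whatever completion call your llama.cpp or vLLM client exposes (the function name and signature here are assumptions, not either library's API):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def measure_tps(generate, prompt: str, n_tokens: int = 64, concurrency: int = 1) -> float:
    """Return aggregate tokens/second across `concurrency` parallel requests.

    `generate(prompt, n_tokens)` must return the generated tokens; it stands
    in for a real llama.cpp or vLLM completion call.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(generate, prompt, n_tokens) for _ in range(concurrency)]
        total = sum(len(f.result()) for f in futures)  # count tokens actually produced
    elapsed = time.perf_counter() - start
    return total / elapsed
```

Comparing `measure_tps(..., concurrency=1)` against `concurrency=8` surfaces exactly the scaling gap reported here, e.g. the dense 31B model losing more than half its throughput under 8‑way load while the serving stack (vLLM's batched scheduler vs. llama.cpp) determines how gracefully aggregate TPS holds up.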

SentiAvatar 3D digital‑human breakthrough

SentiAvatar introduces a “plan‑then‑infill” two‑stage pipeline: an LLM‑based semantic planner generates sparse key‑frame action tokens from textual intent and audio cues; a Body‑Infill Transformer interpolates intermediate frames using continuous HuBERT audio features; a separate Face‑Infill Transformer produces facial expression tokens directly from audio. On the SuSuInterActs dataset (2.1 k clips, 37 h of Chinese dialogue), the system achieves 43.64 % R@1 on text‑action retrieval and sets new SOTA on BEATv2 (FGD 4.941, BC 8.078). End‑to‑end latency is ~0.53 s for a 6‑second motion, supporting unlimited streaming generation.
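The plan‑then‑infill structure can be sketched numerically: a planner emits sparse key‑frame values, and an infill stage fills every intermediate frame. In this toy version (my construction, not SentiAvatar's code) the planner is a stub and linear interpolation stands in for the audio‑conditioned Body‑Infill Transformer:

```python
def plan_keyframes(n_frames: int, stride: int, planner) -> dict[int, float]:
    """Stage 1: a planner (here a stub for the LLM-based semantic planner)
    emits sparse key-frame action values at every `stride`-th frame,
    always including the final frame."""
    idxs = list(range(0, n_frames, stride))
    if idxs[-1] != n_frames - 1:
        idxs.append(n_frames - 1)
    return {i: planner(i) for i in idxs}

def infill(n_frames: int, keyframes: dict[int, float]) -> list[float]:
    """Stage 2: densify the sparse plan. Linear interpolation stands in for
    the Body-Infill Transformer, which would condition each intermediate
    frame on continuous audio features rather than on time alone."""
    idxs = sorted(keyframes)
    out = []
    for lo, hi in zip(idxs, idxs[1:]):
        for i in range(lo, hi):
            t = (i - lo) / (hi - lo)
            out.append((1 - t) * keyframes[lo] + t * keyframes[hi])
    out.append(keyframes[idxs[-1]])
    return out
```

Splitting the work this way is what enables streaming: the expensive planner runs only on sparse key frames, while the cheap infill stage densifies each new segment as audio arrives, so generation length is unbounded.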

Elseland: a low‑cost AI open‑world

Former ByteDance engineer Liu Geng, a Peking University PhD, built Elseland, the first AI‑generated open world, in 49 days with a budget of US$5 k and 30 k lines of code. The platform offers a multi‑era, multi‑race storyline, a suite of editors (role, world, script, mini‑game), and AI‑assisted content creation where agents can generate maps, characters, and gameplay from a single prompt. Seventeen specialized agents handle development, testing, and asset generation, demonstrating that AI can compress months of work into days.

Overall, the week highlights accelerating AI capabilities—from full‑duplex speech and high‑performance local LLM deployment to AI‑powered content creation and governance—while also exposing emerging challenges around data security, token economics, and the societal impact of increasingly autonomous systems.

Written by

ZhongAn Tech Team

China's first online insurer. Through technology innovation, ZhongAn makes insurance simpler, warmer, and more valuable. Its technology supports 50 billion RMB of policies and serves 600 million users with smart, personalized solutions. This is where the ZhongAn tech team shares its engineering work and articles.
