Analysis of the Redis‑py Bug that Caused a ChatGPT Data Leak

A recent Redis‑py library vulnerability caused ChatGPT to expose personal data of about 1.2% of Plus users, prompting an OpenAI apology, a detailed post‑mortem, and a series of backend and security fixes to prevent similar incidents.

DataFunSummit

Mar 29, 2023

Recently, a bug in the open-source Redis client library redis-py triggered a major incident for ChatGPT, allowing a small subset of users to see other users' personal information and titles from their chat histories.

OpenAI CEO Sam Altman publicly apologized on Twitter, attributing the issue to a bug in an open-source library and stating that a fix had been released and that only a small percentage of users were affected.

In its post-mortem report, OpenAI explained that the flaw originated in the redis-py library it uses to cache user data. The exposed data included subscribers' names, email addresses, billing addresses, the last four digits of credit card numbers, and card expiration dates. The affected users represented about 1.2% of the ChatGPT Plus subscribers who were active during a nine-hour window.

The technical investigation revealed that OpenAI caches user information in Redis to avoid a database lookup on every request, employs Redis Cluster to distribute the load, and uses the asyncio variant of the redis-py client to integrate with its asyncio-based Python server. The client maintains a shared pool of connections, and each connection behaves like two queues: the caller pushes a request onto the incoming queue, pops the response from the outgoing queue, and then returns the connection to the pool.
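The caching pattern described above can be sketched as a cache-aside lookup. This is only an illustration under stated assumptions, not OpenAI's code: `InMemoryCache` is a hypothetical in-memory stand-in for an async Redis client, and `load_user_from_db`, `get_user`, and the `user:<id>` key format are invented for the example.

```python
import asyncio
import json


class InMemoryCache:
    """Hypothetical stand-in for an async Redis client (get/set only)."""

    def __init__(self):
        self._data = {}

    async def get(self, key):
        return self._data.get(key)

    async def set(self, key, value):
        self._data[key] = value


db_lookups = 0  # counts how often the "database" is actually hit


async def load_user_from_db(user_id):
    """Hypothetical slow path: pretend to query the primary database."""
    global db_lookups
    db_lookups += 1
    await asyncio.sleep(0)  # yield to the event loop like real I/O would
    return {"user_id": user_id, "name": f"user-{user_id}"}


async def get_user(cache, user_id):
    """Cache-aside: try the cache first, fall back to the database on a miss."""
    key = f"user:{user_id}"
    cached = await cache.get(key)
    if cached is not None:
        return json.loads(cached)
    user = await load_user_from_db(user_id)
    await cache.set(key, json.dumps(user))
    return user


async def main():
    cache = InMemoryCache()
    first = await get_user(cache, "42")   # miss: goes to the database
    second = await get_user(cache, "42")  # hit: served from the cache
    return first, second


first, second = asyncio.run(main())
print(first == second, db_lookups)
```

The second call never reaches the database, which is the point of the cache. It also shows why a corrupted cache response is dangerous: the caller trusts whatever comes back for the key.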

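The bug is a race condition: if a request is cancelled after it has been pushed onto the incoming queue but before its response has been popped from the outgoing queue, the connection is left in a corrupted state. The response to the cancelled request stays behind in the connection, so the next, unrelated request that reuses the connection can receive it. In most cases the stale data does not match what the caller expects and the result is an unrecoverable server error, so the user simply has to retry. Occasionally, however, the stale data happens to match the expected type, and another user's data is returned as if it were valid.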
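The failure mode can be reproduced in miniature without Redis. The following is a toy model, not redis-py's actual implementation: `FakeConnection`, `fetch`, and the `profile-of-...` strings are hypothetical, and the 10 ms reply delay is an arbitrary assumption standing in for network latency.

```python
import asyncio


class FakeConnection:
    """Toy model of one pooled connection: replies come back in request order."""

    def __init__(self):
        self.responses = asyncio.Queue()

    def send(self, key):
        # The "server" answers a little later; the reply lands on this connection
        # regardless of whether anyone is still waiting for it.
        loop = asyncio.get_running_loop()
        loop.call_later(0.01, self.responses.put_nowait, f"profile-of-{key}")

    async def read(self):
        return await self.responses.get()


async def fetch(conn, key):
    conn.send(key)            # request pushed onto the connection
    return await conn.read()  # a cancellation here strands the reply


async def demo():
    conn = FakeConnection()

    # Alice's request is sent, then cancelled before its reply is read.
    task_a = asyncio.create_task(fetch(conn, "alice"))
    await asyncio.sleep(0)  # let task_a run until it blocks in read()
    task_a.cancel()
    try:
        await task_a
    except asyncio.CancelledError:
        pass

    # The connection is reused for Bob without being reset.
    return await fetch(conn, "bob")


result = asyncio.run(demo())
print(result)  # profile-of-alice
```

Bob's request receives Alice's reply because the connection was reused while a response was still in flight. A general remedy for this class of bug is to treat a connection as unusable once a request on it has been cancelled mid-flight, and to close it instead of returning it to the pool.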

On March 20 (Pacific Time), a change OpenAI made to its servers unintentionally caused a spike in cancelled Redis requests, which raised the probability that a connection would return stale data. The bug appeared only in the asyncio redis-py client when used with Redis Cluster, and it has since been fixed.

OpenAI has applied the patch, expanded the Redis cluster, added additional validation checks to ensure cached data matches the requesting user, and conducted extensive testing to prevent recurrence.
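The validation step mentioned above, checking that cached data belongs to the requesting user, could look roughly like this. It is a minimal sketch under assumptions: the report does not describe OpenAI's actual check, and `decode_cached_user`, `CacheOwnershipError`, and the `user_id` field are hypothetical names.

```python
import json


class CacheOwnershipError(Exception):
    """Raised when a cached record does not belong to the requesting user."""


def decode_cached_user(raw, expected_user_id):
    """Decode a cached JSON record and refuse it unless it matches the requester."""
    record = json.loads(raw)
    if not isinstance(record, dict) or record.get("user_id") != expected_user_id:
        # Fail closed: an error is preferable to serving someone else's data.
        raise CacheOwnershipError(f"cached record does not match user {expected_user_id!r}")
    return record


alice_raw = json.dumps({"user_id": "alice", "email": "alice@example.com"})

# The matching user gets the record back.
print(decode_cached_user(alice_raw, "alice")["email"])

# A mismatched record is rejected instead of being returned.
try:
    decode_cached_user(alice_raw, "bob")
except CacheOwnershipError as exc:
    print("rejected:", exc)
```

A check like this does not fix the underlying race, but it turns a silent data leak into a visible error even if a connection returns the wrong response.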

The incident underscores the ongoing security challenges inherent in rapid AI deployment, highlighting the need for robust data handling, thorough testing, and continuous monitoring of backend infrastructure.
