Resolving Duplicate OpenID Insertions with Distributed Locks in a Fast App Center

To prevent duplicate OpenID records caused by concurrent synchronization requests in the Fast App Center, this article analyzes the root cause, evaluates database‑level unique indexes versus application‑level distributed locks, and presents a Redis‑based lock implementation with cleanup procedures to ensure data consistency.

Sohu Tech Products

Aug 4, 2021

Many user‑facing internet services keep a copy of user data on the backend. The Fast App Center stores each user's saved apps, identifying the account by OpenID on the server and by a local identifier on the client. When the client starts, it sends both the OpenID and the local identifier to the server, which inserts a new row if the OpenID does not exist and updates the existing row otherwise.

After deployment, duplicate OpenID rows appeared in the t_account table. Because the queries used LIMIT 1, functionality was not affected, but the duplicates pointed to a concurrency problem.

Investigation showed that about 3% of OpenIDs had multiple rows with identical creation timestamps and consecutive auto‑increment IDs, suggesting that concurrent requests caused the duplicate inserts. Network latency, server load, and client‑side retry mechanisms could trigger multiple simultaneous synchronization calls.
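Identical timestamps and consecutive IDs are what one would expect when two requests both run the existence check before either of them inserts. The sketch below replays that interleaving deterministically against an in-memory list. The class and its methods are hypothetical stand-ins for the t_account table, not the service's real DAO.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for t_account; each element is one row's open_id.
class CheckThenInsertRace {
    static final List<String> rows = new ArrayList<>();

    static boolean exists(String openId) {
        return rows.contains(openId);
    }

    static void insert(String openId) {
        rows.add(openId);
    }

    // Replays the problematic ordering: both requests check before either inserts.
    // Returns the number of rows stored for the OpenID afterwards.
    static int replay(String openId) {
        rows.clear();
        boolean requestASeesRow = exists(openId); // request A: no row yet
        boolean requestBSeesRow = exists(openId); // request B: still no row
        if (!requestASeesRow) {
            insert(openId);                       // request A inserts
        }
        if (!requestBSeesRow) {
            insert(openId);                       // request B inserts a duplicate
        }
        return rows.size();
    }
}
```

Calling CheckThenInsertRace.replay("u1") returns 2: each request acted on a check result that was already stale by the time it wrote.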

The core issue is a classic concurrent‑write conflict: a "check‑then‑insert" pattern that is not atomic. Two typical mitigation strategies were considered:

Database‑level solution: add a UNIQUE index on the open_id column. The DDL is ALTER TABLE t_account ADD UNIQUE uk_open_id(open_id);. If a second insert carries an open_id that already exists, MySQL rejects it with error 1062 (duplicate entry), so a duplicate row can never be stored.

Application‑level solution: use a distributed lock to serialize the critical section, ensuring that the check and insert/update happen atomically.

Because existing duplicate rows prevented the immediate creation of a UNIQUE index, the team chose the distributed‑lock approach.
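For comparison, here is a minimal sketch of what the database‑level option could look like once the duplicates are gone and the uk_open_id index exists. It is not what the team deployed. It uses MySQL's INSERT ... ON DUPLICATE KEY UPDATE through plain JDBC, and the column name local_identifier is an assumption, since the article names only open_id.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class AccountUpsert {
    // Assumes a UNIQUE index on open_id; local_identifier is a hypothetical column name.
    static final String UPSERT_SQL =
        "INSERT INTO t_account (open_id, local_identifier) VALUES (?, ?) "
      + "ON DUPLICATE KEY UPDATE local_identifier = VALUES(local_identifier)";

    // Inserts the row, or updates the existing one when open_id is already present.
    // The unique index makes this a single atomic statement, so no lock is needed.
    static int upsert(Connection conn, String openId, String localIdentifier)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(UPSERT_SQL)) {
            ps.setString(1, openId);
            ps.setString(2, localIdentifier);
            return ps.executeUpdate();
        }
    }
}
```

With this approach the check and the write collapse into one statement, which is why a unique index is often the simpler fix when the table is clean enough to accept it.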

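A distributed lock should provide mutual exclusion, high availability, low overhead, and automatic expiration so that a crashed holder cannot block other clients forever. Re‑entrancy is often listed as a requirement too, although the simple lock used here does not provide it. Common implementations include: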

Database‑based lock tables (e.g., a myLock table with a UNIQUE method_name index).

Zookeeper‑based locks using sequential ephemeral nodes.

Redis‑based locks using the SET key value NX EX seconds command.

The Redis implementation was selected. A simple Java class RedisLock wraps JedisCluster and provides lock(openId) and unlock(openId) methods, using a 3‑second expiration:

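import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import redis.clients.jedis.JedisCluster;

@Component
public class RedisLock {
    private static final String LOCK_SUCCESS = "OK";
    private static final String LOCK_VALUE = "lock";
    // Lock TTL: the key expires on its own if the holder crashes before unlocking.
    private static final int EXPIRE_SECONDS = 3;

    @Autowired
    protected JedisCluster jedisCluster;

    // Tries once to take the lock; returns true only if this call created the key.
    public boolean lock(String openId) {
        String redisKey = formatRedisKey(openId);
        // SET key value NX EX seconds (five-argument overload from Jedis 2.x).
        String ok = jedisCluster.set(redisKey, LOCK_VALUE, "NX", "EX", EXPIRE_SECONDS);
        return LOCK_SUCCESS.equals(ok);
    }

    // Releases the lock. The delete is unconditional, so it assumes the critical
    // section finishes within EXPIRE_SECONDS; otherwise it could remove a lock
    // that another request has acquired in the meantime.
    public void unlock(String openId) {
        String redisKey = formatRedisKey(openId);
        jedisCluster.del(redisKey);
    }

    private String formatRedisKey(String openId) {
        return "keyPrefix:" + openId;
    }
}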
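One caveat with this class is that unlock deletes the key unconditionally. If the critical section ever runs longer than the 3‑second expiration, a slow holder could delete a lock that another request has since acquired. A common hardening, shown here only as a sketch and not as the article's implementation, is to store a unique token as the lock value and delete the key only when the token still matches. On Redis that compare‑and‑delete is usually done in a short Lua script run with EVAL. To stay self‑contained, the example uses a hypothetical in‑memory store (InMemoryStore) instead of Redis, and all names in it are illustrative.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for Redis. It models only "set if absent"
// and "delete if value matches"; expiry is simulated by calling expire().
class InMemoryStore {
    private final Map<String, String> data = new ConcurrentHashMap<>();

    // Analogous to SET key value NX: succeeds only when the key is absent.
    boolean setIfAbsent(String key, String value) {
        return data.putIfAbsent(key, value) == null;
    }

    // Analogous to a script that deletes the key only if it still holds `expected`.
    boolean deleteIfEquals(String key, String expected) {
        return data.remove(key, expected);
    }

    // Simulates the lock's TTL elapsing.
    void expire(String key) {
        data.remove(key);
    }
}

class TokenLock {
    private final InMemoryStore store;

    TokenLock(InMemoryStore store) {
        this.store = store;
    }

    // Returns a unique token on success, or null if another holder owns the lock.
    String tryLock(String openId) {
        String token = UUID.randomUUID().toString();
        return store.setIfAbsent("lock:" + openId, token) ? token : null;
    }

    // Releases the lock only if this token still owns it; returns false otherwise.
    boolean unlock(String openId, String token) {
        return store.deleteIfEquals("lock:" + openId, token);
    }
}
```

With this variant, a holder whose lock has already expired gets false back from unlock and leaves the new holder's lock in place.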

The original synchronization method was refactored to acquire the Redis lock before accessing the database, releasing it in a finally block. If the lock is not obtained, the request is discarded, preventing concurrent inserts.

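import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class AccountService {
    @Autowired
    private RedisLock redisLock;
    // DAO for t_account; its declaration was missing from the original listing,
    // and the type name AccountDao is assumed.
    @Autowired
    private AccountDao accountDao;

    public void submit(String openId, String localIdentifier) {
        if (!redisLock.lock(openId)) {
            // Another request for this OpenID holds the lock; drop this one.
            return;
        }
        try {
            // Check and write now run under the lock, one request at a time per OpenID.
            Account account = accountDao.find(openId);
            if (account == null) {
                // insert
            } else {
                // update
            }
        } finally {
            // Always release, even if the database call throws.
            redisLock.unlock(openId);
        }
    }
}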

To clean up the existing duplicate rows, a scheduled task runs every minute and removes the duplicates for up to 1,000 OpenIDs per run. Working in small batches keeps the load on the database low. Once cleanup is complete, the task is disabled.
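The batch selection for such a task can be sketched as follows. This is an in‑memory illustration of the logic, not the team's actual job. In a real deployment the grouping would more likely be a SQL query using GROUP BY open_id with HAVING COUNT(*) > 1 and a LIMIT. The article does not say which row of each duplicate set is kept, so keeping the smallest id is an assumption, and the Row type is hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateCleanup {
    // Hypothetical projection of a t_account row: auto-increment id plus open_id.
    static final class Row {
        final long id;
        final String openId;

        Row(long id, String openId) {
            this.id = id;
            this.openId = openId;
        }
    }

    // Returns the ids to delete in one run: for at most maxOpenIds duplicated
    // OpenIDs, every row except the one with the smallest id (assumed keeper).
    static List<Long> idsToDelete(List<Row> rows, int maxOpenIds) {
        // Group ids by OpenID, preserving first-seen order.
        Map<String, List<Long>> idsByOpenId = new LinkedHashMap<>();
        for (Row row : rows) {
            idsByOpenId.computeIfAbsent(row.openId, k -> new ArrayList<>()).add(row.id);
        }
        List<Long> toDelete = new ArrayList<>();
        int processed = 0;
        for (List<Long> ids : idsByOpenId.values()) {
            if (ids.size() < 2) {
                continue;              // not duplicated, nothing to clean
            }
            if (processed == maxOpenIds) {
                break;                 // batch is full; the next run continues
            }
            processed++;
            long keep = Collections.min(ids);
            for (long id : ids) {
                if (id != keep) {
                    toDelete.add(id);
                }
            }
        }
        return toDelete;
    }
}
```

Each scheduled run would call idsToDelete with a limit of 1,000 and delete the returned ids. When a run returns an empty list, the table is clean and the task can be switched off.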

The article concludes that systematic analysis, careful trade‑off evaluation, and a pragmatic distributed‑lock implementation can effectively resolve concurrency‑induced duplicate data problems in backend services.
