How I Overcame Data Import Pitfalls and Canal Crashes in Production

This article recounts a series of real‑world challenges—including Excel‑to‑SQL imports, temporary table handling, Canal service failures, jar conflicts, and deployment mistakes—and explains the step‑by‑step solutions that restored system stability and delivered the required data to operations.

Su San Talks Tech
Su San Talks Tech
Su San Talks Tech
How I Overcame Data Import Pitfalls and Canal Crashes in Production

Preface

Recently an online system incident occurred: users created products that were invisible in the marketplace list and could not be searched.

The issue lasted about half an hour and turned out to be my own mistake.

1. From the Requirement

1.1 Background

We were approaching a major release and needed to match third‑party product attributes with our system attributes in an Excel sheet.

Operations originally planned to map the attributes manually in Excel, but the volume was too large.

They asked me to write a program to map the attributes quickly.

One requirement was that the solution be fast.

1.2 Original Requirement

Operations provided an Excel file containing categories and attribute fields and wanted the program to return matching attribute IDs and names in two new columns.

Unmatched rows would be entered manually into the system later.

1.3 Added Scope

Operations later added two vendors, each with different Excel formats, and some attributes contained range values that needed to be split before matching.

They also needed a special export for vendor 1 after manually removing some rows.

2. The Fastest Solution

I prioritized the tasks and decided to use direct SQL queries for the first three requirements.

Export vendor 1 attribute data.

Export vendor 2 attribute data.

Export special attribute data for vendor 1.

Export data that matches the range values.

To avoid writing four separate programs, I imported the Excel data into a dev database using Navicat Premium’s Import Wizard:

After importing, I exported the table structure and data with Navicat’s Dump SQL File (Structure + Data) to a .sql file.

Because each row generated an insert statement, I combined them into a single statement and compressed the SQL using an online tool ( https://tool.lu/sql) before executing in production.

The first three requirements were completed quickly.

3. An Interlude

After delivering the data, a temporary table remained in production. When the Canal service later crashed, we discovered that the Chinese table name caused garbled characters, leading to an exception.

We deleted the temporary table and restarted Canal, which restored normal operation.

4. Online Issues

The next day Canal crashed again with the error “column size is not match for table xxxx 8 vs 9”. The cause was a cache mismatch; deleting meta.dat and restarting Canal fixed it.

5. Determining the Solution for Requirement 4

Requirement 4 required importing Excel data, matching a range, writing results back to two new columns, and returning the file.

I considered three approaches:

Deploy a Spring Boot executable JAR.

Use an existing XXL‑Job.

Expose an API endpoint.

I chose the API because it allows direct file upload via Postman, processes the stream without storing the file on the server, and returns the result as a downloadable stream.

Security was handled by exposing the endpoint only in the pre‑environment on a separate branch (hotfix) and protecting it with a gateway proxy.

6. Jar Conflict

The API used Alibaba’s easyexcel. In the pre environment the class was missing due to a version mismatch; updating the Maven dependency resolved the issue.

7. Pre‑Environment Network Issue

Later the pre environment’s proxy failed because network bandwidth was saturated, causing requests to be routed to production instead.

After the proxy recovered, the API worked again.

8. Deploying the Wrong Branch

A deployment mistake pushed the master branch instead of hotfix, breaking the API. Redeploying the correct hotfix branch restored functionality.

9. Summary

Key lessons learned:

Never use Chinese table or column names in production.

Delete temporary tables promptly.

Limit Canal to only the tables it needs to monitor.

Ensure library versions are compatible.

Consider proxy failures in pre environments.

Verify the correct branch is deployed.

Delete meta.dat to resolve many Canal issues.

Combine multiple insert statements for better performance.

Use easyexcel for Excel import/export.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

BackendDebuggingSQLCanalExcelDataImport
Su San Talks Tech
Written by

Su San Talks Tech

Su San, former staff at several leading tech companies, is a top creator on Juejin and a premium creator on CSDN, and runs the free coding practice site www.susan.net.cn.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.