Detecting COVID‑19 Public Sentiment with Chinese BERT: Competition Walkthrough

This article outlines the COVID‑19 public sentiment detection competition, detailing the three‑class classification task, data cleaning and exploratory analysis, a Chinese BERT baseline that reaches a 0.726 macro‑F1 score, submission pitfalls, and recommended further reading.

Baobao Algorithm Notes

Feb 29, 2020

Competition Overview

The “Pandemic Public Sentiment Identification” challenge (https://www.datafountain.cn/competitions/423) was organized by the Beijing Economic and Information Technology Bureau and the China Computer Federation’s Big Data Committee. The goal is to support epidemic control and post‑pandemic recovery by applying big data, AI, and cloud‑computing techniques to social‑media data.

Task Description

Participants must classify each social‑media post into one of three sentiment polarities: -1 (negative), 0 (neutral), or 1 (positive). The official evaluation metric is the macro‑averaged F1 score.
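
As a minimal sketch of the metric, macro-F1 is the unweighted mean of the per-class F1 scores, so the rarer classes count as much as the common ones. The function name and the convention of scoring an empty class as 0 are assumptions for illustration; the platform's exact handling of edge cases is not documented here.

```python
def macro_f1(y_true, y_pred, labels=(-1, 0, 1)):
    """Unweighted mean of per-class F1 over the three sentiment labels."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # Assumption: a class with no true and no predicted samples scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy labels, not competition data.
    print(macro_f1([-1, 0, 1, 1], [-1, 0, 1, 0]))
```

Because every class contributes equally, a model that ignores the smallest class loses a full third of the attainable score regardless of its overall accuracy.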

Data Exploration

Initial inspection revealed noisy label values: many unexpected symbols appeared alongside the three valid classes. After removing these corrupt entries, the cleaned label column contains only -1, 0, and 1. A temporal analysis shows posting activity rising rapidly from 2020‑01‑01, with a peak around the Chinese New Year and the Dr. Li Wenliang incident (approximately 2020‑02‑02 to 2020‑02‑10).
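
The cleaning and the per-day count described above might look roughly like the sketch below. The field names `date`, `text`, and `label` and the toy rows are hypothetical; the real dataset's column names and date format are not given in this article.

```python
from collections import Counter

VALID_LABELS = {"-1", "0", "1"}


def clean_rows(rows, label_key="label"):
    """Keep rows whose label is one of the three valid classes, cast to int."""
    cleaned = []
    for row in rows:
        raw = row.get(label_key)
        # Missing or non-string labels are treated as corrupt.
        label = raw.strip() if isinstance(raw, str) else ""
        if label in VALID_LABELS:
            cleaned.append({**row, label_key: int(label)})
    return cleaned


def posts_per_day(rows, date_key="date"):
    """Count posts per calendar day to inspect activity over time."""
    return Counter(row[date_key] for row in rows)


if __name__ == "__main__":
    # Hypothetical rows standing in for the raw training file.
    raw_rows = [
        {"date": "2020-01-01", "text": "a", "label": "0"},
        {"date": "2020-01-01", "text": "b", "label": "-"},
        {"date": "2020-02-07", "text": "c", "label": " 1"},
        {"date": "2020-02-07", "text": "d", "label": None},
    ]
    kept = clean_rows(raw_rows)
    print(len(kept), posts_per_day(kept))
```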

Baseline Model

A baseline was built using the chinese-bert-base transformer model with 5‑fold cross‑validation. It reaches a macro‑F1 score of 0.726. The full training and inference script can be obtained by replying “疫情代码” (“epidemic code”) to the competition backend.
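
The 5-fold scheme can be sketched without the model itself. The code below is an illustration under assumptions, not the author's script: it assumes the folds are stratified by label and that the test-set class probabilities from the five fold models are averaged before taking the most likely class. All function names are hypothetical.

```python
import random
from collections import defaultdict

CLASSES = (-1, 0, 1)


def stratified_kfold(labels, k=5, seed=42):
    """Yield (train_idx, val_idx) pairs with each class spread across folds."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # deal indices round-robin
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, val


def average_fold_predictions(fold_probs):
    """Average per-fold [p_neg, p_neu, p_pos] rows and return class labels."""
    n_folds = len(fold_probs)
    preds = []
    for sample_probs in zip(*fold_probs):
        mean = [sum(col) / n_folds for col in zip(*sample_probs)]
        preds.append(CLASSES[mean.index(max(mean))])
    return preds
```

In a full pipeline, each `(train_idx, val_idx)` pair would drive one training run of the BERT classifier, and the per-fold test probabilities would be passed to `average_fold_predictions`.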

Submission Tip

When generating the submission file, append a trailing space after each sample ID. Omitting this space triggers a platform error and causes the submission to be rejected.
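
A small helper can keep the trailing space from being lost when the file is written. This is a sketch only: the header names `id` and `y` are assumptions, not taken from the competition rules, so adjust them to the official sample submission.

```python
import csv
import io


def write_submission(ids, preds, out, id_col="id", label_col="y"):
    """Write a CSV in which every sample ID ends with exactly one space."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([id_col, label_col])  # hypothetical header names
    for sample_id, pred in zip(ids, preds):
        # Strip first so IDs that already carry the space do not get two.
        writer.writerow([str(sample_id).rstrip() + " ", pred])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_submission(["1001", "1002 "], [-1, 1], buffer)
    print(repr(buffer.getvalue()))
```

To write to disk, pass a file opened with `open(path, "w", newline="", encoding="utf-8")` in place of the `StringIO` buffer. It is worth reopening the saved file in a plain text editor, since spreadsheet tools may trim the space on save.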
