
Absolute Semantic Recognition Competition: Feature Design, Modeling Strategy, and Core Algorithm Insights

This article presents a comprehensive solution to the absolute semantic recognition competition, detailing the problem background, dataset, evaluation metrics, feature engineering, model architecture—including Attention, Capsule, Bi‑GRU, and BERT—and analysis of results and lessons learned.


Abstract

Based on the slides from the final presentation, this article outlines the complete competition solution, focusing on feature design concepts, core modeling ideas, and algorithmic principles, covering Attention, Capsule, Bi-GRU, and BERT.

Competition Background

Semantic recognition is a crucial component of NLP with broad applications and high market value. The task is to distinguish absolute expressions (superlatives such as "best" or "No. 1") from ordinary contextual words in advertising copy, in order to reduce advertising violations.

Dataset

The training set contains 80,000 advertising sentences: roughly 35,000 labeled as violating (label = 1) and 45,000 as non-violating (label = 0).

Goal

Predict whether an advertising sentence violates regulations.

Evaluation Metric

F-score (β = 1) computed from precision and recall.
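For reference, the general F-score with precision P and recall R reduces to the harmonic mean at β = 1:

```latex
% F-beta score; beta = 1 weights precision and recall equally
F_\beta = (1 + \beta^2)\,\frac{P \cdot R}{\beta^2 P + R},
\qquad
F_1 = \frac{2PR}{P + R}
```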

Algorithm Core Design

Feature Engineering

Traditional TF-IDF extracts contextual phrase features around absolute expressions, while word-level and character-level embeddings (Word2Vec, GloVe) capture contextual word features; BERT provides contextualized word vectors.
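A minimal sketch of the TF-IDF side, assuming character n-grams to avoid Chinese word segmentation (the n-gram range and feature cap below are illustrative, not from the article):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Character n-grams sidestep word segmentation; hyperparameters are illustrative.
tfidf = TfidfVectorizer(analyzer="char", ngram_range=(1, 3), max_features=50_000)
X_tfidf = tfidf.fit_transform(sentences)  # `sentences`: list of ad strings (assumed)
```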

Model Architecture

TF-IDF features are fed to logistic regression (LR), LightGBM, and XGBoost for absolute-expression detection.
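As a sketch of the simplest branch (LR shown; the LightGBM and XGBoost variants are analogous), assuming the `X_tfidf` matrix from above and labels `y`:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

clf = LogisticRegression(C=1.0, max_iter=1000)
# Evaluate with the competition metric (F1) via cross-validation.
print(cross_val_score(clf, X_tfidf, y, scoring="f1", cv=5).mean())
```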

Word embeddings feed a TextNN network, in which Bi-GRU layers enrich the sequential representation and Capsule layers add spatial structure.
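A minimal PyTorch sketch of the Bi-GRU backbone (the dimensions and pooling choice are illustrative; the article does not specify them):

```python
import torch
import torch.nn as nn

class BiGRUClassifier(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, hidden=128, num_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):              # x: (batch, seq_len) token ids
        h, _ = self.gru(self.emb(x))   # h: (batch, seq_len, 2 * hidden)
        return self.fc(h.mean(dim=1))  # mean-pool over time, then classify
```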

Attention Mechanism

Attention is introduced into the Bi-GRU and Bi-LSTM branches to capture both global and local dependencies, mitigating the weakness of recurrent models on long-range sequences.
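One common way to realize this is additive attention pooling over the recurrent states; a sketch (the exact attention variant used in the solution is not specified in the article):

```python
import torch
import torch.nn as nn

class AttentionPool(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, 1))

    def forward(self, h):                        # h: (batch, seq_len, dim)
        a = torch.softmax(self.score(h), dim=1)  # one weight per time step
        return (a * h).sum(dim=1)                # weighted sum -> (batch, dim)
```

This pooled vector would replace the plain mean-pool in the Bi-GRU sketch above.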

Capsule TextNN

Capsule networks represent features as vectors rather than scalars, capturing spatial structure; low-level capsules receive the RNN outputs, and dynamic routing aggregates them into high-level semantic capsules that feed the downstream dense layers.
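The routing-by-agreement step, sketched after Sabour et al.'s procedure (the capsule dimensions and how the prediction vectors `u_hat` are built from the RNN outputs are assumptions):

```python
import torch

def squash(s, dim=-1, eps=1e-8):
    # Squash non-linearity: keeps the vector's direction, maps its norm into [0, 1).
    n2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (n2 / (1.0 + n2)) * s / torch.sqrt(n2 + eps)

def dynamic_routing(u_hat, iters=3):
    # u_hat: (batch, in_caps, out_caps, out_dim) prediction vectors.
    b = torch.zeros(u_hat.shape[:3], device=u_hat.device)  # routing logits
    for _ in range(iters):
        c = torch.softmax(b, dim=2)                   # coupling coefficients
        s = (c.unsqueeze(-1) * u_hat).sum(dim=1)      # weighted sum over input caps
        v = squash(s)                                 # (batch, out_caps, out_dim)
        b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)  # raise logits on agreement
    return v
```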

Optimizer Improvement

The Adam optimizer is modified so that L2 weight decay is applied directly to the weights rather than through the adaptive gradient update, decoupling the two (as in AdamW); learning-rate warm-up is used in the first epoch, and the learning rate is reduced in the second.
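The decoupled update from Loshchilov & Hutter, with schedule multiplier η_t, decay coefficient λ, and the usual bias-corrected Adam moments m̂_t and v̂_t:

```latex
% Decoupled weight decay: \lambda\theta_{t-1} sits outside the adaptive term
\theta_t = \theta_{t-1}
  - \eta_t \left( \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
  + \lambda\,\theta_{t-1} \right)
```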

BERT Application

BERT-Base Chinese (12 layers, 768 hidden units, 110M parameters) provides deep bidirectional context; a max-pooling layer over the transformer outputs extracts the most salient features for classification.
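A minimal sketch of this head using the Hugging Face `transformers` library (the linear classifier and the example sentence are illustrative, not from the article):

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")
head = torch.nn.Linear(768, 2)  # binary violating / non-violating classifier

enc = tokenizer(["这款产品效果最好"], return_tensors="pt", padding=True)
states = bert(**enc).last_hidden_state  # (batch, seq_len, 768)
pooled = states.max(dim=1).values       # max-pool over token positions
logits = head(pooled)
```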

Result Analysis

Visualizing the learned weights shows that absolute expressions are detected reliably, while the influence of contextual words varies with their position and ordering across sentences.

Competition Experience Summary

The final solution used a simple voting ensemble, without stacking or blending; data augmentation was not applied due to time constraints.
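For concreteness, a majority-vote sketch over per-model 0/1 predictions (the model names are hypothetical placeholders):

```python
import numpy as np

# Stack hard predictions from the individual models (hypothetical arrays).
preds = np.array([lr_pred, gru_pred, bert_pred])  # shape: (n_models, n_samples)
final = (preds.mean(axis=0) >= 0.5).astype(int)   # majority vote
```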

Interesting Findings

Stage A’s second‑place score came from a BERT+LR hybrid model.

With data augmentation, TF‑IDF+LR alone could achieve top‑3 performance.

Final Remarks

Participating in algorithm competitions helps maintain a competitive, learning-oriented mindset; sharing experiences across structured data, NLP, and computer vision fosters collective progress.

References

Devlin et al., BERT: Pre‑training of Deep Bidirectional Transformers for Language Understanding, 2018.

Loshchilov & Hutter, Fixing Weight Decay Regularization in Adam, 2017.

Raffel & Ellis, Feed‑forward networks with attention can solve some long‑term memory problems, 2015.

Zhao et al., Investigating capsule networks with dynamic routing for text classification, 2018.

He et al., SECaps: A Sequence Enhanced Capsule Model for Charge Prediction, 2019.
