Tag

word segmentation


Model Perspective
Sep 11, 2023 · Artificial Intelligence

Why Chinese Word Segmentation Matters: Techniques, Challenges, and Python Demo

This article explores Chinese word segmentation, illustrates its linguistic nuances with a humorous example, explains the main approaches (dictionary‑based, statistical, and deep‑learning), and provides Python code that demonstrates a simple dictionary algorithm and the popular jieba library.

Chinese NLP · Natural Language Processing · jieba
6 min read
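
The dictionary-based approach mentioned in the summary above can be illustrated with forward maximum matching, one common greedy variant. This is a minimal sketch and not the article's own code: the function name `forward_max_match` and the toy dictionary are assumptions made for illustration.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Greedy left-to-right segmentation: at each position, take the
    longest dictionary word that matches; fall back to one character."""
    if max_len is None:
        # Longest entry in the dictionary bounds the candidate window.
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted so the loop advances.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical toy dictionary, chosen to expose a classic ambiguity.
toy_dictionary = {"研究", "研究生", "生命", "的", "起源"}
print(forward_max_match("研究生命的起源", toy_dictionary))
# → ['研究生', '命', '的', '起源']
```

The greedy match picks 研究生 ("graduate student") and strands 命, whereas the intended reading is 研究 / 生命 / 的 / 起源 ("study the origin of life"). Ambiguities of this kind are one reason statistical and neural segmenters are used alongside dictionaries.
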
DataFunTalk
Jul 30, 2021 · Artificial Intelligence

Fundamentals of Natural Language Processing: Language Models, Smoothing, and Basic Tasks

This article provides a comprehensive overview of natural language processing fundamentals, covering the challenges of language modeling, N‑gram and Markov assumptions, smoothing techniques such as discounting and add‑one, evaluation via perplexity, basic tasks like Chinese word segmentation, subword tokenization, POS tagging, syntactic and semantic parsing, and a range of downstream applications including information extraction, sentiment analysis, question answering, machine translation, and dialogue systems.

AI · NLP · language model
29 min read
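
Add-one smoothing and perplexity, both named in the summary above, can be sketched for a bigram model as follows. This is an illustrative sketch under stated assumptions, not code from the article: the helper names `train_bigram_add_one` and `perplexity` are hypothetical, and the vocabulary is simplified to include the `<s>` and `</s>` padding symbols.

```python
import math
from collections import Counter


def train_bigram_add_one(sentences):
    """Return P(cur | prev) with add-one (Laplace) smoothing."""
    context_counts, bigram_counts = Counter(), Counter()
    vocab = set()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    vocab_size = len(vocab)

    def prob(prev, cur):
        # Add 1 to every bigram count and V to the context count,
        # so unseen bigrams get a small non-zero probability.
        return (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)

    return prob


def perplexity(prob, tokens):
    """Exponentiated average negative log-probability per predicted token."""
    padded = ["<s>"] + tokens + ["</s>"]
    log_sum = sum(math.log(prob(p, c)) for p, c in zip(padded, padded[1:]))
    return math.exp(-log_sum / (len(padded) - 1))


# Hypothetical two-sentence training corpus.
prob = train_bigram_add_one([["a", "b"], ["a", "c"]])
print(prob("<s>", "a"))            # seen bigram
print(prob("b", "c"))              # unseen bigram, still non-zero
print(perplexity(prob, ["a", "b"]))
```

Lower perplexity means the model finds the test sequence less surprising. Add-one is the simplest smoothing scheme and tends to shift too much probability mass to unseen events, which is why discounting methods are usually preferred in practice.
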
php中文网 Courses
Dec 1, 2020 · Backend Development

Using PHP FFI to Call the Cjieba Chinese Word Segmentation Library

This article demonstrates how to use PHP 7.4's FFI to directly call the Cjieba Chinese word‑segmentation library, explains common pitfalls such as uninitialized variables and pointer handling, shows code examples for compiling and running the library, and compares PHP's performance with native C.

Backend · Cjieba · FFI
6 min read
Tencent Cloud Developer
Apr 24, 2019 · Artificial Intelligence

Chinese Text Sentiment Classification Using Multi‑layer LSTM: Data Preparation, Model Architecture, and Business Applications

The article details a practical workflow for Chinese sentiment classification in Tencent’s Goose Man product, covering data preparation, word‑segmentation challenges, a six‑layer multi‑LSTM architecture with word embeddings, training results that reach roughly 96% accuracy, and deployment for automatic detection of misleading and high‑impact user reviews.

Chinese NLP · Keras · LSTM
23 min read