Tagged articles

jieba

31 articles · Page 1 of 1

Jun 30, 2026 · Artificial Intelligence

NLP Study Notes: 4 Essential Steps for Preprocessing Chinese Text Corpora

This article walks through the four core steps of Chinese NLP corpus preparation—collecting data, cleaning it with regex and encoding detection, tokenizing using dictionary‑based or statistical methods such as jieba, HMM and CRF, and finally standardizing with stop‑word removal, vocabulary building and one‑hot encoding—while illustrating each step with concrete code snippets and practical examples.

CRF, Chinese, NLP

0 likes · 12 min read

JavaEdge

Mar 12, 2025 · Artificial Intelligence

How to Analyze Chinese Sentiment Text Data: From Stats to Word Clouds

This article guides Java developers through a complete Chinese sentiment‑analysis dataset exploration, covering label distribution, sentence length statistics, vocabulary counts, adjective extraction, and visual word‑cloud generation using Python libraries such as pandas, seaborn, jieba, and wordcloud.

Data Visualization, NLP, Python

0 likes · 10 min read

Open Source Tech Hub

Feb 20, 2025 · Backend Development

Build and Use jieba-php: Chinese Word Segmentation in PHP via Rust

This guide explains how to install the jieba-php extension, a Rust‑based Chinese word segmentation library for PHP, by listing the required dependencies, showing the cargo build steps, demonstrating runtime commands, describing the provided API, and offering a complete usage example.

chinese segmentation, jieba

0 likes · 2 min read

Python Programming Learning Circle

Jan 9, 2025 · Fundamentals

Python Data Preprocessing and Visualization of Jay Chou Lyrics: From JSON to Word Cloud

This tutorial demonstrates how to convert a JSON lyric database into Excel, filter Jay Chou songs, perform Chinese word segmentation with Jieba, compute word frequencies, and create visualizations such as word clouds using Python code and online tools.

Data preprocessing, Pandas, Python

0 likes · 9 min read

Test Development Learning Exchange

Nov 27, 2024 · Artificial Intelligence

Basic Natural Language Processing: Text Preprocessing and TF‑IDF with Python

This tutorial introduces fundamental natural language processing techniques, covering text preprocessing steps such as tokenization and stop‑word removal, followed by TF‑IDF feature extraction, and provides complete Python code examples to practice these concepts on a sample dataset.

NLP, Python, Scikit-learn

0 likes · 5 min read

Infra Learning Club

Oct 31, 2024 · Artificial Intelligence

What Is a Token in Large Language Models?

The article explains that a token is the unit processed by large language models, describes three common tokenizer methods—word‑level, character‑level, and sub‑word level—with English and Chinese examples, discusses their advantages and limitations, and shows how OpenAI’s tokenizer varies across model versions.

NLP, Token, Tokenization

0 likes · 5 min read

Python Crawling & Data Mining

Sep 27, 2024 · Artificial Intelligence

Transform Crawled CSV Text into Word Clouds and Sentiment Analysis with Python

Learn step‑by‑step how to extract text from a CSV generated by a Python web crawler, clean it with stop‑words, create a word‑cloud visualization, compute word frequencies, and perform sentiment analysis using jieba and SnowNLP, with all code snippets provided.

Python, Sentiment Analysis, SnowNLP

0 likes · 11 min read

Python Programming Learning Circle

Dec 4, 2023 · Artificial Intelligence

Processing Chinese Lyrics Data with Python: From JSON Extraction to Word Cloud Visualization

This tutorial demonstrates how to preprocess a Chinese lyrics JSON dataset, extract Jay Chou's songs using Python, perform word segmentation with Jieba, compute word frequencies, and create visualizations such as word clouds both programmatically and with online tools.

Data preprocessing, NLP, jieba

0 likes · 9 min read

Python Programming Learning Circle

Dec 1, 2023 · Artificial Intelligence

Generating Word Cloud and Pie Chart from a News Article Using Python

This article demonstrates how to scrape a news webpage with Python, extract and segment its Chinese text using jieba, count word frequencies, and visualize the top ten terms as a word cloud and a pie chart with pyecharts.

Pyecharts, Python, Web Scraping

0 likes · 3 min read

Model Perspective

Sep 11, 2023 · Artificial Intelligence

Why Chinese Word Segmentation Matters: Techniques, Challenges, and Python Demo

This article explores Chinese word segmentation, illustrating its linguistic nuances with a humorous example, explains key methods—including dictionary‑based, statistical, and deep‑learning approaches—and provides Python code using a simple dictionary algorithm and the popular jieba library to demonstrate practical implementation.

Chinese NLP, Python, Word Segmentation

0 likes · 6 min read

Python Programming Learning Circle

Jun 19, 2023 · Fundamentals

Generating Word Cloud and Pie Chart from News Articles Using Python

This tutorial explains how to scrape a news article with Python, segment Chinese text, count word frequencies, and visualize the top ten words using a word cloud and a pie chart, providing complete code and sample results.

Pyecharts, Python, data-visualization

0 likes · 3 min read

Python Crawling & Data Mining

Jun 9, 2023 · Artificial Intelligence

Extract Word Document Keywords, Frequencies, and POS with Python

This guide shows how to use Python libraries such as docx, jieba, NLTK, and openpyxl to read a Word file, perform tokenization, compute word frequencies, assign part‑of‑speech tags, and export the results into an Excel spreadsheet, including troubleshooting tips for common errors.

NLP, NLTK, Word

0 likes · 7 min read

Python Crawling & Data Mining

Mar 27, 2023 · Fundamentals

How to Compute Word Frequencies in Python with Jieba and Export to Excel

This article demonstrates how to calculate word frequencies in Chinese text using Python's jieba library, collections.Counter, and xlwt to export results to Excel, providing complete code examples and alternative approaches for effective text analysis.

Word Frequency, jieba, text processing

0 likes · 5 min read

Python Crawling & Data Mining

Feb 9, 2022 · Artificial Intelligence

How to Turn Crawled CSV Data into Word Clouds and Sentiment Scores with Python

This guide walks you through extracting text from a CSV obtained via Python web scraping, cleaning it with stop‑words, generating a word‑cloud, performing jieba tokenization and frequency analysis, and finally applying SnowNLP for sentiment scoring, with all code snippets and data links provided.

Sentiment Analysis, SnowNLP, Web Scraping

0 likes · 12 min read

Python Programming Learning Circle

Dec 16, 2021 · Artificial Intelligence

Part-of-Speech Tagging with Jieba in Python

This article explains how to perform Chinese part-of-speech tagging using the jieba.posseg library in Python, including loading stop words, extracting article content via Newspaper3k, applying precise mode segmentation, filtering, and presenting results in a pandas DataFrame.

NLP, POS tagging, Python

0 likes · 3 min read

Python Crawling & Data Mining

Dec 4, 2021 · Fundamentals

How to Scrape Web Text with Python and Visualize Word Frequencies

This article demonstrates how to use Python's requests and BeautifulSoup to crawl text from a news site, process it with collections, numpy, and jieba for word‑frequency analysis, and then visualize the top terms using pyecharts, providing complete code snippets and explanations.

Data Visualization, Pyecharts, Web Scraping

0 likes · 7 min read

Python Programming Learning Circle

Nov 15, 2021 · Backend Development

Python Web Project: Visualizing Hot Search Rankings and Domestic COVID‑19 Cases with Flask, Web Scraping, and ECharts

This report describes a Python‑based web application built with Flask that scrapes hot‑search data from Weibo, Baidu and Zhihu, processes it using jieba and other libraries, and visualizes the results together with domestic COVID‑19 statistics using ECharts on a responsive front‑end page.

data-visualization, frontend, jieba

0 likes · 7 min read

Python Crawling & Data Mining

Jun 16, 2021 · Artificial Intelligence

Master Chinese Text Segmentation with jieba: Installation, Modes, and Advanced Tricks

This tutorial walks you through installing the jieba Python library, explains its three segmentation modes—precise, full, and search—demonstrates how to add or delete words, manage custom dictionaries, handle stop words, perform weight analysis, adjust word frequencies, and retrieve token positions, all with clear code examples and visual output.

NLP, Python, Tokenization

0 likes · 10 min read

Python Programming Learning Circle

Mar 30, 2020 · Fundamentals

Creating Chinese Word Clouds with Python: Using Jieba and WordCloud

This tutorial explains how to install and use the Jieba segmentation library and the WordCloud package in Python to process Chinese text, customize dictionaries and stopwords, and generate visually appealing word cloud images based on a mask picture.

TextProcessing, jieba, tutorial

0 likes · 7 min read

Python Crawling & Data Mining

Sep 20, 2019 · Artificial Intelligence

How to Scrape Douban Movie Reviews and Create a Chinese Word Cloud with Python

This tutorial shows how to log in to Douban with Python, scrape short comments for the animated film “Nezha”, process the text using jieba for Chinese word segmentation, and generate a visual word cloud that highlights the most frequent audience sentiments.

data-mining, douban, jieba

0 likes · 9 min read

MaGe Linux Operations

Jul 11, 2019 · Backend Development

How to Scrape JD Product Reviews and Create Word Clouds with Python

This tutorial walks you through analyzing JD product pages, extracting comment data via requests with proper headers, handling pagination, saving results, cleaning text using jieba, and visualizing frequent terms as a word cloud, all illustrated with step‑by‑step screenshots and code snippets.

Python, jieba, requests

0 likes · 10 min read

Python Crawling & Data Mining

Dec 20, 2018 · Artificial Intelligence

Create Stunning Word Cloud Visualizations from WeChat Moments with Python

This tutorial walks you through using Python's wordcloud and jieba libraries to process WeChat Moments data, generate frequency statistics, and produce attractive, shape‑based word cloud images, complete with code snippets and visual examples.

Python, jieba, text-mining

0 likes · 6 min read

Python Crawling & Data Mining

Nov 30, 2018 · Fundamentals

Create Stunning Word Clouds from WeChat Moments with Python: Step‑by‑Step Guide

Learn how to extract WeChat Moments data using Python web scraping, process the Chinese text with jieba, generate and customize word clouds with the wordcloud library, and enhance visual appeal using WordArt, all illustrated with detailed code snippets and screenshots.

Python, jieba

0 likes · 5 min read

MaGe Linux Operations

Nov 13, 2018 · Fundamentals

Create a One Piece‑Inspired Word Cloud with Python, Jupyter, and WordCloud

This tutorial guides programmers through installing required Python libraries, extracting keywords from One Piece song lyrics using jieba, and generating a Chopper‑shaped word cloud with the wordcloud package in a Jupyter notebook, complete with code examples and visual results.

Data Visualization, Jupyter, jieba

0 likes · 3 min read

MaGe Linux Operations

Sep 12, 2018 · Fundamentals

How to Install, Fix, and Use Python WordCloud – A Complete Guide

This tutorial walks you through installing the WordCloud and jieba packages, resolving common Windows compilation errors, handling Chinese font encoding issues, and creating both basic and image‑masked word clouds with practical code examples and screenshots.

Installation, encoding, jieba

0 likes · 5 min read

MaGe Linux Operations

Aug 18, 2018 · Artificial Intelligence

Create a Heart‑Shaped WeChat Word Cloud with Python: Step‑by‑Step Guide

This article explains how to build a Python tool that monitors real‑time WeChat chats, uses wxpy to capture messages, applies jieba for Chinese word segmentation, and generates a heart‑shaped word cloud with the wordcloud library, complete with code examples and setup instructions.

WeChat, jieba, word cloud

0 likes · 6 min read

Python Crawling & Data Mining

May 15, 2018 · Fundamentals

Create Stunning Word Clouds from WeChat Moments Using Python and Jieba

This tutorial walks you through extracting WeChat Moments with a Python web scraper, processing the Chinese text using jieba, and visualizing the most frequent words as beautiful word clouds with customizable shapes and fonts.

Data Visualization, Python, Web Scraping

0 likes · 6 min read

MaGe Linux Operations

Mar 22, 2018 · Artificial Intelligence

Mapping Character Relationships in 'Heavenly Sword and Dragon Slaying' with Jieba, Word2Vec & NetworkX

This article demonstrates how to combine Jieba segmentation, Word2Vec embeddings, and NetworkX graph visualization to extract and analyze character relationships from the Chinese novel "Heavenly Sword and Dragon Slaying," detailing data preparation, model training, entity matrix construction, and network graph generation.

Graph Visualization, NLP, Python

0 likes · 10 min read

MaGe Linux Operations

Jun 17, 2017 · Artificial Intelligence

Create a One Piece‑Inspired Word Cloud with Python, Jieba, and WordCloud

This tutorial guides readers with basic programming experience through using Python 3 in Jupyter to extract keywords from One Piece lyrics with jieba, and then generate a Chopper‑shaped word cloud using matplotlib and the wordcloud library, covering required dependencies and step‑by‑step code.

Data Visualization, Jupyter, Python

0 likes · 3 min read

MaGe Linux Operations

Jun 2, 2017 · Artificial Intelligence

Create Chinese Word Clouds in Minutes with Python’s jieba and wordcloud

This tutorial shows how to use the Python libraries jieba and wordcloud to quickly generate Chinese word‑cloud visualizations, explaining required dependencies, code structure, and usage tips so readers can produce word clouds from any Chinese text in just minutes.

Chinese text, Python, jieba

0 likes · 2 min read

MaGe Linux Operations

Apr 9, 2017 · Artificial Intelligence

How to Install and Fix WordCloud in Python for Chinese Text Visualization

This guide walks you through installing the Python WordCloud library, resolving common compilation errors, handling Chinese font encoding issues, and creating basic and image‑masked word clouds, complete with code snippets and troubleshooting tips for smooth visualization of Chinese text data.

Chinese NLP, Python, jieba

0 likes · 4 min read
