Tag

data preprocessing


Python Programming Learning Circle
Jan 9, 2025 · Fundamentals

Python Data Preprocessing and Visualization of Jay Chou Lyrics: From JSON to Word Cloud

This tutorial demonstrates how to convert a JSON lyric database into Excel, filter Jay Chou songs, perform Chinese word segmentation with Jieba, compute word frequencies, and create visualizations such as word clouds using Python code and online tools.

Visualization · WordCloud · data preprocessing
0 likes · 9 min read
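The filtering and counting steps named in this summary can be sketched roughly as follows. The field names `singer` and `lyric`, the sample records, and the whitespace-based `segment` helper are assumptions for illustration only; the tutorial segments Chinese text with Jieba, so a call such as `jieba.lcut` would take the place of the placeholder tokenizer.

```python
import json
from collections import Counter

# Hypothetical records; the real database's field names may differ.
RAW = json.dumps([
    {"singer": "周杰伦", "lyric": "雨 夜 雨 风"},
    {"singer": "周杰伦", "lyric": "风 雨 月"},
    {"singer": "other", "lyric": "la la la"},
], ensure_ascii=False)


def segment(text):
    """Placeholder tokenizer: splits on whitespace (stand-in for Jieba)."""
    return text.split()


def word_frequencies(raw_json, singer):
    """Count tokens across all lyrics credited to the given singer."""
    songs = [s for s in json.loads(raw_json) if s["singer"] == singer]
    counts = Counter()
    for song in songs:
        counts.update(segment(song["lyric"]))
    return counts


freq = word_frequencies(RAW, "周杰伦")
print(freq.most_common(2))
```

The resulting `Counter` is the kind of word-to-count mapping that word cloud tools typically accept as input.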
Test Development Learning Exchange
Dec 5, 2024 · Artificial Intelligence

End-to-End House Prices Prediction Project: Data Collection, Preprocessing, Modeling, Evaluation, and Deployment with Python

This tutorial walks through a complete house price prediction project, covering data collection from Kaggle, preprocessing with pandas and scikit‑learn, model training using RandomForestRegressor, evaluation, and deployment of a Flask API for real‑time predictions, providing full code examples.

Flask · Model Deployment · Python
0 likes · 9 min read
Test Development Learning Exchange
Nov 26, 2024 · Artificial Intelligence

Comprehensive Python Tutorial for Data Preprocessing, Feature Engineering, Model Training, Evaluation, and Deployment

This tutorial consolidates the first ten days of the learning series. It covers data preprocessing, feature engineering, model training with linear regression, decision tree, and random forest, model evaluation using cross‑validation, and saving and loading the best model, all illustrated with complete Python code examples.

Feature Engineering · Python · data preprocessing
0 likes · 9 min read
Test Development Learning Exchange
Nov 21, 2024 · Artificial Intelligence

Data Preprocessing: Standardization, Normalization, and Missing Value Imputation with Python

This tutorial demonstrates how to perform essential data preprocessing techniques—including standardization, min‑max normalization, and various missing‑value imputation methods—using pandas and scikit‑learn in Python, providing code examples and explanations to help you prepare datasets for machine‑learning models.

Python · Standardization · data preprocessing
0 likes · 6 min read
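As a rough illustration of the three techniques this summary names, here is a dependency-free sketch. It uses only the standard library rather than the pandas and scikit‑learn calls the tutorial relies on, and the `ages` column and helper names are hypothetical.

```python
from statistics import mean, pstdev


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Z-score scaling: subtract the mean, divide by the population std."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, None, 30, 40]      # hypothetical column with one missing value
filled = impute_mean(ages)     # the gap is filled with the observed mean
scaled = min_max(filled)       # smallest value maps to 0, largest to 1
z = standardize(filled)        # result has zero mean and unit variance
```

Both scalers divide by the spread of the data, so a constant column would need separate handling to avoid division by zero.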
Test Development Learning Exchange
Oct 29, 2024 · Artificial Intelligence

Data Preprocessing and Modeling with Pandas and Scikit‑learn

This guide walks through using Pandas for data cleaning, feature engineering, and preparation, then demonstrates building, evaluating, and persisting a machine‑learning model with Scikit‑learn's pipeline and RandomForestClassifier in Python.

Python · data preprocessing · machine learning
0 likes · 5 min read
Rare Earth Juejin Tech Community
Jul 7, 2024 · Artificial Intelligence

Daily and Sports Activities Dataset: Description, Preprocessing Pipeline, and CNN Classification Results

This article introduces the Daily_and_Sports_Activities sensor dataset, details its structure and characteristics, provides a Python preprocessing pipeline with sliding‑window segmentation and Z‑score normalization, and reports CNN training results achieving 87.93% accuracy on activity classification.

CNN · UCI · data preprocessing
0 likes · 9 min read
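Sliding‑window segmentation followed by Z‑score normalization might look roughly like the sketch below. The window size, step, and sample readings are hypothetical, and normalizing each window on its own is an assumption; the article may instead normalize per sensor channel over the whole dataset.

```python
from statistics import mean, pstdev


def sliding_windows(signal, size, step):
    """Split a sequence into fixed-size windows, advancing by `step`."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def z_score(window):
    """Normalize one window to zero mean and unit variance."""
    mu, sigma = mean(window), pstdev(window)
    if sigma == 0:
        # A flat window carries no variation; map it to zeros.
        return [0.0 for _ in window]
    return [(x - mu) / sigma for x in window]


readings = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical single-channel signal
windows = sliding_windows(readings, size=4, step=2)
normalized = [z_score(w) for w in windows]
```

A step smaller than the window size produces overlapping windows, which yields more training segments from the same recording.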
php中文网 Courses
Jun 13, 2024 · Artificial Intelligence

Using PHP for Data Dimensionality Reduction and Feature Extraction

This article explains the importance of data dimensionality reduction and feature extraction in machine learning, and provides a step‑by‑step guide with PHP code examples—including library installation, data preprocessing, PCA‑based reduction, and feature selection techniques—demonstrating how to handle large datasets efficiently.

Feature Extraction · PCA · PHP
0 likes · 6 min read
Python Programming Learning Circle
Apr 27, 2024 · Fundamentals

Data Cleaning Techniques in Python: 21 Practical Examples and Code

This article provides a comprehensive guide to data cleaning in Python, covering common data issues, methods for handling missing values, duplicates, categorical inconsistencies, and text normalization, illustrated with 21 detailed code examples using pandas and matplotlib.

Python · data analysis · data cleaning
0 likes · 16 min read
HelloTech
Mar 14, 2024 · Artificial Intelligence

Feature Engineering: Concepts, Methods, and Automation

Feature engineering transforms existing data into new predictive variables through manual analysis or automated pipelines, encompassing single‑variable encoding, pairwise arithmetic, group‑statistics, multi‑variable combinations, time‑series and text derivations, with tools like Deep Feature Synthesis and beam‑search to generate and select useful features.

Feature Engineering · automated features · data preprocessing
0 likes · 17 min read
Test Development Learning Exchange
Jan 23, 2024 · Fundamentals

Common Data Preprocessing Techniques with Python Code Examples

This article presents ten essential data preprocessing methods—including handling missing values, type conversion, standardization, encoding, smoothing, outlier treatment, text cleaning, word frequency counting, sentiment analysis, and topic modeling—each explained with clear Python code snippets.

Python · data cleaning · data preprocessing
0 likes · 9 min read
Test Development Learning Exchange
Dec 4, 2023 · Fundamentals

Common Data Cleaning Techniques with Python Code Examples

This article presents a comprehensive collection of Python code snippets demonstrating essential data cleaning methods—including handling missing values, outlier detection, type conversion, formatting, duplicate removal, normalization, one‑hot encoding, text preprocessing, and dataset merging—providing practical guidance for preparing data for analysis or machine‑learning tasks.

Python · data cleaning · data preprocessing
0 likes · 7 min read
Python Programming Learning Circle
Dec 4, 2023 · Artificial Intelligence

Processing Chinese Lyrics Data with Python: From JSON Extraction to Word Cloud Visualization

This tutorial demonstrates how to preprocess a Chinese lyrics JSON dataset, extract Jay Chou's songs using Python, perform word segmentation with Jieba, compute word frequencies, and create visualizations such as word clouds both programmatically and with online tools.

NLP · Visualization · WordCloud
0 likes · 9 min read
Test Development Learning Exchange
Nov 11, 2023 · Artificial Intelligence

Python Techniques for Comprehensive Text Data Analysis

This guide demonstrates how to use Python for end‑to‑end text data analysis, covering preprocessing, word‑frequency visualization, classification, sentiment detection, similarity measurement, entity recognition, keyword extraction, summarization, translation, and generation with clear code examples.

NLP · Python · data preprocessing
0 likes · 6 min read
Test Development Learning Exchange
Oct 10, 2023 · Artificial Intelligence

Feature Engineering Techniques for Various Business Scenarios with Python Code Examples

This article presents practical feature‑engineering methods for ten common business domains, explaining the purpose of each feature, the extraction technique, and providing ready‑to‑run Python code snippets to help build more accurate predictive models.

Feature Engineering · business analytics · data preprocessing
0 likes · 7 min read
政采云技术
Sep 21, 2023 · Fundamentals

RFM Analysis: A Comprehensive Guide to Customer Segmentation and Marketing Optimization

RFM analysis is a powerful tool for understanding customer value and behavior, enabling businesses to optimize marketing strategies, improve customer satisfaction, and increase business revenue through data-driven customer segmentation.

Customer Lifecycle Management · Data Analytics · Marketing Optimization
0 likes · 16 min read
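RFM stands for recency, frequency, and monetary value. A minimal sketch of deriving the three measures from an order log is shown below; the `ORDERS` records, the tuple layout, and the reference date are hypothetical, and the guide's own scoring and segmentation rules are not reproduced here.

```python
from datetime import date

# Hypothetical order log: (customer_id, order_date, amount)
ORDERS = [
    ("a", date(2023, 9, 1), 120.0),
    ("a", date(2023, 9, 15), 80.0),
    ("b", date(2023, 6, 10), 300.0),
]


def rfm(orders, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    table = {}
    for customer, day, amount in orders:
        # Track the latest order date, the order count, and total spend.
        last, count, total = table.get(customer, (day, 0, 0.0))
        table[customer] = (max(last, day), count + 1, total + amount)
    return {
        c: ((today - last).days, count, total)
        for c, (last, count, total) in table.items()
    }


scores = rfm(ORDERS, today=date(2023, 9, 21))
```

Segmentation would then bucket customers by these three numbers, for example by ranking each measure into quantiles.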
DaTaobao Tech
Sep 11, 2023 · Artificial Intelligence

Large Language Model Upgrade Paths and Architecture Selection

This article analyzes upgrade paths of major LLMs—ChatGLM, LLaMA, Baichuan—detailing performance, context length, and architectural changes, then examines essential capabilities, data cleaning, tokenizer and attention design, and offers practical guidance for balanced scaling and efficient model construction.

Baichuan · ChatGLM · LLM architecture
0 likes · 32 min read
Test Development Learning Exchange
Aug 20, 2023 · Fundamentals

Key Steps and Techniques for Data Cleaning with Python Pandas

This article outlines essential data cleaning steps—including handling missing and duplicate values, type conversion, outlier treatment, text processing, standardization, sampling, and merging—providing concise Python pandas code snippets for each technique to improve data quality for analysis.

Big Data · Python · data cleaning
0 likes · 5 min read
Python Programming Learning Circle
Jun 17, 2023 · Big Data

Accelerating Python Data Preprocessing with Multiprocessing in Three Lines of Code

This article demonstrates how to use Python's concurrent.futures module to parallelize image resizing, turning a single‑process script into a multi‑core solution with just three additional lines of code, achieving up to a six‑fold speed‑up on typical CPUs.

Parallel Computing · Python · concurrent.futures
0 likes · 7 min read
Model Perspective
Mar 3, 2023 · Fundamentals

Unlock Hidden Patterns: A Practical Guide to Factor Analysis with Python

Factor analysis, a statistical technique for uncovering underlying common factors among variables, is explained alongside its distinction from PCA, detailed procedural steps, adequacy tests, and a hands‑on Python implementation using the factor_analyzer library with visualizations and factor rotation methods.

Python · data preprocessing · dimensionality reduction
0 likes · 10 min read
Python Programming Learning Circle
Mar 3, 2023 · Backend Development

Accelerating Image Pre‑processing in Python with a Three‑Line Multiprocessing Trick

This article demonstrates how to boost the speed of image‑preprocessing tasks in Python by replacing a conventional single‑process loop with a three‑line concurrent.futures ProcessPoolExecutor implementation, achieving up to six‑fold performance gains on multi‑core CPUs.

Python · concurrent.futures · data preprocessing
0 likes · 7 min read
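The pattern this summary describes, swapping a plain loop for `concurrent.futures.ProcessPoolExecutor`, can be sketched as below. The `preprocess` function is a hypothetical CPU‑bound stand‑in for resizing one image, and `preprocess_all` and its `executor_cls` parameter are illustrative names rather than the article's code.

```python
from concurrent.futures import ProcessPoolExecutor


def preprocess(item):
    """Hypothetical CPU-bound step standing in for resizing one image."""
    return sum(i * i for i in range(item))


def preprocess_all(items, executor_cls=ProcessPoolExecutor):
    """Run `preprocess` over items in a worker pool, preserving input order."""
    with executor_cls() as executor:
        # executor.map distributes items across workers and yields
        # results in the same order as the inputs.
        return list(executor.map(preprocess, items))
```

When using the default process pool, the call to `preprocess_all` belongs under an `if __name__ == "__main__":` guard, since platforms that spawn worker processes re-import the main module. The actual speed‑up depends on the number of cores and on how expensive each item is relative to the cost of passing data between processes.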