Tag

Feature Engineering

1 view collected around this technical thread.

JD Retail Technology
Jun 10, 2025 · Artificial Intelligence

How JD Builds a Scalable AI‑Powered Recommendation Data System with Flink

This article explains JD's complex recommendation system data pipeline—from indexing, sampling, and feature engineering to explainability and real‑time metrics—highlighting challenges such as data consistency, latency, and the use of Flink for massive, low‑latency processing.

Big Data · Feature Engineering · Flink
0 likes · 23 min read
Architect
May 31, 2025 · Artificial Intelligence

Edge Intelligence Implementation in the Vivo Official App: Architecture, Feature Engineering, and Model Deployment

The article details how edge intelligence is applied to the Vivo official app to improve product recommendation on the smart‑hardware floor by abstracting the problem, designing feature engineering pipelines, training TensorFlow models, converting them to TFLite, and deploying inference on mobile devices, while also covering monitoring and performance considerations.

Feature Engineering · Model Deployment · TensorFlow Lite
0 likes · 19 min read
Python Programming Learning Circle
Mar 24, 2025 · Artificial Intelligence

Comprehensive List of Aggregation Functions and Custom Feature Engineering Utilities for Python

This article presents a detailed collection of built‑in pandas aggregation methods and numerous custom Python functions for time‑series feature engineering, offering beginners practical tools to enhance data preprocessing and model performance in machine‑learning projects.

Feature Engineering · aggregation functions · data science
0 likes · 10 min read
Cognitive Technology Team
Mar 17, 2025 · Artificial Intelligence

Leveraging Large Language Models to Optimize Traditional Machine Learning Pipelines

Large language models can assist and enhance each stage of traditional machine learning—including sample generation, data cleaning, feature engineering, model selection, hyper‑parameter tuning, and workflow automation—by generating synthetic data, refining features, selecting models, and orchestrating pipelines, though challenges such as bias, privacy, and noise remain.

Data Generation · Feature Engineering · Hyperparameter Optimization
0 likes · 11 min read
Cognitive Technology Team
Mar 6, 2025 · Artificial Intelligence

From Traditional Machine Learning to Deep Learning: A Comprehensive Guide to Algorithms, Feature Engineering, and Model Training

This article provides a step‑by‑step tutorial that walks readers through the fundamentals of traditional machine‑learning algorithms, feature‑engineering techniques, model training pipelines, evaluation metrics, and then advances to deep‑learning concepts such as MLPs, activation functions, transformers, and modern recommendation‑system models.

Feature Engineering · Python · Recommendation systems
0 likes · 63 min read
Airbnb Technology Team
Jan 24, 2025 · Artificial Intelligence

Chronon — An Open-Source Framework for Production-Level Feature Engineering in Machine Learning

Chronon is an open‑source framework that centralizes feature definitions to guarantee training‑inference consistency, eliminates complex ETL pipelines, and supports real‑time and batch processing across diverse data sources, cutting feature‑development cycles from months to under a week, as demonstrated by Airbnb’s 40,000‑feature deployment.

Chronon · Feature Engineering · Hive
0 likes · 10 min read
Bilibili Tech
Dec 27, 2024 · Big Data

Consistency Architecture for Bilibili Recommendation Model Data Flow

The article outlines Bilibili’s revamped recommendation data‑flow architecture, which eliminates timing and calculation inconsistencies by snapshotting online features, unifying feature computation in a single C++ library accessed via JNI, and orchestrating label‑join and sample extraction through near‑line Kafka/Flink pipelines; it also reports further performance gains and planned Iceberg‑based extensions.

Big Data · Feature Engineering · Flink
0 likes · 12 min read
Tencent Advertising Technology
Dec 6, 2024 · Big Data

Building a High‑Performance Advertising Feature Data Lake with Apache Iceberg at Tencent

Tencent's advertising team replaced a traditional HDFS‑Hive warehouse with an Apache Iceberg‑based data lake, adding primary‑key tables, multi‑stream merging, adaptive compaction, and Spark storage‑partitioned join (SPJ) optimizations to achieve minute‑level feature update latency, 10× back‑fill speed, and up to 60% storage savings.

Big Data · CDC · Compaction
0 likes · 25 min read
Test Development Learning Exchange
Nov 26, 2024 · Artificial Intelligence

Comprehensive Python Tutorial for Data Preprocessing, Feature Engineering, Model Training, Evaluation, and Deployment

This tutorial walks through consolidating the first ten days of learning by covering data preprocessing, feature engineering, model training with linear regression, decision tree, and random forest, model evaluation using cross‑validation, and finally saving and loading the best model, all illustrated with complete Python code examples.

Feature Engineering · Python · data preprocessing
0 likes · 9 min read
Test Development Learning Exchange
Nov 22, 2024 · Artificial Intelligence

Feature Selection and Feature Engineering with Python (Filter, Wrapper, and Embedded Methods)

This tutorial teaches how to perform feature selection using filter, wrapper, and embedded methods and how to construct new features such as interaction, non‑linear, binned, and binary features with Python's pandas and scikit‑learn libraries.

Feature Engineering · Python · feature selection
0 likes · 7 min read
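
The filter, wrapper, and embedded approaches named in the entry above can be sketched with scikit-learn as follows. This is a minimal illustration, not the tutorial's own code: the Iris dataset, the choice of two features, and the specific estimators are all assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

# Assumed example data: the four-feature Iris dataset.
X, y = load_iris(return_X_y=True)

# Filter: score each feature independently with the ANOVA F-test, keep the top 2.
filter_selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)
filter_mask = filter_selector.get_support()

# Wrapper: recursively drop the weakest feature according to a fitted model.
wrapper_selector = RFE(
    LogisticRegression(max_iter=1000), n_features_to_select=2
).fit(X, y)
wrapper_mask = wrapper_selector.get_support()

# Embedded: keep features whose importance, learned during training,
# is at least the mean importance (the default threshold for this estimator).
embedded_selector = SelectFromModel(
    RandomForestClassifier(n_estimators=50, random_state=0)
).fit(X, y)
embedded_mask = embedded_selector.get_support()

print("filter:  ", filter_mask)
print("wrapper: ", wrapper_mask)
print("embedded:", embedded_mask)
```

Each mask is a boolean array over the original columns, so the three methods can be compared directly or passed to `X[:, mask]` to build the reduced matrix.
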
DataFunSummit
Nov 20, 2024 · Artificial Intelligence

How Data Lakes Empower AI: Expert Insights on Feature Management, Columnar Storage, and Vector Formats

In a panel discussion, experts explain how data‑lake‑warehouse integration, columnar storage, table formats like Apache Iceberg, and emerging variant types enable efficient feature engineering, support large‑language‑model workloads, and provide flexible vector storage, thereby driving the evolution of AI from traditional ML to the GenAI era.

Apache Iceberg · Artificial Intelligence · Feature Engineering
0 likes · 6 min read
DataFunTalk
Nov 6, 2024 · Big Data

How Data Lakes Empower AI: Insights from Industry Experts

In a panel discussion, experts from Kuaishou, Ping An, and Datastrato explain how data lake architectures, columnar storage, table formats like Apache Iceberg, and vector‑enabled lake formats are enhancing feature management, supporting generative AI workloads, and accelerating machine‑learning pipelines.

AI · Apache Iceberg · Big Data
0 likes · 6 min read
Test Development Learning Exchange
Oct 28, 2024 · Big Data

Data Preprocessing with Pandas: A Comprehensive Guide

This article provides a comprehensive guide to data preprocessing using Pandas, covering essential steps like data cleaning, feature engineering, and data transformation for machine learning projects.

Categorical Encoding · Dataset Splitting · Feature Engineering
0 likes · 5 min read
DataFunSummit
Oct 11, 2024 · Artificial Intelligence

Feature Production and Component Modeling in the Intelligent Era: From Feature Generation to Modular Modeling

This article introduces a cloud‑based feature production platform that simplifies feature engineering for recommendation, risk control and machine learning, explains its component‑based modeling framework, and answers common questions about deployment, performance, and customization, highlighting cross‑platform compatibility and optimization techniques.

Artificial Intelligence · Big Data · Feature Engineering
0 likes · 19 min read
Python Programming Learning Circle
Sep 10, 2024 · Artificial Intelligence

Time Series Feature Engineering Techniques in Python

This article explains how to extract a variety of date‑time based features—including date, time, lag, rolling, expanding, and domain‑specific attributes—from a time‑series dataset using pandas, and discusses proper validation strategies for building reliable forecasting models.

Feature Engineering · Python · forecasting
0 likes · 14 min read
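
The date, lag, rolling, and expanding features mentioned in the entry above might look like this in pandas. This is a hedged sketch rather than the article's code: the six-day series and the column name `sales` are made up for illustration, and the one-step shift before the window functions is one common way to keep each row from seeing its own target value.

```python
import pandas as pd

# Hypothetical daily series; the values and the "sales" column are assumptions.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
df = pd.DataFrame({"sales": [10, 12, 11, 15, 14, 18]}, index=idx)

# Date-based features read directly from the DatetimeIndex.
df["dayofweek"] = df.index.dayofweek  # Monday is 0
df["month"] = df.index.month

# Lag feature: the previous day's value.
df["lag_1"] = df["sales"].shift(1)

# Rolling and expanding means over past values only (shifted by one step).
df["rolling_mean_3"] = df["sales"].shift(1).rolling(window=3).mean()
df["expanding_mean"] = df["sales"].shift(1).expanding().mean()

print(df)
```

For validation, a time-ordered split (for example scikit-learn's `TimeSeriesSplit`) is usually preferred over a random shuffle, so that no fold trains on rows that come after the rows it is evaluated on.
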
DataFunTalk
Jul 27, 2024 · Information Security

Classification of Risk Control and Full-Scenario Anti-Cheat Strategies in the Internet

The article outlines how internet and financial risk control are categorized into anti‑cheat, anti‑fraud, and content security, describes full‑scenario cheating types, and presents a three‑step joint defense framework using perception, identification, and mitigation with feature‑based analysis.

Feature Engineering · anti-cheat · fraud detection
0 likes · 7 min read
iQIYI Technical Product Team
Jul 5, 2024 · Big Data

RiskFactor: An Integrated Real‑Time and Offline Feature Platform for Risk Control

RiskFactor unifies iQIYI’s legacy real‑time and offline feature platforms onto Opal’s DAG‑plus‑SQL engine, accelerating feature production fifteen‑fold, cutting latency from hours to minutes, streamlining development, lowering costs, and delivering more reliable, versioned risk‑control capabilities against sophisticated online threats.

Big Data · DAG · Feature Engineering
0 likes · 14 min read
Python Programming Learning Circle
Jun 21, 2024 · Artificial Intelligence

Using scikit-learn for Data Mining: Feature Engineering, Parallel Processing, Pipelines, and Model Persistence

This article demonstrates how to perform data mining with scikit-learn by detailing the full workflow—from data acquisition and feature engineering, through parallel and pipeline processing, to automated hyper‑parameter tuning and model persistence—using the Iris dataset as an example.

Data Mining · Feature Engineering · Parallel Processing
0 likes · 13 min read
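
The pipeline, hyper-parameter tuning, and persistence steps described in the entry above could be sketched roughly as follows on the Iris dataset. The particular steps (`StandardScaler` followed by `LogisticRegression`), the grid of `C` values, and the temporary file name are assumptions chosen for the example, not details taken from the article.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Pipeline: preprocessing and the model are fitted together as one estimator.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Automated tuning: "clf__C" addresses the C parameter of the "clf" step.
search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)

# Persistence: save the best fitted pipeline, then load it back.
model_path = os.path.join(tempfile.mkdtemp(), "iris_pipeline.joblib")
joblib.dump(search.best_estimator_, model_path)
restored = joblib.load(model_path)

print(search.best_params_)
```

Because the scaler lives inside the pipeline, it is refitted on each cross-validation training fold, and the restored object applies the same preprocessing at prediction time.
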
DataFunTalk
Jun 13, 2024 · Artificial Intelligence

A/B Testing and Model Grayscale in Credit Risk Control: Concepts, Requirements, and Integrated Solutions

This article explains how A/B testing and model grayscale are applied in credit risk control, discusses the specific requirements for effective testing, compares upstream and risk‑system traffic splitting methods, and proposes an integrated all‑in‑one solution to simplify feature engineering, model evaluation, and deployment.

A/B testing · Feature Engineering · credit risk
0 likes · 5 min read
DataFunTalk
Jun 11, 2024 · Artificial Intelligence

Intelligent Risk Control: Concepts, Challenges, and Integrated Operational Architecture for Banking

This article explores the concept of intelligent risk control in banking, detailing its AI‑driven architecture, current challenges such as external data costs and model‑deployment friction, and proposes an integrated operational framework that leverages big data, knowledge graphs, and MLOps to enhance risk detection and decision‑making.

Artificial Intelligence · Big Data · Feature Engineering
0 likes · 14 min read