Tagged articles
276 articles
AntTech
Apr 26, 2024 · Databases

Data Processing Technologies in the AI Era: Trends and Integration of Vector and Relational Databases

The talk explores how the rapid growth of multimodal data and large language models is reshaping data processing, highlighting three key trends—online‑offline integration, vector‑relational database convergence, and the fusion of data processing with AI computation—while presenting practical solutions and future visions for unified data‑AI ecosystems.

AI · Big Data · HTAP
0 likes · 12 min read
Go Development Architecture Practice
Mar 21, 2024 · Backend Development

How to Process One Billion Rows in Go: 9 Optimized Solutions Under 4 Seconds

This article walks through nine Go‑based implementations for the 1‑Billion‑Row Challenge, starting from a straightforward scanner approach and progressively applying map pointer values, custom parsing, integer arithmetic, buffer tweaks, custom hash tables, and parallelism to shrink processing time from 1 minute 45 seconds to under 4 seconds.

1BRC · Go · Parallelism
0 likes · 22 min read
DataFunSummit
Mar 13, 2024 · Artificial Intelligence

Overview of Vivo BlueLM: Evolution, Training Challenges, Deployment, and Product Applications

This article presents a comprehensive overview of Vivo's BlueLM large language model, covering its historical evolution, training pipeline and data challenges, algorithmic innovations, safety measures, edge‑device optimization, product deployments such as BlueLM Mini‑V and BlueQianXun, and insights from a detailed Q&A session.

AI product · Edge Computing · Model Training
0 likes · 17 min read
Python Crawling & Data Mining
Feb 27, 2024 · Fundamentals

Automate Multi‑Sheet Excel Scoring with Python & Pandas: Step‑by‑Step Guide

This article walks through using Python and pandas to batch‑process seven Excel evaluation sheets, skipping header rows, cleaning data, computing total and average scores per person, merging results, and outputting aggregated statistics, providing a practical automation solution for repetitive office tasks.

Batch Processing · Excel Automation · Python
0 likes · 7 min read
Architecture Digest
Feb 21, 2024 · Backend Development

Java 8 Stream API Tutorial with PO Example and Common Operations

This article introduces Java 8's Stream API, explains its pipeline concept similar to SQL and Linux pipes, and demonstrates common operations such as filter, map, sorted, forEach, collect, statistics and parallelStream using a UserPo class with complete runnable code examples.

Backend · Lambda · Stream API
0 likes · 9 min read
Python Programming Learning Circle
Feb 2, 2024 · Operations

17 Essential Python Scripts for Automating Everyday Tasks

This article presents 17 practical Python scripts covering file management, web scraping, email handling, Excel processing, database interaction, system tasks, image editing, and more, each with code examples and explanations, enabling developers and analysts to automate repetitive workflows and boost productivity across diverse domains.

Email · Python · System Administration
0 likes · 26 min read
Sohu Tech Products
Jan 31, 2024 · Operations

Logstash Grok Filter: Complete Guide for Log Data Parsing and ETL

This guide explains Logstash’s Grok filter plugin, detailing how its 120 built‑in and custom patterns transform unstructured logs—such as Apache, MySQL, or HiveServer2—into structured fields through named regex captures, supporting type conversion, cleaning, debugging, and efficient ETL for analysis and monitoring.

ETL · Grok filter · Logstash
0 likes · 8 min read
ITPUB
Dec 25, 2023 · Big Data

Unlock Complex Data Scenarios with Simple MaxCompute SQL Techniques

This article shows how flexible, divergent thinking combined with basic MaxCompute (ODPS) SQL syntax can solve complex data problems such as generating sequences, splitting intervals, performing permutations and combinations, and analyzing continuous activity, providing step‑by‑step examples, SQL code snippets, and practical results.

Intervals · MaxCompute · Sequences
0 likes · 24 min read
21CTO
Dec 14, 2023 · Databases

Why esProc SPL Beats SQLite for Lightweight Data Processing

esProc SPL, a pure‑Java, lightweight data‑processing engine, offers richer data source support, built‑in flow control, and easier complex calculations compared to SQLite, making it a powerful alternative for small applications that need database‑like capabilities without the overhead of traditional databases.

SPL · SQLite · data-processing
0 likes · 7 min read
DaTaobao Tech
Nov 29, 2023 · Frontend Development

Error Message Governance System: A Frontend Solution for Better User Experience

The article describes a frontend error‑message governance system that dynamically maps cryptic error codes to clear, context‑aware messages, lets operators configure responses by code, URL or endpoint, and improves user experience and system stability while enabling future features such as efficiency metrics, guides, and surveys.

Error Handling · Operational Configuration · User experience
0 likes · 7 min read
Python Crawling & Data Mining
Oct 26, 2023 · Fundamentals

Clean Mixed Excel Date Formats in Pandas with Simple Code

This article walks through handling Excel columns that contain both compact (YYYYMMDD) and full timestamp (YYYY-MM-DD HH:MM:SS) date strings in Pandas, showing how to unify formats by removing hyphens, trimming, and converting them to proper datetime objects with concise code.

Date Parsing · data-processing
0 likes · 4 min read
Python Crawling & Data Mining
Aug 15, 2023 · Backend Development

Merge Multiple Employee Excel Files into One with Python – Step‑by‑Step Guide

Learn how to automate the consolidation of numerous employee performance Excel files using Python and pandas, with a clear code example that reads each workbook, concatenates the data, and outputs a single merged spreadsheet, while also offering tips for handling file formats and avoiding common pitfalls.

automation · data-processing · pandas
0 likes · 4 min read
Architecture Digest
Aug 14, 2023 · Backend Development

Java 8 Stream API Tutorial: 20 Practical Examples for Filtering, Mapping, Reducing, and More

This comprehensive tutorial explains Java 8's Stream API and Lambda expressions, demonstrating how to create streams from collections and arrays, perform intermediate operations like filter, map, flatMap, and sorted, and use terminal operations such as forEach, findFirst, reduce, and collect through twenty detailed code examples covering employee data processing, aggregation, grouping, and sorting.

Collections · Java 8 · Stream API
0 likes · 21 min read
Sohu Tech Products
Apr 26, 2023 · Fundamentals

Understanding JavaScript Generators: Basics, Syntax, and Advanced Usage

JavaScript Generators, introduced in ES6, allow functions to pause and resume execution, yielding multiple values; this article explains their syntax, basic usage, advanced features like yield* and data exchange, and demonstrates practical scenarios such as asynchronous flow control, memory-efficient data processing, and state machine implementation.

Async · JavaScript · Yield
0 likes · 11 min read
Laravel Tech Community
Apr 2, 2023 · Backend Development

QueryList: A Modern PHP Content Scraping Library – Features, Installation, and Usage Guide

This article introduces QueryList, a modern PHP content‑scraping tool that uses CSS selectors instead of regex, explains its two versions (V3 and V4), shows how to install it via Composer, and demonstrates basic crawling code, collection methods such as flatten, take, reverse, filter, and map, and multi‑request concurrency.

Content Extraction · Web Scraping · data-processing
0 likes · 7 min read
Java Captain
Jan 4, 2023 · Databases

Managing Database Intermediate Tables with File Storage Using SPL

The article explains how excessive intermediate tables generated by reporting workloads degrade database storage and performance, and proposes using the SPL data‑processing tool to store these intermediate results as external files, thereby reducing capacity pressure, improving I/O speed, and simplifying management.

SPL · data-processing · databases
0 likes · 9 min read
Python Programming Learning Circle
Dec 10, 2022 · Fundamentals

Using Python (pandas) to Perform Common Excel Data Processing Tasks

This article demonstrates how to replace typical Excel operations such as VLOOKUP, pivot tables, duplicate removal, missing‑value handling, multi‑condition filtering, fuzzy matching, column splitting, outlier replacement, grouping and labeling with concise Python pandas code to streamline data analysis workflows.

VLOOKUP · data cleaning · data-analysis
0 likes · 9 min read
DaTaobao Tech
Nov 23, 2022 · Big Data

Real-time Log Aggregation and Monitoring with Blink (Flink) on Mobile Endpoints

The article explains how Blink, Alibaba’s optimized Flink variant, uses dynamic tables and streaming‑SQL to ingest mobile telemetry via source tables, compute per‑minute metrics such as API success rates with tumbling windows, and write results to Alibaba Cloud Log Service, enabling real‑time dashboards and extensible use cases like fraud detection.

Flink · Real-time Streaming · blink
0 likes · 10 min read
MaGe Linux Operations
Nov 21, 2022 · Backend Development

Build a Python-Based Electronic Attendance System: Step-by-Step Guide

This article outlines a student project to create a Python-powered electronic attendance system, detailing required CSV data formats, core functions such as loading data, login, record writing, and querying, along with task requirements, additional features, and complete code examples.

CSV · Student Project · attendance system
0 likes · 8 min read
Python Programming Learning Circle
Oct 17, 2022 · Fundamentals

Using openpyxl to Operate Excel Files in Python

This article provides a comprehensive guide to using the openpyxl library for creating, opening, editing, and saving Excel workbooks in Python, covering worksheet management, cell operations, merging, iteration, and efficient handling of large files with code examples.

data-processing · openpyxl · workbook
0 likes · 8 min read
JD Tech
Oct 13, 2022 · Frontend Development

Automatic Design Draft Recognition and Floor Generation in the Tongtian Tower Platform

This article describes how the Tongtian Tower platform breaks traditional R&D barriers by enabling zero‑code, one‑click conversion of design drafts into production‑ready floor layouts, detailing the underlying architecture, data processing pipeline, core capabilities, challenges faced, and future optimization plans.

Design Automation · No-code · UI Generation
0 likes · 10 min read
Python Programming Learning Circle
Aug 12, 2022 · Fundamentals

Automating Excel Reports with Python xlwings and pandas

This article demonstrates how to replace tedious manual Excel reporting by using Python libraries pandas and xlwings to read multiple sheets, merge data, write the combined DataFrame back to Excel, and apply conditional formatting such as font colors, borders, and cell shading based on statistical thresholds.

Excel Automation · Python · data-processing
0 likes · 10 min read
Bilibili Tech
Jul 23, 2022 · Backend Development

API Gateway Evolution and Engineering Practices; Applying ClickHouse for Massive Data Processing

The talk traces the evolution of API Gateway architectures and the engineering practices—design patterns, deployment strategies, and operational considerations—required for scalable, reliable services, then demonstrates how ClickHouse can be leveraged for massive data workloads, highlighting practical scenarios, performance optimizations, and key lessons learned.

Big Data · Engineering · api-gateway
0 likes · 1 min read
Big Data Technology Architecture
Jul 2, 2022 · Fundamentals

Indirect Shareholding Ratio Calculation Using Graph Techniques

This article explains how to compute indirect shareholding ratios between companies by generating synthetic relationship data, cleaning and normalizing it with multiprocessing, constructing a weighted directed graph using NetworkX, and applying a matrix‑based algorithm to derive the final ownership matrix.

Python · data-processing · graph-analysis
0 likes · 7 min read
Python Programming Learning Circle
Jun 27, 2022 · Big Data

Six Common Beginner Mistakes When Using Pandas and How to Avoid Them

This article outlines six typical errors beginners make with Pandas—slow CSV reads, lack of vectorization, improper dtypes, ignoring styling, inefficient CSV saving, and not consulting documentation—and provides faster alternatives, memory‑saving techniques, and best‑practice tips for handling large datasets.

Big Data · Memory Optimization · data-processing
0 likes · 10 min read
Top Architect
Jun 18, 2022 · Big Data

Overview of Data Lakes and the Open SPL Compute Engine

This article explains the concept and challenges of data lakes, describes the “impossible triangle” of storage, compute, and cost, and introduces the open‑source SPL engine that provides multi‑source, file‑based, high‑performance computing to overcome those limitations.

Data Lake · SPL · compute engine
0 likes · 13 min read
Python Programming Learning Circle
May 27, 2022 · Fundamentals

Nine Useful JSON Validation and Formatting Tools

This article introduces nine popular tools—including JSONLint, JSONCompare, JTC, ijson, and others—that help developers validate, format, compress, compare, and edit JSON data, providing both online services and IDE plugins for more efficient JSON handling.

IDE plugins · JSON · data-processing
0 likes · 5 min read
Python Programming Learning Circle
May 4, 2022 · Fundamentals

Comprehensive Guide to NumPy Array Operations and Functions

This article provides a detailed tutorial on NumPy array manipulation in Python, covering iteration with np.nditer, reshaping, flattening, transposition, axis swapping, broadcasting, stacking, concatenation, splitting, resizing, appending, inserting, deleting, unique element extraction, string utilities, arithmetic, statistical analysis, sorting, searching, and file I/O, each illustrated with concise code examples.

NumPy · Python · Tutorial
0 likes · 21 min read
Alibaba Terminal Technology
Feb 25, 2022 · Mobile Development

How to Simplify Complex Message Client Data Processing in Mobile Apps

This article examines the intricate data processing challenges of a mobile message client, outlines seven key complexities, proposes a unified abstraction layer to reduce code by 60%, and details a modular solution—including MergeDispatcher, Calculator, and DataStructure—that separates computation from data fetching for improved consistency and scalability.

Mobile Development · abstraction · architecture
0 likes · 12 min read
DataFunSummit
Jan 17, 2022 · Cloud Computing

Serverless Transformation of Baidu Search Middle Platform: Architecture, Challenges, and Benefits

This article details how Baidu's search middle platform migrated from script‑based processing to a serverless business‑framework architecture, outlining the technical challenges, design of data ingestion, processing, scheduling, and control layers, and summarizing the efficiency, cost, and performance gains achieved.

Scalability · Search Platform · Serverless
0 likes · 16 min read
Java Architect Essentials
Jan 12, 2022 · Backend Development

Master Java Stream API: 20 Real‑World Examples and Best Practices

This tutorial walks through Java 8 Stream fundamentals, covering creation, intermediate operations (filter, map, flatMap, reduce), terminal actions, collectors, grouping, sorting, and parallel streams, illustrated with 20 practical code examples that transform and analyze collections of objects such as employee records.

Collections · Stream API · data-processing
0 likes · 24 min read
Programmer DD
Nov 22, 2021 · Backend Development

Master EasyExcel: Fast, Low‑Memory Excel Import/Export in Java

This guide explains how to use Alibaba's EasyExcel library for efficient, low‑memory Excel import and export in Java, covering its core features, common annotations, Maven dependencies, listener implementation, and practical code examples for both HTTP‑based and local file operations.

Excel · Export · Import
0 likes · 7 min read
IT Architects Alliance
Nov 19, 2021 · Backend Development

How SPL Transforms Java Data Processing: From CSV to Multi‑JSON with Embedded SQL

This article introduces SPL, an open‑source Java‑embeddable computation library that outperforms traditional embedded databases and DataFrame tools by handling both tabular and nested JSON data, supporting JDBC, SQL‑like queries, multi‑source integration, and persistent .btx files with concise code examples.

Embedded Database · JDBC · JSON
0 likes · 8 min read
macrozheng
Sep 2, 2021 · Backend Development

Mastering Java 8 Stream API: 20 Real‑World Examples

This tutorial walks through Java 8 Stream and Lambda features, explaining stream concepts, intermediate and terminal operations, and demonstrating twenty practical examples—including creation, filtering, mapping, reduction, collection, sorting, and combining—using an employee class to illustrate each operation.

Collections · Java 8 · Stream API
0 likes · 27 min read
Top Architect
Aug 1, 2021 · Backend Development

Comprehensive Guide to Java 8 Stream API with Practical Examples

This article provides an in‑depth tutorial on Java 8 Stream API, covering its concepts, creation methods, common operations such as filtering, mapping, reducing, collecting, sorting, and grouping, along with numerous runnable code examples that demonstrate how to process collections efficiently using streams.

Collections · Java 8 · Lambda
0 likes · 24 min read
Top Architect
Jul 13, 2021 · Backend Development

Introduction to Spring Batch and Its Core Concepts

This article provides a comprehensive overview of Spring Batch, covering its purpose, architecture, core components such as Job, Step, ItemReader/Writer/Processor, chunk processing, skip strategies, and practical guidelines for building robust batch processing solutions in Java.

Batch Processing · Spring Batch · Spring Framework
0 likes · 19 min read
Alibaba Cloud Developer
Jun 8, 2021 · Artificial Intelligence

Can Low‑Code Bridge the Gap Between Business and AI? Insights on Its Future

The article explores how low‑code platforms can complement traditional algorithm development, enhance collaboration between business users and engineers, and accelerate big‑data and AI initiatives by improving data cleaning, modular design, and feedback loops, while highlighting the trade‑offs of abstraction and flexibility.

AI · Algorithm Development · Big Data
0 likes · 9 min read
MaGe Linux Operations
Apr 14, 2021 · Fundamentals

5 Elegant NumPy Functions for Efficient Data Processing

This article introduces five lesser‑known but powerful NumPy functions—reshape with -1, argpartition, clip, extract, and setdiff1d—explaining their behavior, showcasing code examples, and highlighting how they simplify complex data manipulation tasks.

CLIP · Extract · argpartition
0 likes · 7 min read
FunTester
Apr 8, 2021 · Fundamentals

Mastering jq: Advanced Pipes, Functions, and JSON Format Transformations

This tutorial explores jq's advanced capabilities, demonstrating how to combine filters with the pipe operator, use functions like keys, length, select, map, and join, and transform JSON data into new structures and formats through practical command‑line examples.

JSON · command-line · data-processing
0 likes · 6 min read
Youzan Coder
Apr 7, 2021 · Mobile Development

Design and Implementation of a Mobile App Performance Monitoring System

The article describes a two‑part mobile app performance monitoring system that automatically instruments code to capture method execution times, ANR and frame stalls, then processes, cleans, aggregates, and visualizes the data on a backend platform to generate alerts, trend reports, and guide optimization across versions.

APM · Performance Monitoring · data-processing
0 likes · 11 min read
Amap Tech
Mar 12, 2021 · Fundamentals

MTA Problem in High‑Precision LiDAR Data and Its Correction Algorithms

The article describes how high‑frequency LiDAR scanners on precision mapping vehicles suffer from Multi‑Time‑Around (MTA) errors—mis‑assigning distant returns to near ranges—and explains both internal laser strategies (continuity assumption and variable‑period emission) and a four‑step neighborhood‑weighting algorithm that reliably corrects these artifacts, restoring accurate point‑clouds for automated map generation.

LiDAR · MTA · Sensor Data
0 likes · 12 min read
Python Crawling & Data Mining
Mar 9, 2021 · Fundamentals

How to Automate Rainfall Word Reports with Python and Pandas

This article walks through reading monthly rainfall data with pandas, cleaning missing values, calculating rainfall deviations, generating descriptive paragraphs, and rendering a formatted Word report using docxtpl, providing complete code snippets and example outputs for each step.

DocxTemplate · Python · automation
0 likes · 8 min read
Laravel Tech Community
Feb 28, 2021 · Big Data

Apache Beam 2.28.0 Release Highlights and New Features

Apache Beam 2.28.0 introduces extensive Parquet support, new hash functions in BeamSQL and ZetaSQL, ApproximateDistinct via HLL, enhanced I/O connectors including SpannerIO for Numeric fields, ParquetIO schema support, KafkaTableProvider thrift, HadoopFormatIO key/value cloning skip, and various other improvements.

Apache Beam · Batch · Big Data
0 likes · 3 min read
php Courses
Jan 26, 2021 · Backend Development

Reading and Writing CSV Files in PHP

This article provides PHP code examples for reading data from a CSV file and writing data to a new CSV file, including handling of locale settings, skipping headers, constructing headers, and appending rows using built‑in functions such as fgetcsv, fopen, and fwrite.

Backend · CSV · File I/O
0 likes · 2 min read
MaGe Linux Operations
Dec 19, 2020 · Backend Development

12 Must‑Know Open‑Source Python Frameworks for Web and Data Development

This article introduces twelve popular open‑source Python frameworks—including Django, Tornado, Twisted, Pulsar, Bottle, Diesel, NumPy, Scrapy, Cubes, Falcon, Web2py, and Zerorpc—detailing their key features, typical use cases, and providing direct project URLs for developers seeking robust solutions.

Python · Web Development · data-processing
0 likes · 8 min read