Xiaohongshu Tech REDtech
Author

Official account of the Xiaohongshu tech team, sharing tech innovations and problem insights, advancing together.

115 articles · 324 views
Recent Articles

Jul 31, 2025 · Artificial Intelligence

How dots.ocr Achieves SOTA Multilingual Document Parsing with a 1.7B VLM

dots.ocr is a 1.7 billion-parameter multilingual document-parsing model that unifies layout detection and content recognition within a single visual-language model, delivering state-of-the-art performance across text, tables, formulas and reading order while remaining efficient and extensible for future multimodal AI research.

AI · Document Parsing · OCR
10 min read
Jul 24, 2025 · Backend Development

How Xiaohongshu Boosted Java Performance by 10% with a RedJDK Upgrade

Xiaohongshu’s middleware team migrated thousands of Java services from JDK 8 to RedJDK 11/17, achieving over 10% performance gains, halving GC pauses, and eliminating OOM crashes through systematic JDK upgrades, GC tuning, native‑memory improvements, and standardized deployment pipelines.

GC Optimization · Native Memory · Virtual Threads
22 min read
Jun 19, 2025 · Artificial Intelligence

Can Adaptive Chain‑of‑Thought Learning Halve LLM Thinking Time?

The article introduces the Think When You Need (TWYN) method, a reinforcement‑learning approach that dynamically adapts chain‑of‑thought length, dramatically cuts redundant token generation in large language models, and maintains or improves accuracy across diverse reasoning benchmarks.

Adaptive Inference · Chain of Thought · Efficiency
9 min read
Jun 6, 2025 · Artificial Intelligence

How dots.llm1 Sets New Benchmarks for Open‑Source MoE Language Models

dots.llm1, an open‑source 142‑billion‑parameter Mixture‑of‑Experts language model from hi lab, achieves Qwen2.5‑72B‑level performance after training on 11.2T high‑quality tokens, and the release includes full models, intermediate checkpoints, and detailed training pipelines for the research community.

AI Research · Large Language Model · Mixture of Experts
10 min read
Jun 4, 2025 · Artificial Intelligence

From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MARKERGEN

MarkerGen introduces a novel, plug‑and‑play framework that decomposes length‑controllable text generation into four sub‑abilities—identifying, counting, planning, and aligning—integrates external tokenizers and dynamic markers, and achieves significantly lower length errors and higher quality across diverse models, tasks, and languages.

LLM · Length-Controlled Generation · MarkerGen
14 min read
Jun 3, 2025 · Artificial Intelligence

Beyond One-Size-Fits-All: Tailored Benchmarks for Efficient Evaluation

The TailoredBench framework dramatically reduces large‑language‑model evaluation cost and error by using a global probe set, model‑specific source selection, extensible K‑Medoids clustering, and calibration, achieving up to 300× speedup and a 31.4% MAE reduction across diverse benchmarks.

AI Research · K-Medoids · LLM Evaluation
10 min read
May 22, 2025 · Artificial Intelligence

Scalable Overload-Aware Graph-Based Index Construction for 10‑Billion‑Scale Vector Similarity Search (SOGAIC)

The paper introduces SOGAIC, a scalable overload‑aware graph‑based index construction system for billion‑scale vector similarity search that uses adaptive overlapping partitioning and load‑balanced distributed scheduling to cut construction time by 47.3% while maintaining high recall.

ANN · Distributed Scheduling · Graph Index
13 min read
May 19, 2025 · Industry Insights

How Xiaohongshu Built a Minute‑Level Near‑Real‑Time Data Warehouse with Incremental Computing

Facing billions of daily logs and the need for minute‑level experiment metrics, Xiaohongshu partnered with Yunqi Tech to design a generic incremental‑compute solution that delivers near‑real‑time data warehousing with lower cost, higher accuracy, simplified pipelines, and improved query performance.

Flink · Iceberg · Paimon
24 min read
Mar 13, 2025 · Artificial Intelligence

UniCBE: A Unified Multi‑Objective Optimization Framework for Contrastive Based Evaluation

UniCBE introduces a unified multi‑objective optimization framework for contrastive‑based evaluation that mitigates sampling bias, unbalanced uncertainty reduction, and inefficient resource allocation by combining three decoupled probability matrices through a greedy and Hadamard‑product strategy, achieving Pearson correlations above 0.995 with only 83% of the annotation budget and cutting evaluation costs by more than 50% across diverse LLM evaluators.

Contrastive Evaluation · Sampling Bias · Efficiency
10 min read
Mar 6, 2025 · Backend Development

ROFF: A High‑Performance Seven‑Layer Rust‑Based Gateway with TLS Offload, QUIC/HTTP3, and Dynamic Module System

ROFF is a Rust‑implemented, seven‑layer gateway that delivers high‑throughput load balancing with memory‑safe performance, TLS hardware offload, native QUIC/HTTP3 support, a hot‑reload/upgrade mechanism, and an extensible module system with over thirty built‑in filters and support for custom Rust macros.

HTTP/3 · Load Balancing · Module System
28 min read