Tagged articles
3 articles
Machine Learning Algorithms & Natural Language Processing
Feb 11, 2026 · Artificial Intelligence

Breaking the Data Ceiling: UltraData’s 2.4 TB Tiered Dataset with the Largest L3 Math Library

UltraData presents a five‑level tiered data‑management system (L0‑L4) for large‑language‑model training and releases the world’s largest open L3 mathematics dataset (2.4 TB). Extensive MiniCPM‑1.2B experiments validate the approach, showing consistent performance gains across web, multilingual, math, and code domains. The project also opens a suite of governance tools and a community portal.

Data Governance · Mathematics Dataset · MiniCPM
15 min read
IT Services Circle
Jun 9, 2024 · Artificial Intelligence

Plagiarism Allegations Between Stanford's Llama3‑V and China's MiniCPM‑Llama3‑V 2.5 Model

The article details the controversy in which Stanford's Llama3‑V team admitted to copying the architecture and code of the Chinese MiniCPM‑Llama3‑V 2.5 model. It presents new evidence of weight similarity, compares performance metrics, and discusses broader concerns about the recognition of Chinese AI research in the open‑source community.

AI ethics · Llama3-V · MiniCPM
9 min read