How Imperfect AI Can Unlock the Hidden 80% of Enterprise Data
Enterprises face a sharp paradox: data volumes keep exploding, yet decisions draw almost entirely on structured data, which makes up only about 20% of data assets, while the remaining 80%, the unstructured data, stays frozen. This talk explores how "imperfect AI" powered by Data Agent can awaken that hidden value.
Most enterprises face a sharp paradox: data volume skyrockets in daily operations, yet the truly valuable data remains "frozen".
The contradiction has two causes: many companies lack data analysis products aligned with their business, and even in the Agentic AI era some still believe that only numbers ("1, 2, 3, 4, 5") count as usable data.
Data can be divided into "structured data" (tabular, fixed schema) and "unstructured/semi-structured data" (documents, images, audio, video, etc.). Historically, businesses relied solely on structured data analysis for decisions, even though structured data accounts for only about 20% of total data assets.
How can the remaining 80% be leveraged? How can unstructured data be used in business?
At the recent FORCE Power Conference in Beijing, Hai Shushan, head of Volcano Engine’s Data Agent project, shared insights on these questions.
"Imperfect intelligence" refers to large models that are not perfect: they make mathematical mistakes, hallucinate, and need human guidance. Humans are also imperfect: biased, forgetful, and emotional. The key is whether the intelligence can evolve.
Typical client questions illustrate the problem. For the 618 (June 18) shopping promotion: why do my results lag behind competitors despite the same data and budget? From a bank: we have invested heavily in data platforms, so why do we still say we cannot understand the data? And more generally: how can the same research data be turned into actionable marketing strategies?
Data volumes grow and tools improve, yet the value of data remains unreleased. The core question is what blocks that release.
First, unstructured data (the 80% at the base of the iceberg) needs awakening. Second, even the 20% of structured data suffers from uneven interpretation skills: looking at the same data, one person sees only numbers, another sees trends, and a third sees causal patterns, and training experts is costly.
Third, tools are often command‑driven: they answer whatever is asked, but the ability to ask a good question is scarce, and answering the wrong question is riskier than not answering at all.
Finally, collaboration suffers: marketing teams raise issues, analysts see numbers, managers get conclusions, and information gets distorted as it passes between layers.
These four points highlight the need for data that can think proactively, in other words a data‑intelligent agent.
Data Agent expands the scope of usable data to structured data, unstructured data, and, in the future, public data, addressing the limitations of generic agents. An Agent is envisioned as an entity that thinks, analyzes, and evolves, much like hiring a smart employee whose intelligence level is set by the capability of the underlying large model.
As large models evolve, the Agent becomes smarter, absorbs domain knowledge, and can translate analysis into marketing actions, bridging data and business outcomes.
In practice, Data Agent can answer a question such as why a bank’s loan volume is declining by combining structured sales data, unstructured documents, call recordings, and external industry trends, and then identifying patterns, root causes, and opportunities that single‑source analysis cannot reveal.
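The multi-source idea can be illustrated with a minimal sketch. This is not Volcano Engine's implementation: the sample data, the keyword-to-theme map (a stand-in for a model that reads free text), and every function name below are hypothetical. The sketch detects a drop in a structured metric, then pulls the themes mentioned in unstructured call notes from the same month so the two can be read side by side.

```python
from collections import defaultdict

# Hypothetical structured data: approved loans per month.
loan_approvals = {"2024-01": 120, "2024-02": 118, "2024-03": 84}

# Hypothetical unstructured data: free-text call notes tagged with a month.
call_notes = [
    ("2024-01", "Customer asked about branch opening hours."),
    ("2024-03", "Customer said the competitor rate is lower."),
    ("2024-03", "Caller complained the application form is too long."),
    ("2024-03", "Customer mentioned a lower rate elsewhere."),
]

# Hypothetical keyword-to-theme map; a real system would use a language model.
THEMES = {"rate": "pricing", "form": "application friction"}


def extract_themes(text):
    """Return the sorted themes whose keyword appears in the text."""
    lowered = text.lower()
    return sorted({theme for kw, theme in THEMES.items() if kw in lowered})


def find_drops(series, threshold=0.2):
    """Return months whose value fell by more than `threshold` vs. the prior month."""
    months = sorted(series)
    drops = []
    for prev, cur in zip(months, months[1:]):
        if series[prev] > 0 and (series[prev] - series[cur]) / series[prev] > threshold:
            drops.append(cur)
    return drops


def explain_drops(series, notes):
    """Pair each drop month with theme counts taken from that month's notes."""
    themes_by_month = defaultdict(lambda: defaultdict(int))
    for month, text in notes:
        for theme in extract_themes(text):
            themes_by_month[month][theme] += 1
    return {month: dict(themes_by_month[month]) for month in find_drops(series)}


report = explain_drops(loan_approvals, call_notes)
print(report)
```

The structured table alone shows only that March fell; the notes alone show only complaints. Joining them on a shared key (here, the month) is what lets a candidate cause be attached to a specific drop, and a human analyst would still need to confirm it.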
Marketing Agent uses the same engine to fuse purchase, behavior, and social media data, precisely target high‑potential users, generate personalized 1‑to‑1 content, and continuously iterate strategies, dramatically improving conversion.
Beyond efficiency, Data Agent enables deep research: by collecting data points, it produces initial reports, uncovers insights, and transforms intuition‑driven decisions into scientific ones, fostering a learning‑oriented organization.
Current trends (the DeepSeek release, falling model costs, and maturing enterprise AI adoption) make this a good time to experiment with Data Agent: the technology is available, yet many firms still lack an effective implementation.
In summary, Data Agent does not replace data platforms but enhances them, allowing natural‑language interaction to extract value, accelerate decision‑making, and elevate both individual and team capabilities toward a truly data‑driven enterprise.
ByteDance Data Platform
The ByteDance Data Platform team empowers all ByteDance business lines by lowering data‑application barriers, aiming to build data‑driven intelligent enterprises, enable digital transformation across industries, and create greater social value. Internally it supports most ByteDance units; externally it delivers data‑intelligence products under the Volcano Engine brand to enterprise customers.