How Do Generative, Perceptual, and Decision AI Interact? Insights from Jina AI’s Founder
In this interview, Jina AI’s founder Han Xiao examines the relationships among generative, perceptual, and decision AI, compares single-modal and multimodal approaches, discusses large language model development, and evaluates the impact of ChatGPT on search and on future AI commercialization.
Core Relationship of Generative, Perceptual, and Decision AI & Future Trends
Decision-oriented AI applies rules to existing content for classification, recommendation, filtering, or extraction. It dominated applications such as speech and facial recognition from 2010 to 2020. Generative AI creates new content (text, image, audio, video) from prompts. Early generative models produced 16×16 pixel images; later milestones include DALL·E (January 2021), DALL·E 2 (April 2022), Stable Diffusion (August 2022), and ChatGPT (November 2022). Perceptual AI is less commonly treated as a separate category.
Single‑Modal vs. Multimodal AI
Before 2020, most commercial AI was single-modal: each model worked with one type of data (e.g., image classification). After 2020, the explosion of heterogeneous data (text, audio, video) and advances in deep-learning frameworks (TensorFlow, PyTorch) enabled models that can index, search, and generate across modalities. The key drivers are larger, more diverse datasets and scalable model architectures.
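To make "search across modalities" concrete, here is a minimal Python sketch of one common approach: items of different types are mapped into a shared vector space and compared by cosine similarity. The vectors, file names, and the `search` helper are hypothetical and hand-made for illustration; a real system would obtain the vectors from a trained multimodal encoder, which is not shown here.

```python
import math

# Hypothetical, hand-made vectors standing in for the output of a
# multimodal encoder. Every item lives in the same 3-dimensional space,
# regardless of whether it is an image, an audio clip, or a text.
INDEX = {
    ("image", "cat_photo.jpg"): [0.9, 0.1, 0.0],
    ("audio", "dog_bark.wav"): [0.1, 0.9, 0.1],
    ("text", "stock market report"): [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vector, top_k=2):
    """Return the top_k indexed items most similar to the query vector."""
    scored = [
        (cosine(query_vector, vector), modality, name)
        for (modality, name), vector in INDEX.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Assumed embedding of a text query such as "a picture of a kitten".
results = search([0.8, 0.2, 0.1])
for score, modality, name in results:
    print(f"{score:.3f}  {modality:5s}  {name}")
```

Because the query and the indexed items share one space, a text query can retrieve an image or an audio clip without any modality-specific matching logic.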
Large‑Model Development: Industry vs. Academia
Large language models (LLMs) function as massive knowledge bases. According to the interview, models such as ChatGPT are trained on multilingual data, which makes them usable for Chinese-language tasks even though they are not specifically tuned for China. Scaling up model size brings out emergent reasoning and in-context learning (ICL) abilities, and reinforcement learning from human feedback (RLHF) is increasingly applied to refine model behavior.
Commercialization of Domestic AIGC Companies
After the release of Stable Diffusion in 2022, many AIGC startups emerged. One example is Jina AI’s ChatGPT-based decision-support tool Rationale.jina.ai, which reached 100,000 monthly active users within a month and secured paying subscribers, illustrating rapid consumer-oriented growth. The primary barrier for Chinese-language AIGC remains the absence of a high-quality, stable Chinese-focused GPT model.
Downstream Applications and Prompt Engineering
LLMs provide the “gold” of knowledge; interfaces such as ChatGPT act as the “shovel” that makes the gold usable in products, improving user productivity. Early consumer applications resemble knowledge‑base tools (e.g., Notion‑style). Effective prompt engineering is essential to translate model capabilities into practical outputs, requiring upfront effort to craft high‑quality prompts.
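As an illustration of the upfront effort that prompt crafting involves, the following Python sketch assembles a prompt from a task description, a few worked examples, and the user's query. The `build_prompt` helper, the `Input:`/`Output:` layout, and the sample reviews are assumptions made for this example; they are not taken from the interview or from any particular product.

```python
def build_prompt(task, examples, query):
    """Assemble a few-shot prompt: task description, worked examples, then the query."""
    lines = [f"Task: {task}", ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final "Output:" is left open for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical task and examples, chosen only to show the structure.
prompt = build_prompt(
    task="Classify the sentiment of each review as positive or negative.",
    examples=[
        ("The battery lasts all day.", "positive"),
        ("The screen cracked in a week.", "negative"),
    ],
    query="Setup took five minutes and everything worked.",
)
print(prompt)
```

The resulting string would be sent to whichever model the product uses. Keeping the template in one function makes it easier to revise the wording and examples as the prompt is refined.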
Impact on Search
ChatGPT can be viewed as a probabilistic database that returns the most likely answer rather than a ranked list of documents. This conversational retrieval can reduce the number of steps a user takes to obtain information compared with traditional search engines. Future search may rely on LLMs, in which case SEO strategies would shift from link-building to influencing model outputs through in-context learning and targeted prompting.
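The contrast between the two retrieval styles can be shown with a toy Python sketch. The documents, the term-overlap scoring, and the answer probabilities below are all invented for illustration; a real search engine uses far richer ranking signals, and a real LLM computes its probabilities token by token rather than from a lookup table.

```python
# Hypothetical mini-corpus for the traditional search side.
DOCUMENTS = {
    "doc1": "Jina AI builds multimodal search tools",
    "doc2": "Stable Diffusion generates images from text",
    "doc3": "search engines rank documents by relevance",
}

# Hypothetical answer distribution standing in for a language model.
ANSWER_PROBABILITIES = {
    "what do search engines rank": {
        "documents": 0.7,
        "images": 0.2,
        "prompts": 0.1,
    },
}


def ranked_search(query):
    """Traditional style: return matching document ids, best match first."""
    terms = set(query.lower().split())
    scores = {
        doc_id: len(terms & set(text.lower().split()))
        for doc_id, text in DOCUMENTS.items()
    }
    matches = [doc_id for doc_id, score in scores.items() if score > 0]
    return sorted(matches, key=lambda doc_id: scores[doc_id], reverse=True)


def most_likely_answer(query):
    """Probabilistic-database style: return the single most probable answer."""
    distribution = ANSWER_PROBABILITIES[query]
    return max(distribution, key=distribution.get)


query = "what do search engines rank"
print(ranked_search(query))       # a list the user still has to read through
print(most_likely_answer(query))  # one direct answer
```

The first function leaves the user to open and read the ranked documents, while the second returns one answer directly, which is the reduction in steps described above.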
Economic Prospects of AI‑Generated Content
LLM‑generated content lowers production costs and raises marginal returns but lacks true creativity, as it interpolates existing knowledge. Widespread adoption may shift media economics toward cost efficiency, with limited potential for breakthrough innovation.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
ITPUB
Official ITPUB account sharing technical insights, community news, and exciting events.
