
From Zero to One: How a Classical Product Manager Builds AI Companion Apps at Manus

The article follows Zhang Tao, a seasoned product manager turned AI entrepreneur, as he details the creation, technical choices, user feedback, and market challenges of Manus AI's child‑focused drawing companion Dodoboo, offering deep insights into product strategy, AI model selection, and sustainable growth in the AI‑driven content era.

PMTalk Product Manager Community

AI companion applications have drawn widespread attention, and Zhang Tao, co‑founder of Manus AI, recently shared the product’s current state in a media interview. Previously a product strategist at ByteDance and a SaaS lead at ShenCe Data and Feishu, Zhang now oversees product strategy, market communication, and user‑experience improvements for Manus AI.

In July 2024, Zhang, chief scientist Ji Yichao (Peak), and serial entrepreneur Xiao Hong (Red) founded Manus AI, with Zhang serving as a partner. Their flagship project, Dodoboo, launched at the end of January 2024 as a child‑oriented drawing‑enhancement app whose name was generated with GPT assistance. The interface lets users draw on the right side while the system instantly generates matching images on the left, requiring no prompt input and fitting children’s cognitive habits.

Zhang Tao's Jike page

Initially, the team used a popular open-source image-annotation model, but it proved too dated for complex image understanding, especially of children's drawings. After comparative testing across multiple models and fine-tuning on a child-doodle dataset, they settled on a newer model that better captures the nuances of kids' artwork.

For image generation, they adopted an image-to-image pipeline rather than prompt-based generation, because preserving the composition and visual flow of the original drawing is essential. To keep latency low, no additional control modules were added; the current setup generates a single image in 0.8–1 seconds on a high-end GPU, with total response time (including network delay) of around 1.5–2 seconds. This speed was made possible by few-step generation methods such as LCM (Latent Consistency Models) and ByteDance's SDXL-Lightning, the latter released in early 2024.
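The latency figures above imply a budget for everything that is not model inference (upload, network round trip, queuing). The following is a minimal sketch of that arithmetic. Only the two ranges come from the interview; the `Range` type and the function names are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A closed interval of durations in seconds."""
    low: float
    high: float


# Figures quoted in the interview.
GENERATION_S = Range(0.8, 1.0)  # single image on a high-end GPU
TOTAL_S = Range(1.5, 2.0)       # end-to-end response, including network delay


def overhead_range(total: Range, generation: Range) -> Range:
    """Bounds on non-generation time implied by the two ranges."""
    # Smallest overhead: fastest total paired with slowest generation.
    low = max(total.low - generation.high, 0.0)
    # Largest overhead: slowest total paired with fastest generation.
    high = total.high - generation.low
    return Range(low, high)


def generation_share(total: Range, generation: Range) -> Range:
    """Bounds on the fraction of the response time spent generating."""
    return Range(generation.low / total.high, generation.high / total.low)


if __name__ == "__main__":
    print("overhead (s):", overhead_range(TOTAL_S, GENERATION_S))
    print("generation share:", generation_share(TOTAL_S, GENERATION_S))
```

Under these figures, model inference accounts for somewhere between roughly two fifths and two thirds of the wait, so further speedups would have to come from both the model and the network path.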

Early users, mainly friends, praised the product’s ability to boost children’s confidence, though some parents worried about over‑reliance and reduced creativity. The app attracted introverted children and those on the autism spectrum, who produced 100–200 pieces each, as well as overseas users. Adult users also experimented, sometimes producing artwork superior to the AI‑generated results.

Future plans include a lightweight community gallery where users can publish creations without registration, and an undo button that lets them trace the generation process step‑by‑step, reinforcing a sense of ownership. A forthcoming update will automatically generate videos of the drawing process, leveraging the inherent shareability of video content.

The interview also explored broader industry insights. Zhang argued that AI products must consider profitability from the outset: unlike traditional internet apps, AI services have non-negligible marginal costs and cannot rely solely on massive DAU growth. He noted that many AI startups focus on showcasing model capabilities rather than solving concrete user problems, and emphasized the importance of matching model choice to target users' needs: high-end models for niche markets, lighter models for broader audiences.
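The marginal-cost argument can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the GPU price and the number of images per user are hypothetical placeholders, not figures from the interview, and it assumes the GPU is billed only for the seconds it spends generating (real deployments also pay for idle capacity).

```python
def cost_per_image(gpu_hourly_usd: float, seconds_per_image: float) -> float:
    """GPU cost of one generated image, assuming perfect utilization."""
    return gpu_hourly_usd * seconds_per_image / 3600.0


def daily_inference_cost(
    dau: int,
    images_per_user: int,
    gpu_hourly_usd: float,
    seconds_per_image: float,
) -> float:
    """Total daily GPU spend; grows linearly with daily active users."""
    return dau * images_per_user * cost_per_image(gpu_hourly_usd, seconds_per_image)


if __name__ == "__main__":
    GPU_HOURLY_USD = 2.0     # hypothetical rental price, not from the article
    IMAGES_PER_USER = 50     # hypothetical; each stroke may trigger a new image
    SECONDS_PER_IMAGE = 1.0  # upper end of the 0.8-1 s range quoted in the article

    for dau in (1_000, 10_000, 100_000):
        cost = daily_inference_cost(
            dau, IMAGES_PER_USER, GPU_HOURLY_USD, SECONDS_PER_IMAGE
        )
        print(f"DAU={dau:>7,}  daily GPU cost ~ ${cost:,.2f}")
```

Whatever the actual inputs are, the cost line scales one-for-one with DAU, which is why growth alone does not improve unit economics for this kind of product.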

On AI-generated content, Zhang highlighted detection challenges: AI-generated images tend to allocate more detail to richly textured regions while neglecting sparse areas, a pattern that can be used for detection. He warned that as generative quality improves, telling AI content apart may become much harder, possibly within six months.
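One way to quantify the uneven allocation of detail described above is to measure how much of an image's local variance is concentrated in its most textured tiles. The sketch below is a toy illustration, not the method Zhang or Manus AI uses and not a validated detector: the functions `block_variances` and `detail_concentration`, the tile size, and the top-quartile cutoff are all assumptions, and any decision threshold would have to be calibrated on real data.

```python
from statistics import pvariance


def block_variances(pixels, block):
    """Variance of each non-overlapping block x block tile.

    `pixels` is a list of rows of grayscale values (for example 0-255).
    Partial tiles at the right and bottom edges are ignored.
    """
    height, width = len(pixels), len(pixels[0])
    variances = []
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            tile = [
                pixels[y][x]
                for y in range(top, top + block)
                for x in range(left, left + block)
            ]
            variances.append(pvariance(tile))
    return variances


def detail_concentration(pixels, block=4, top_fraction=0.25):
    """Share of total tile variance held by the most detailed tiles.

    Values near 1.0 mean detail is packed into a few tiles while the
    rest are nearly flat; values near `top_fraction` mean detail is
    spread evenly across the image.
    """
    variances = sorted(block_variances(pixels, block), reverse=True)
    total = sum(variances)
    if total == 0:
        return 0.0  # completely flat image: no detail to concentrate
    top_n = max(1, int(len(variances) * top_fraction))
    return sum(variances[:top_n]) / total
```

A score like this would be only one weak signal among many; a hand-drawn sketch with a single busy corner would also score high.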

Regarding open‑source versus closed‑source models, Zhang believes the debate is less relevant than assessing available resources, capabilities, and user needs. While large‑scale language models require massive data and compute, multimodal open‑source projects can often keep pace with commercial offerings, and some community work even surpasses proprietary models.

Finally, Zhang reflected on the emotional dimension of AI companions. He observed that users form stronger bonds with AI agents that occasionally refuse requests or show personality than with agents that always comply. This "active" behavior, combined with subtle feedback loops, fosters deeper emotional connections, especially among children and the elderly, who tend to be more tolerant of model imperfections.

Written by

PMTalk Product Manager Community

One of China's top product manager communities, gathering 210,000 product managers, operations specialists, designers and other internet professionals; over 800 leading product experts nationwide are signed authors; hosts more than 70 product and growth events each year; all the product manager knowledge you want is right here.
