Youth AI Technology Salon: Multimodal Learning, AIGC, and Career Guidance

At the REDtech Youth AI Technology Salon in Beijing, leading AI experts and top university students discussed the evolution of multimodal learning, Xiaohongshu’s practical applications, and autonomous‑driving perception. Speakers also offered career guidance, emphasizing solid fundamentals, user value, and opportunities within Xiaohongshu’s talent‑development programs.

Xiaohongshu Tech REDtech

Nov 25, 2022

On November 19, 2022, the ONEPAGE bookstore in Beijing hosted the REDtech Youth Technology Salon organized by Xiaohongshu. The event gathered students from top universities and AI experts to discuss multimodal technology, AIGC, and career development for young talent.

Key speakers included Zhang Debing, head of Xiaohongshu’s Community Multimedia Intelligent Algorithm team; Zhang Zhaoxiang, researcher and doctoral supervisor at the Chinese Academy of Sciences Institute of Automation; Cao Yue, researcher at the Beijing Academy of Artificial Intelligence; Huang Hua, professor at Beijing Normal University’s AI Institute; and Feng Di, Vice President of Technology at Xiaohongshu.

The first session, “The Evolution of Multimodal Learning,” presented by Cao Yue, traced the history of multimodal AI from separate CNN and Transformer architectures to unified models. He highlighted seminal works such as Swin Transformer, BEiT, SimMIM, and VL‑BERT, and discussed the shift toward universal pre‑training that can handle text, image, video, and audio without task‑specific fine‑tuning.

The second session, “Xiaohongshu’s Multimodal Practice,” showcased how the company applies multimodal AI to empower user‑generated content. Zhang Debing described a pipeline that combines visual segmentation, audio processing (ASR, TTS, music generation), and cross‑modal matching to provide smart templates, one‑click video creation, and automated content recommendation. He also mentioned large‑scale Chinese multimodal pre‑training models built on the platform’s massive image‑text corpus.
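As a rough illustration of what cross‑modal matching can look like, the sketch below ranks candidate music tracks against a video by cosine similarity in a shared embedding space. This is a toy under stated assumptions, not Xiaohongshu's pipeline: the embeddings, track names, and the `rank_candidates` helper are all hypothetical, and a real system would obtain the vectors from trained encoders.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_candidates(query_embedding, candidates):
    """Return (name, score) pairs sorted from best to worst match."""
    scored = [
        (name, cosine_similarity(query_embedding, embedding))
        for name, embedding in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings; assumed to live in one shared space.
video_embedding = [0.9, 0.1, 0.3]
music_tracks = {
    "upbeat_pop": [0.8, 0.2, 0.4],
    "calm_piano": [0.1, 0.9, 0.2],
    "ambient": [0.3, 0.4, 0.9],
}

ranking = rank_candidates(video_embedding, music_tracks)
best_track = ranking[0][0]
print(best_track)
```

The same ranking step would apply to any pair of modalities, for example matching a template to a photo, as long as both sides are embedded into the same space.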

The third session explored “Multimodal Perception for Autonomous Driving.” Professor Zhang Zhaoxiang introduced TridentNet (Trident Network) for multi‑scale object detection, point‑level supervision for panoptic segmentation, and a spatial‑sparse transformer architecture that improves detection efficiency.
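One way to picture the multi‑scale idea behind the Trident Network mentioned above is a set of parallel branches that apply the same kernel weights at different dilation rates, so each branch sees a different receptive field. The sketch below is a simplified 1D toy written for illustration only; the function names and the example signal are assumptions, and it is not the speaker's implementation.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid 1D cross-correlation with kernel taps spaced `dilation` apart."""
    span = (len(kernel) - 1) * dilation  # receptive field minus one
    return [
        sum(w * signal[i + k * dilation] for k, w in enumerate(kernel))
        for i in range(len(signal) - span)
    ]


def trident_branches(signal, shared_kernel, dilations=(1, 2, 3)):
    """Run one shared kernel through several branches, one per dilation rate."""
    return {d: dilated_conv1d(signal, shared_kernel, d) for d in dilations}


# Hypothetical input: a linear ramp, filtered by a simple difference kernel.
signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
shared_kernel = [1, 0, -1]

outputs = trident_branches(signal, shared_kernel)
for dilation, response in outputs.items():
    print(dilation, response)
```

Because the kernel is shared, adding a branch adds no new parameters; only the spacing of the taps changes, which is what lets larger dilations respond to larger structures.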

In the final round‑table, Huang Hua and Feng Di offered career advice for young researchers. They emphasized that chasing hot topics is unnecessary; instead, students should focus on user value, solid fundamentals, and bridging theory with real‑world problems. They also outlined Xiaohongshu’s talent‑development programs, including the “Mentor” system, internal technical courses, paper reading clubs, and 2023 campus recruitment.

The salon concluded with a call for AI‑focused graduates to join Xiaohongshu’s fast‑growing team, highlighting the company’s rich multimodal data, high‑performance computing resources, and supportive growth pathways.
