Multimodal Perception and AI Fusion: Highlights from Tsinghua’s 9th Big Data Intelligent Lecture
The 9th Tsinghua Big Data Intelligent Lecture gathered leading scholars and industry experts to showcase cutting‑edge research on multimodal perception, embodied intelligence, spatial AI, large‑model multimodal systems, and industrial time‑series databases, emphasizing their technical depth and real‑world impact.
On April 22, 2026, the National Engineering Research Center for Big Data System Software and the Multi‑Domain Situational Awareness Committee co‑hosted the 9th Tsinghua Big Data Intelligent Lecture at Tsinghua University, featuring senior leaders from academia, research institutes, and industry.
In his opening remarks, Secretary‑General Liu Yuchao highlighted the association’s focus on civilian‑military integration, emphasizing that multimodal perception combined with artificial intelligence is the core proposition for practical intelligent systems.
Five invited speakers presented on the following frontier topics:
Embodied Intelligence: Professor Zhu Xiangwei (Sun Yat‑sen University) demonstrated a bicycle‑level autonomous navigation system built on model‑free deep reinforcement learning, which achieved better terrain adaptability than traditional PID control, and explored neuromorphic in‑memory computing chips for low‑power autonomous driving.
Maritime Situational Awareness: Researcher Liu Hao (China Shipbuilding 709 Institute) described a four‑layer architecture (perception, understanding, prediction, evaluation) that addresses challenges such as multi‑target tracking, moving platforms, strong clutter, and weak observations, and outlined the evolution from data fusion to large‑model applications for ocean security.
Multimodal Large Models: Researcher Yao Cong (Zhipu AI) introduced GLM5V‑Turbo, a native multimodal foundation model that unifies visual, textual, and tool‑calling signals, supports a 200K context length, and enables complex tasks such as web page generation, PPT creation, stock analysis, and multimodal research.
Spatial Intelligence in Multimodal Models: Researcher Yang Lei (SenseTime Research) presented SenseNova‑SI, which leverages scaling effects to endow multimodal models with six spatial capabilities, including metric reasoning, view transformation, and spatial relations, significantly improving performance on 3‑D cognition, viewpoint‑change, and navigation tasks.
Industrial Time‑Series Data Management: CTO Qiao Jialin (Tianmu Technology) described AIoTDB, an industrial‑grade time‑series database offering high‑throughput ingestion, storage, and analytics, already deployed in new‑energy and smart‑manufacturing scenarios to accelerate digital transformation.
Professor Wang Jianmin concluded the session by praising the depth of the presentations and noting that the event coincided with Tsinghua's 115th anniversary, serving both as a high‑level academic exchange and as a platform for industry‑academia‑research collaboration.
The lecture attracted nearly 50 on‑site experts and over 5,000 online viewers, demonstrating strong academic influence and industry reach. Organizers pledged to continue hosting regular talks on AI, big data, multimodal fusion, and intelligent systems to support national strategic goals and the integration of the digital and real economies.
Data Party THU
Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.