
How Alibaba’s ‘City Brain’ Powered Cutting‑Edge AI Research at ACM MM 2017

Three Alibaba iDST papers on the AI‑driven City Brain were selected for ACM Multimedia 2017, showcasing novel video anomaly detection, deep siamese re‑identification, and stylized image generation methods that improve urban traffic management and demonstrate the broader potential of Alibaba’s ET Brain platform.

Alibaba Cloud Developer

Oct 31, 2017


ACM Multimedia 2017 Highlights

Three papers from Alibaba’s Institute of Data Science and Technologies (iDST) on the “City Brain” were selected for ACM MM 2017, and the authors were invited to present at the conference.

About ACM MM and the Selected Papers

ACM MM is a top‑tier international multimedia conference, rated Class A by the China Computer Federation (CCF). Only 7.5% of submissions are invited for oral presentation, and the selection was described as “highly competitive”. The three Alibaba papers address traffic‑accident analysis (video anomaly detection), crowd‑trajectory modeling (person re‑identification), and traffic‑data sampling (image generation), all drawn from real‑world City Brain deployments.

Alibaba’s City Brain Platform

Since 2016 Alibaba has operated the AI‑driven “ET City Brain”, an intelligent hub that uses city‑wide camera networks to collect data, perform real‑time video analytics, and automatically allocate public resources. The system has been deployed in Hangzhou, Suzhou and other cities.

In Hangzhou, after one year of testing, the City Brain controlled the signal lights at 128 intersections, cutting average travel time by 15.3% and saving 4.6 minutes of travel time on elevated roads. It also generated more than 500 traffic‑event alerts per day with 92% accuracy, and in Xiaoshan district it halved ambulance arrival time.

Conference Presentations

iDST researchers Shen Chen, Zhao Yiru, and others presented two papers:

Spatio‑Temporal AutoEncoder for Video Anomaly Detection – proposes a spatio‑temporal auto‑encoder trained with a weight‑decreasing prediction loss, in which errors on frames further in the future count less; evaluation on real traffic video shows that it outperforms previous methods.

Deep Siamese Network with Multi‑level Similarity Perception for Person Re‑identification – combines Siamese and classification networks with multi‑level similarity, achieving state‑of‑the‑art accuracy on large‑scale re‑identification datasets.

AI Technologies Behind City Brain

Distinguished Engineer and iDST Vice‑Dean Hua Xiansheng explained that City Brain integrates visual cognition, optimization decision‑making, visual search, prediction, and large‑scale real‑time video processing. The platform also served as the basis for the LSVC (Large‑Scale Video Classification) challenge, where Alibaba’s team achieved an average accuracy of 87.41 % and won the competition.

Broader “ET Brain” Applications

Beyond the City Brain, Alibaba’s “ET Brain” powers “Industrial Brain”, “Medical Brain”, “Environmental Brain” and other domains, illustrating the company’s ongoing commitment to advancing AI research and real‑world deployment.

Selected Papers and Abstracts

Spatio‑Temporal AutoEncoder for Video Anomaly Detection
Authors: Zhao Yiru, Deng Bing, Shen Chen, Liu Yao, Lu Hongtao, Hua Xiansheng
Abstract: Provides a method for monitoring traffic anomalies in the City Brain. Inspired by recent advances in action recognition, the paper designs a spatio‑temporal auto‑encoder for video anomaly detection and introduces a weight‑decreasing prediction loss. Experiments on real traffic scenes show that the method outperforms the previous best results on key metrics.
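To make the weighting idea concrete, here is a minimal sketch of a prediction error whose weights shrink for frames further in the future, plus a simple anomaly score derived from per-clip errors. It is not the authors' implementation: the linear weight schedule, the flattened-frame representation, the min-max "regularity score", and every function name below are assumptions chosen for illustration.

```python
from typing import List, Sequence

Frame = Sequence[float]  # assumption: a frame flattened into a list of pixel values


def frame_mse(a: Frame, b: Frame) -> float:
    """Mean squared error between two equally sized flattened frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def decreasing_weights(horizon: int) -> List[float]:
    """Linearly decreasing weights T, T-1, ..., 1, normalised to sum to 1."""
    raw = [float(horizon - t) for t in range(horizon)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_prediction_error(predicted: Sequence[Frame],
                              actual: Sequence[Frame]) -> float:
    """Weighted sum of per-frame errors; near-future frames count the most."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("need two non-empty sequences of equal length")
    weights = decreasing_weights(len(predicted))
    return sum(w * frame_mse(p, a)
               for w, p, a in zip(weights, predicted, actual))


def regularity_scores(errors: Sequence[float]) -> List[float]:
    """Min-max normalise errors and invert them: a low score suggests an anomaly."""
    lo, hi = min(errors), max(errors)
    if hi == lo:
        return [1.0 for _ in errors]
    return [1.0 - (e - lo) / (hi - lo) for e in errors]
```

With a two-frame horizon, a mistake on the first predicted frame is penalised twice as heavily as the same mistake on the second, which reflects the intuition that far-future frames are inherently harder to predict.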

Deep Siamese Network with Multi‑level Similarity Perception for Person Re‑identification
Authors: Shen Chen, Jin Zhongming, Zhao Yiru, Fu Zhihang, Jiang Rongxin, Chen Yaowu, Hua Xiansheng
Abstract: Supplies technical support for recognizing crowd trajectories. By combining the strengths of Siamese and classification networks and extending similarity to multiple levels, it achieves the highest reported retrieval precision on large‑scale public re‑identification datasets.
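One common way to combine a Siamese (verification) objective with a classification (identification) objective is to add the two losses, and to average the verification term over several feature levels. The sketch below illustrates that pattern only. The contrastive loss, the margin, the equal averaging across levels, and all names are assumptions, not details confirmed by the paper.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a: Vector, b: Vector, same_identity: bool,
                     margin: float = 1.0) -> float:
    """Pull same-identity pairs together; push others at least `margin` apart."""
    d = euclidean(a, b)
    if same_identity:
        return d ** 2
    return max(0.0, margin - d) ** 2


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Softmax cross-entropy for one sample, computed stably via log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def multi_level_verification(levels_a: Sequence[Vector],
                             levels_b: Sequence[Vector],
                             same_identity: bool,
                             margin: float = 1.0) -> float:
    """Average the pairwise loss over features taken at several network depths."""
    if not levels_a or len(levels_a) != len(levels_b):
        raise ValueError("need the same non-zero number of levels per image")
    losses = [contrastive_loss(a, b, same_identity, margin)
              for a, b in zip(levels_a, levels_b)]
    return sum(losses) / len(losses)


def joint_loss(levels_a: Sequence[Vector], levels_b: Sequence[Vector],
               logits_a: Sequence[float], logits_b: Sequence[float],
               label_a: int, label_b: int,
               verification_weight: float = 1.0) -> float:
    """Identification loss for both images plus a weighted verification loss."""
    identification = (cross_entropy(logits_a, label_a)
                      + cross_entropy(logits_b, label_b))
    verification = multi_level_verification(levels_a, levels_b,
                                            label_a == label_b)
    return identification + verification_weight * verification
```

The classification term teaches the network which identity each image belongs to, while the pairwise term shapes the embedding space directly, which is what retrieval ultimately relies on.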

Stylized Adversarial Autoencoder for Image Generation
Authors: Zhao Yiru, Deng Bing, Huang Jianqiang, Lu Hongtao, Hua Xiansheng
Abstract: Addresses the shortage of traffic video samples for the City Brain. Inspired by conditional GANs and style‑transfer learning, it extracts content and style features from separate images, fuses them, and generates images that combine the desired content and style.
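The extract, fuse, and generate pipeline can be illustrated with a deliberately simple stand-in. The toy below is not the paper's learned encoders, decoder, or adversarial training. It swaps in statistics matching (in the spirit of adaptive instance normalization): "content" is the normalised pixel pattern of one image, "style" is the mean and spread of another, and fusion rescales the former with the latter. All names and the flattened-image representation are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def mean_std(values: Sequence[float], eps: float = 1e-8) -> Tuple[float, float]:
    """Mean and (eps-stabilised) standard deviation of a flattened image."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var + eps)


def encode_content(image: Sequence[float]) -> List[float]:
    """Toy content code: the pixel pattern with its own statistics removed."""
    mean, std = mean_std(image)
    return [(v - mean) / std for v in image]


def encode_style(image: Sequence[float]) -> Tuple[float, float]:
    """Toy style code: just the global mean and spread of the style image."""
    return mean_std(image)


def fuse_and_decode(content_code: Sequence[float],
                    style_code: Tuple[float, float]) -> List[float]:
    """Re-apply the style statistics to the normalised content pattern."""
    style_mean, style_std = style_code
    return [c * style_std + style_mean for c in content_code]


if __name__ == "__main__":
    content_image = [0.0, 1.0, 2.0, 3.0]     # supplies the spatial pattern
    style_image = [10.0, 10.0, 30.0, 30.0]   # supplies brightness and contrast
    generated = fuse_and_decode(encode_content(content_image),
                                encode_style(style_image))
    print(generated)
```

The generated image keeps the ordering of the content pixels while taking on the brightness and contrast of the style image. A learned model replaces both hand-written encoders with networks and judges realism with a discriminator.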



