Tagged articles
303 articles
Youku Technology
Apr 2, 2019 · Artificial Intelligence

How Youku Uses Multimodal AI for Video Understanding, Search, and Recommendation

Youku’s Algorithm Center has built a multimodal AI pipeline that jointly processes visual, audio, and textual signals to improve video search, recommendation, and digital asset management. The pipeline goes beyond the limits of keyword matching, improves relevance, and eases cold‑start problems, while still facing challenges in modality fusion, cost, and interpretability.

Multimodal AI · Recommendation Systems · content understanding
0 likes · 15 min read
JD Tech
Aug 14, 2018 · Artificial Intelligence

GCN‑LSTM Image Captioning Model by JD AI Research Institute

JD AI Research Institute presented a GCN‑LSTM encoder‑decoder system that uses graph convolutional networks to model semantic and spatial relationships between objects, improving image captioning on the COCO benchmark and achieving state‑of‑the‑art results.

COCO dataset · Image Captioning · LSTM
0 likes · 7 min read
Alibaba Cloud Developer
Dec 7, 2017 · Artificial Intelligence

How Alibaba’s AI Powers Voice Ticketing and Facial Recognition in Shanghai Metro

Alibaba’s AI‑driven solutions let Shanghai Metro passengers buy tickets simply by speaking, while the system recognizes faces at turnstiles and analyzes crowd flow in real time. The deployment showcases multimodal voice‑vision interaction, far‑field speech recognition in noisy stations, and advanced computer‑vision techniques.

Multimodal AI · Smart Transit · facial recognition
0 likes · 10 min read