BestHub
Baobao Algorithm Notes
Mar 24, 2022 · Artificial Intelligence

Exploring WuDaoMM: A 650M Chinese‑English Multimodal Dataset for Pre‑training

The article introduces WuDaoMM and WuDaoCorpora 2.0, large Chinese‑English multimodal datasets that include 650 million image‑text pairs, 3 TB of text, 93 TB of images, and 181 GB of dialogue. It describes their composition, formats, access options, and potential research applications.

Chinese AI · Pre‑training · WuDaoMM
6 min read