Baidu Geek Talk
Dec 25, 2024 · Industry Insights
How to Build a Multimodal Web Page Model for the LLM Era
This article examines the multimodal and multi‑granular nature of web pages, compares fusion strategies, proposes a cross‑modal attention approach, outlines fine‑ and coarse‑grained pre‑training tasks, and explores low‑cost adapter methods for applying large multimodal models to web‑page modeling in the LLM era.
AI · HTML · LLM adaptation
10 min read
