Baidu Geek Talk
Baidu Geek Talk
Dec 25, 2024 · Industry Insights

How to Build a Multimodal Web Page Model for the LLM Era

This article examines the unique multimodal and multi‑granular nature of web pages, compares fusion strategies, proposes a cross‑modal attention approach, outlines fine‑ and coarse‑grained pre‑training tasks, and explores low‑cost adaptor methods for adapting large multimodal models to web‑page modeling in the LLM era.

AIHTMLLLM adaptation
0 likes · 10 min read
How to Build a Multimodal Web Page Model for the LLM Era