Tagged articles

MiniGPT-4

1 article · Page 1 of 1

Sep 26, 2023 · Artificial Intelligence

MiniGPT-4: Enhancing Vision‑Language Understanding with Large Language Models

This article presents MiniGPT-4, a multimodal system that combines a frozen visual encoder (Q‑Former + ViT) with an open‑source large language model (Vicuna). It covers the system's motivation, training pipeline, demo capabilities, and observed limitations, and closes with a brief Q&A session.

AI research · Image Captioning · MiniGPT-4

15 min read
