How to Install and Run LLaMA‑3 Locally with Ollama and Open‑WebUI

This guide explains how to set up the open‑source LLaMA‑3 model using Ollama, pull the 8B model, configure Open‑WebUI in Docker, and interact with the model locally, including Chinese response handling and memory considerations.


LLaMA‑3 (Large Language Model Meta AI 3) is Meta's open‑source generative AI model, available in 8B, 70B, and an upcoming 400B version, aiming for multimodal and multilingual capabilities comparable to GPT‑4.

Installing Ollama

Ollama is an open‑source LLM service tool that lets you run large language models locally. It simplifies deployment via Docker containers and provides a command‑line interface, REST API, and support for custom Modelfiles.

Official download page: https://ollama.com/download

Ollama runs on macOS, Linux, and Windows, and also offers Docker images for easy installation.
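
As a minimal sketch (assuming a Linux host; macOS and Windows use the installer from the download page), Ollama can be set up with the documented one-line install script and then verified:

# Install Ollama via the official install script (Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Confirm the binary is installed and the service is reachable
ollama --version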

Model Management

Downloading the Model

ollama pull llama3:8b
The tag llama3:8b denotes the 8‑billion‑parameter model; other available tags, such as llama3:70b, are listed in the Ollama model library.
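
Once the pull finishes, the local model list confirms the download; larger tags can be fetched the same way if your hardware allows (a brief sketch):

# List models that have been pulled locally
ollama list

# Optionally pull the 70B variant (requires substantially more memory)
ollama pull llama3:70b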

Testing the Model

To receive Chinese responses, first input: 你好!请中文回复 ("Hello! Please reply in Chinese").
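
The quickest test is an interactive CLI session; Ollama also serves a local REST API on port 11434. A minimal sketch:

# Start an interactive chat session in the terminal
ollama run llama3:8b
# then type: 你好!请中文回复

# Or send the same prompt through the REST API
curl http://localhost:11434/api/generate -d '{
  "model": "llama3:8b",
  "prompt": "你好!请中文回复",
  "stream": false
}'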

Configuring Open‑WebUI

Running on CPU with Docker

docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
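
Before opening the UI, it is worth confirming the container started cleanly; the names below match the docker run command above:

# Check that the container is running
docker ps --filter name=open-webui

# Follow the startup logs until the web server reports it is listening
docker logs -f open-webui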

Accessing the Interface

Open a browser at http://127.0.0.1:3000. The first visit requires account registration; after logging in, you can switch the UI language to Chinese.
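
If the page does not load, a quick probe from the shell (nothing Open‑WebUI‑specific, just an HTTP check) can confirm the port mapping:

# Expect an HTTP response if the container is serving on port 3000
curl -I http://127.0.0.1:3000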

Downloading the Model in Open‑WebUI

In the Open‑WebUI model management page, enter the tag below and start the pull:

llama3:8b

After the download completes, the model is ready for use.

Using the Model

Select the downloaded model in Open‑WebUI and start a chat. Remember to prepend 你好!请中文回复 if you need Chinese answers.
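
Outside the UI, the same model can also be queried through Ollama's chat endpoint; a minimal sketch using the Chinese-reply prompt:

# Send a single chat turn to the local Ollama service
curl http://localhost:11434/api/chat -d '{
  "model": "llama3:8b",
  "messages": [
    {"role": "user", "content": "你好!请中文回复"}
  ],
  "stream": false
}'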

Memory Considerations

On CPU, the 8B model is a roughly 4.7 GB download in its default 4‑bit quantization and can run on typical consumer‑grade hardware; around 8 GB of free RAM is a reasonable baseline.
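
To check actual usage on your own machine, Ollama reports loaded models directly; run this while a chat is active:

# Show loaded models, their memory footprint, and CPU/GPU placement
ollama ps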


Tags: Docker, Ollama, Open WebUI, local LLM deployment, Llama 3