Run Llama 3 Locally on PC/Mac: Ollama, LM Studio & GPT4All Guide

This guide walks you through three practical methods—using Ollama, LM Studio, and GPT4All—to install and run the open‑source Llama 3 model locally on Windows, macOS, or Ubuntu, including command‑line usage, Python integration, and prompt‑engineering techniques for formatted outputs.


Introduction: Thanks to open-source tools, Meta's Llama 3 can now be run locally on a PC, Mac, or Linux machine.

1. Using Ollama

Supported platforms: macOS, Ubuntu, Windows (preview).

Download Ollama from https://ollama.com/.

Run the model with ollama run llama3 (this downloads the 8B instruction-tuned model by default). Use tags to select other variants, e.g., ollama run llama3:70b-instruct, as shown below.
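
A few common commands, assuming a default Ollama install (tags follow the naming in Ollama's model library):

ollama pull llama3:70b-instruct   # download the 70B instruct variant without starting a chat
ollama list                       # list models already downloaded
ollama run llama3 "Explain quantization in one sentence."   # one-off prompt from the shell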

Example Python script to call the local API (Ollama listens on http://127.0.0.1:11434 by default; install the client library with pip install requests):

# llm_chat.py
import requests

# Send a single chat request to the local Ollama server.
res = requests.post(
    "http://127.0.0.1:11434/api/chat",
    json={
        "model": "llama3",   # must match a model already pulled, e.g. via `ollama run llama3`
        "stream": False,     # return one complete JSON object instead of a token stream
        "messages": [
            {"role": "user", "content": "Who are you?"}
        ],
    },
)
print(res.json())

To obtain structured output, use prompt engineering to ask the model to respond in YAML:

# llm_chat.py
import requests

# Same request, but the prompt pins the reply to a YAML schema.
res = requests.post(
    "http://127.0.0.1:11434/api/chat",
    json={
        "model": "llama3",   # use the Llama 3 model pulled earlier
        "stream": False,
        "messages": [
            {
                "role": "user",
                "content": """Who are you? Respond in YAML format:
name: string
language: string""",
            },
        ],
    },
)
print(res.json())

The model returns something like:

name: "AI Assistant"
language: "English"
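
Because the reply arrives as plain text inside the JSON response (under message.content in Ollama's API), it still needs parsing. A minimal sketch using the PyYAML package (pip install pyyaml); the field names simply mirror the schema requested above:

# parse_reply.py
import yaml

# In practice this string would come from res.json()["message"]["content"].
reply = 'name: "AI Assistant"\nlanguage: "English"'

data = yaml.safe_load(reply)   # parse the YAML text into a Python dict
print(data["name"], data["language"])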

2. Using LM Studio

Supported platforms: macOS, Ubuntu, Windows.

Features: built on llama.cpp, it supports a range of models from Hugging Face, such as ggml-format Llama, MPT, and StarCoder.

Download LM Studio from https://lmstudio.ai/ and install according to system requirements.

LM Studio includes a built-in chat interface for easy interaction, and it can also expose the loaded model over a local OpenAI-compatible server, as sketched below.
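
A minimal sketch of calling that server with requests, assuming a Llama 3 GGUF model is loaded and the local server has been started from within LM Studio (it defaults to http://localhost:1234/v1); the model identifier here is a placeholder that depends on the file you loaded:

# lmstudio_chat.py
import requests

# LM Studio's local server speaks the OpenAI chat-completions format.
res = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "local-model",   # placeholder; replace with the identifier shown in LM Studio
        "messages": [
            {"role": "user", "content": "Who are you?"}
        ],
        "temperature": 0.7,
    },
)
print(res.json()["choices"][0]["message"]["content"])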

3. Using GPT4All

Supported platforms: macOS, Ubuntu, Windows.

GPT4All provides a generic setup for running many open-source LLMs, though it may require more DIY configuration and familiarity with programming environments; its Python binding, sketched below, is one such route.
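
A minimal sketch using GPT4All's official Python binding (pip install gpt4all); the model filename here is an assumed example and should be checked against the catalog shown in the GPT4All app:

# gpt4all_chat.py
from gpt4all import GPT4All

# Downloads the model on first use; the filename is an assumed catalog entry.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

with model.chat_session():   # keeps multi-turn context for the duration of the block
    print(model.generate("Who are you?", max_tokens=128))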

Conclusion: Each method offers a way to run Llama 3 locally: Ollama suits command-line and scripting workflows, LM Studio provides a polished GUI with a built-in chat interface, and GPT4All favors a more DIY setup, catering to different levels of technical expertise.

Tags: Python · prompt engineering · Local Deployment · Ollama · Llama 3 · LM Studio · GPT4All
Written by

21CTO

21CTO (21CTO.com) offers developers community, training, and services, making it your go‑to learning and service platform.
