Tagged articles

local LLM

37 articles · Page 1 of 1
Machine Heart
Jun 24, 2026 · Artificial Intelligence

Running Large Language Models Locally Is Now Surprisingly Easy

The article explains how recent advances in LLM performance now allow developers to run sophisticated AI models locally on a 2022 M2 Mac using tools like LM Studio, Pi agent, and Docker, detailing model choices, setup steps, performance observations, and remaining limitations.

Docker · Gemma 4 · LM Studio
0 likes · 10 min read
AI Engineer Programming
Jun 17, 2026 · Artificial Intelligence

Local LLMs Viable: Sparse Attention, MoE, KV Compression, Multi‑Token Prediction

In early 2026, open‑source local large language models become practical alternatives thanks to sparse attention, MoE routing, latent KV compression, multi‑token prediction, and 4‑bit quantization, while hardware memory shortages and benchmark gaps with closed‑source models shape their deployment choices.

4-bit quantization · KV compression · Mixture of Experts
0 likes · 13 min read
Java Companion
Jun 7, 2026 · Artificial Intelligence

Why Odysseus Gained 50,000 Stars in 5 Days: Inside the Open‑Source AI Workbench

The article reviews the open‑source AI workbench Odysseus, explaining its self‑hosted ChatGPT‑like UI, modular features such as Cookbook, Agent and Deep Research, deployment steps with Docker, hardware constraints, community reactions, and why it attracted over 50 K GitHub stars in just five days.

AI workstation · Docker deployment · Model Management
0 likes · 12 min read
Old Zhang's AI Learning
Jun 2, 2026 · Artificial Intelligence

Turn Local LLMs into Actionable Agents – Unsloth Opens the MCP Path

Unsloth now lets locally‑run large language models act as real agents by exposing a Model Context Protocol (MCP) interface through a no‑code Studio UI or a llama.cpp + mcp‑cli command line, supporting tool calling, file access, web search, and multi‑model connections with detailed setup steps, hardware guidance, and security cautions.

AI agents · MCP · Model Context Protocol
0 likes · 17 min read
AI Engineering
May 24, 2026 · Artificial Intelligence

Build a Local AI Agent from Scratch: A Deep‑Dive, Non‑Fast‑Food Tutorial

This tutorial walks you through the open‑source “AI Agents From Scratch” project, teaching how to build a fully local AI agent without any pre‑made framework by covering core modules, 14 step‑by‑step examples, advanced reasoning architectures, and minimal system requirements.

AI Agent · Chain-of-Thought · Prompt engineering
0 likes · 6 min read
DevOps Coach
Apr 23, 2026 · Artificial Intelligence

Can Gemma 4 on a MacBook Pro or NVIDIA Blackwell Replace Cloud LLMs? A Hands‑On Performance Study

The author benchmarks Gemma 4 locally on a 24 GB M4 Pro MacBook Pro (llama.cpp) and on a Dell GB10 with an NVIDIA Blackwell GPU (Ollama), comparing token speed, tool‑call reliability, and task completion against cloud GPT‑5.4. The Mac generates tokens faster, but the Blackwell system achieves higher first‑pass success with fewer retries, and the jump from Gemma 3 to Gemma 4 dramatically improves agentic coding viability.

Benchmark · Gemma 4 · MacBook Pro
0 likes · 15 min read
Coder Trainee
Apr 20, 2026 · Artificial Intelligence

How to Install and Configure Ollama Locally for a CRM AI Engine

This guide walks through installing Ollama on Windows 10, downloading a Chinese‑friendly LLM such as Qwen2, configuring a CRM’s application‑dev.yml to point to the local Ollama service, restarting the backend, and handling optional CORS settings, highlighting zero‑cost, privacy, and stability benefits.

AI Deployment · CRM integration · Ollama
0 likes · 4 min read
TonyBai
Apr 18, 2026 · Industry Insights

Why Ollama Fell From Open‑Source Hero to Community Villain

The article revisits Ollama’s rise as a user‑friendly local LLM runner, then details the community backlash over its omission of llama.cpp credit, the introduction of a private model format, performance regressions, and a VC‑driven commercialization pattern, while presenting open‑source alternatives.

Ollama · Open-source · VC trap
0 likes · 9 min read
Advanced AI Application Practice
Mar 24, 2026 · Artificial Intelligence

Connecting OpenClaw to Ollama: Step‑by‑Step Guide and Common Pitfalls

This article explains why Ollama has become popular for local LLM deployment, outlines its core features, and provides a detailed, step‑by‑step tutorial for integrating OpenClaw with Ollama—including model selection, configuration, troubleshooting common errors, and advanced tips for customization and multi‑model switching.

AI · Model Deployment · Ollama
0 likes · 9 min read
Old Zhang's AI Learning
Mar 22, 2026 · Artificial Intelligence

Hands‑On Review: Unsloth Studio’s One‑Stop Local LLM Console (Windows‑Ready)

The author tests Unsloth Studio, a local web UI that unifies model download, execution, dataset handling, training, fine‑tuning and export, supporting GGUF and safetensors formats across Windows, macOS and Linux, and highlights its integrated tool‑calling, data‑recipe workflow, observability features, installation quirks, and target user scenarios.

GGUF · Model Training · Safetensors
0 likes · 9 min read
Old Zhang's AI Learning
Mar 19, 2026 · Artificial Intelligence

Testing the Hot oMLX on Mac: Claude‑Opus‑4.6 Distilled and Qwen3.5‑9B Performance Review

The article evaluates oMLX, a Mac‑only LLM runtime built on Apple Silicon and MLX, by walking through installation, UI features, memory usage, single‑request speed, benchmark results for Claude‑Opus‑4.6 and Qwen3.5‑9B, continuous batch processing gains, Claude Code optimizations, multi‑model support, and the failure to run a 27B model.

Apple Silicon · Benchmark · Claude Opus
0 likes · 9 min read
Ubuntu
Jan 25, 2026 · Artificial Intelligence

Unlock Productivity: Create a Full‑Featured AI Coding Workflow on Ubuntu with CC Switch and Ollama

This step‑by‑step guide shows how to install Ollama on Ubuntu, download DeepSeek‑Coder‑V2 or Qwen2.5‑Coder models, set up Claude Code, Codex, and Gemini CLI clients, configure the open‑source CC Switch proxy to route their requests to the local Ollama engine, and run a test prompt that generates Python code without any external API keys.

AI coding · CC Switch · Claude Code
0 likes · 8 min read
PaperAgent
Jan 24, 2026 · Artificial Intelligence

How a Local 8B LLM Beats Closed‑Source Giants in Deep Research

AgentCPM-Report is a locally deployable, privacy‑preserving AI agent that matches or exceeds the performance of top closed‑source large‑model systems on deep‑research benchmarks, offering end‑to‑end report generation without uploading any confidential data to the cloud.

AI Agent · Benchmark · Deep Research
0 likes · 8 min read
Ubuntu
Jan 24, 2026 · Artificial Intelligence

Unlock Full‑Stack AI Coding on Ubuntu with Ollama and CC Switch

This step‑by‑step guide shows how to replace cloud‑based AI coding tools with a private, zero‑cost workflow on Ubuntu by installing Ollama, configuring systemd, adding DeepSeek or Qwen2.5 models, installing Claude, Codex and Gemini CLIs, and routing them through CC Switch.

AI coding · CC Switch · Claude Code
0 likes · 7 min read
Ubuntu
Jan 23, 2026 · Artificial Intelligence

Deploy DeepSeek Locally on Ubuntu: Build Your Private AI Assistant

This guide explains why you might run a large language model locally (privacy, zero latency, and no token costs), then covers hardware requirements, installing Ollama, pulling the appropriate DeepSeek‑R1 model, testing it with a coding prompt, and optionally adding a web UI via Docker.

AI assistant · DeepSeek · Ollama
0 likes · 6 min read
AI Engineering
Jan 20, 2026 · Artificial Intelligence

How mcpx Cuts Token Overhead in MCP Tool Calls for Local LLMs

The article explains how mcpx reduces MCP tool definition tokens from tens of thousands to a few hundred by discovering tools at execution time, improving accuracy and speed for local large language models while preserving prompt cache integrity.

Anthropic · MCP · Token optimization
0 likes · 6 min read
AI Insight Log
Jan 19, 2026 · Artificial Intelligence

Run Claude Code for Free? Ollama Adds Anthropic API Compatibility

Ollama v0.14.0 now supports the Anthropic API, letting you run Claude Code locally with open‑source models like Qwen or Llama without an API key, network, or cost, and the article provides a step‑by‑step setup, SDK examples, and an objective assessment of the approach.

Anthropic API · Claude Code · Ollama
0 likes · 7 min read
Ubuntu
Jan 12, 2026 · Artificial Intelligence

How to Deploy a Privacy‑First AI Agent Workflow on Ubuntu (No Cloud Needed)

The article explains why running AI locally on Ubuntu offers data security, zero token costs, offline capability, and millisecond response times, then provides a step‑by‑step guide to install Ollama via Snap, pull the DeepSeek Coder 6.7B model, optimize GPU drivers and memory, integrate with VS Code, and monitor resource usage in real time.

DeepSeek Coder · GPU Optimization · Ollama
0 likes · 5 min read
Code Wrench
Dec 6, 2025 · Artificial Intelligence

Build a Local Go AI Agent with Ollama and DeepSeek – MVP Guide

This article walks you through creating a fully offline, extensible AI programming assistant in Go, using Ollama and DeepSeek‑R1, covering project layout, message formats, function calling, tool integration, a simple WebSocket UI, and future extension ideas.

AI Agent · Go · Ollama
0 likes · 10 min read
Raymond Ops
Sep 23, 2025 · Artificial Intelligence

Install Ollama’s Local LLM on Windows and Power It with ShellGPT

This guide walks you through installing the Ollama local large‑language‑model runtime on Windows, deploying a Gemma2 model, then setting up ShellGPT on Linux to interact with the local LLM, covering configuration, basic commands, and advanced usage examples.

AI assistant · Linux · Ollama
0 likes · 6 min read
21CTO
Jul 22, 2025 · Artificial Intelligence

Run Powerful LLMs Locally on <8GB RAM: Top 10 Small Models & Tools

This article explains how advanced quantization and model optimization enable running strong large language models on laptops or desktops with less than 8 GB of RAM or VRAM, outlines key technical concepts, recommends local inference tools, and lists ten compact LLMs with usage commands.

AI · LLM tools · Ollama
0 likes · 10 min read
JavaEdge
Apr 26, 2025 · Artificial Intelligence

Turn LM Studio into a Local OpenAI‑Compatible API Server

This guide shows how to select a model in LM Studio, expose a local port, start the HTTP server, and interact with it via curl commands, covering quick model listing, chat requests, and the difference between streaming and full‑response modes.

AI · API · LM Studio
0 likes · 5 min read
MaGe Linux Operations
Mar 21, 2025 · Artificial Intelligence

Step‑by‑Step Guide to Install Ollama and ShellGPT for Local LLM Use

This tutorial walks you through installing Ollama on Windows, configuring and running a local large language model, then setting up ShellGPT on Linux to communicate with Ollama, including configuration files, command examples, and REPL usage, while omitting unrelated promotional content.

AI assistant · Ollama · ShellGPT
0 likes · 6 min read
Liangxu Linux
Feb 16, 2025 · Artificial Intelligence

Build a Free Private AI with DeepSeek, Ollama, and Local Knowledge Base

This guide explains how to locally deploy the open‑source DeepSeek model using Ollama, enhance interaction with Chatbox and Page Assist, and connect a local knowledge base via AnythingLLM's RAG architecture, providing step‑by‑step instructions, hardware requirements, and API examples for a self‑hosted AI system.

AI Deployment · AnythingLLM · DeepSeek
0 likes · 22 min read
21CTO
Feb 13, 2025 · Artificial Intelligence

JetBrains AI Assistant Now Supports Local LLMs and the Latest OpenAI Models

JetBrains has updated its AI Assistant to run local large language models for enhanced privacy, added support for Anthropic's Claude 3.5 series and OpenAI's o1, o1‑mini, and o3‑mini models, and highlighted faster, cost‑effective inference for coding and scientific tasks.

AI assistant · JetBrains · OpenAI Models
0 likes · 3 min read
JD Tech Talk
Feb 7, 2025 · Artificial Intelligence

Building a Local AI Assistant with DeepSeek and Chatbox Using Ollama

This step‑by‑step tutorial shows beginners how to install Ollama, deploy the DeepSeek large language model locally, and configure the Chatbox AI client to create a functional AI assistant on Windows, macOS, Linux, or mobile devices within ten minutes.

AI assistant · Chatbox · DeepSeek
0 likes · 5 min read
21CTO
May 31, 2023 · Artificial Intelligence

How to Build a Private, Offline GPT with Python – Step‑by‑Step Guide

This tutorial explains how to set up PrivateGPT, a Python‑based offline LLM solution that runs locally without sending any data to the cloud, covering environment preparation, model download, repository cloning, data ingestion, and interactive querying.

Offline AI · PrivateGPT · Python
0 likes · 5 min read