local LLM deployment — 7 Technical Articles

Apr 10, 2026 · Artificial Intelligence

Run Gemma 4 with OpenClaw in Three Simple Steps – Official Google Guide

This article walks through Google’s official three‑step tutorial for connecting the Gemma 4 language model to OpenClaw using Ollama, details hardware requirements, discusses performance and security considerations, and evaluates the model’s capabilities compared to larger LLMs.

Gemma 4 · Mac Studio · Ollama

0 likes · 5 min read

Old Zhang's AI Learning

Mar 25, 2026 · Artificial Intelligence

Claude‑Opus‑4.6 Distilled Qwen3.5 v2: Faster Reasoning with Same Code Accuracy

The new Claude‑Opus‑4.6 distilled Qwen3.5‑v2 keeps code‑generation accuracy while cutting reasoning length by 24% and boosting per‑token correctness by 31.6%, offering a noticeable speed and cost advantage for local LLM deployment despite a 7.2% drop on MMLU‑Pro.

Claude Opus · Distillation · Qwen3.5

0 likes · 7 min read

Old Zhang's AI Learning

Mar 16, 2026 · Artificial Intelligence

Testing Claude‑Opus‑4.6 Distilled Qwen3.5 9B Model Locally via LM Studio and Claude Code

The article evaluates the GGUF‑quantized Claude‑Opus‑4.6 distilled Qwen3.5 9B model on a 16 GB Mac Mini M4 using LM Studio, covering model sizes, performance metrics, deployment steps, and API integration with Claude Code. It concludes that the 9B version is usable but remains limited compared to larger models.

Claude Opus · GGUF · LM Studio

0 likes · 12 min read

AI Engineering

Mar 11, 2026 · Artificial Intelligence

Run Claude Code Locally with Qwen 3.5 to Skip Anthropic API Costs

This guide shows how to replace Anthropic's API by running a local Qwen 3.5 model with llama.cpp and pointing Claude Code at it via ANTHROPIC_BASE_URL. It covers hardware checks, build steps, model download, server launch, speed‑fix tips, and usage instructions for secure, cost‑free development.

Anthropic API · Claude Code · GPU acceleration

0 likes · 8 min read

Old Zhang's AI Learning

Mar 4, 2026 · Artificial Intelligence

Unlock the Full Power of LM Studio for Local LLM Deployment

This article explores LM Studio’s evolution into a complete local AI development platform, detailing version 0.4’s architectural overhaul, headless daemon, parallel request handling, stateful REST API, and UI refresh, along with a suite of hidden developer features such as OpenAI‑compatible and Anthropic‑compatible APIs, CLI tools, native SDKs, and the LM Link remote‑model solution.

Anthropic API · CLI · LM Link

0 likes · 12 min read

Ops Development Stories

Jul 14, 2025 · Artificial Intelligence

Mastering AIOps: Prompt Engineering, Function Calling, RAG, Graph RAG, and Local LLM Deployment

This comprehensive guide explores AIOps techniques such as prompt engineering, chat completions, memory management, function calling, fine‑tuning, retrieval‑augmented generation (RAG), graph‑based RAG, and practical steps for deploying open‑source large language models locally, providing code examples and best‑practice recommendations for modern DevOps environments.

Function Calling · Graph RAG · RAG

0 likes · 47 min read

Open Source Tech Hub

Apr 25, 2024 · Backend Development

How to Install and Run LLaMA‑3 Locally with Ollama and Open‑WebUI

This guide explains how to set up the open‑source LLaMA‑3 model using Ollama, pull the 8B model, configure Open‑WebUI in Docker, and interact with the model locally, including Chinese response handling and memory considerations.

Docker · Llama 3 · Ollama

0 likes · 4 min read
