Tagged articles
152 articles
Page 1 of 2
Old Zhang's AI Learning
May 11, 2026 · Information Security

Critical CVE-2026-7482 'Bleeding Llama' in Ollama: Why You Must Upgrade Now

Ollama versions before 0.17.1 contain a CVSS 9.1 heap out‑of‑bounds read vulnerability (CVE‑2026‑7482) that lets attackers upload malicious GGUF files, read server memory (including environment variables and API keys), and exfiltrate data. More than 300,000 publicly exposed servers are affected, so immediate upgrade and hardening are essential.

API vulnerability · Bleeding Llama · CVE-2026-7482
0 likes · 5 min read
Black & White Path
May 9, 2026 · Information Security

Ollama ‘Bleeding Llama’ Vulnerability Puts 300K Servers at Risk of Sensitive Data Exposure

A critical CVE‑2026‑7482 flaw in Ollama’s model quantization pipeline, dubbed “Bleeding Llama,” allows unauthenticated attackers to craft GGUF files that read beyond buffer limits, potentially leaking prompts, API keys and other confidential data from over 300,000 internet‑exposed servers, with mitigation requiring an upgrade to version 0.17.1 and stricter network controls.

AI security · Bleeding Llama · CVE-2026-7482
0 likes · 5 min read
Java Web Project
Apr 29, 2026 · Backend Development

Run Claude Code in VS Code for Free with a One‑Time Proxy Setup

This guide shows how to bypass Claude Code's paid Anthropic API by installing a local proxy that forwards requests to free models such as DeepSeek, Ollama, or NVIDIA NIM, covering all required tools, configuration steps, and troubleshooting tips.

Claude Code · DeepSeek · Free AI
0 likes · 10 min read
The Dominant Programmer
Apr 28, 2026 · Backend Development

Spring Boot, LangChain4j & Ollama: Chain for Intent Recognition and Task Dispatch

The article demonstrates how to construct a Spring Boot application that orchestrates multiple AI services using LangChain4j and Ollama, defining intent‑classification and tool‑based assistants, registering them as beans, and routing user requests through a controller to achieve multi‑step intent recognition and task dispatch in a simulated intelligent customer‑service workflow.

AI orchestration · LangChain4j · Ollama
0 likes · 13 min read
The Dominant Programmer
Apr 27, 2026 · Artificial Intelligence

Build and Integrate a Local LLM with Spring Boot, LangChain4j, and Ollama

This guide walks through installing Ollama on Windows, downloading a Qwen2.5‑7B model, configuring Spring Boot with LangChain4j dependencies, setting up application.yml, defining AI service interfaces, adding conversation memory, creating REST and streaming controllers, and testing the end‑to‑end local LLM workflow.

Chatbot · LLM · LangChain4j
0 likes · 12 min read
The Dominant Programmer
Apr 27, 2026 · Artificial Intelligence

Building a Private Document Vector Search with SpringBoot, LangChain4j, and Ollama RAG

This guide walks through why Retrieval‑Augmented Generation (RAG) is needed for large language models, explains the three‑step indexing and query workflow, details LangChain4j’s core components, and provides a complete SpringBoot example—including Maven setup, configuration, service code, and troubleshooting—to create a private document‑vector search system powered by Ollama.

Embedding · LangChain4j · Ollama
0 likes · 13 min read
DevOps Coach
Apr 23, 2026 · Artificial Intelligence

Can Gemma 4 on a MacBook Pro or NVIDIA Blackwell Replace Cloud LLMs? A Hands‑On Performance Study

The author benchmarks Gemma 4 locally on a 24 GB M4 Pro MacBook Pro (llama.cpp) and on a Dell GB10 with an NVIDIA Blackwell GPU (Ollama), comparing token speed, tool‑call reliability, and task completion against cloud GPT‑5.4. The Mac runs faster per token, but the Blackwell system achieves higher first‑pass success with fewer retries, and the jump from Gemma 3 to Gemma 4 dramatically improves agentic coding viability.

Agentic Coding · Gemma 4 · Local-LLM
0 likes · 15 min read
Coder Trainee
Apr 20, 2026 · Artificial Intelligence

How to Install and Configure Ollama Locally for a CRM AI Engine

This guide walks through installing Ollama on Windows 10, downloading a Chinese‑friendly LLM such as Qwen2, configuring a CRM’s application‑dev.yml to point to the local Ollama service, restarting the backend, and handling optional CORS settings, highlighting zero‑cost, privacy, and stability benefits.

AI deployment · CRM integration · Local-LLM
0 likes · 4 min read
IT Services Circle
Apr 19, 2026 · Artificial Intelligence

How to Seamlessly Add AI Coding Assistants to IntelliJ IDEA

This guide walks you through configuring IntelliJ IDEA to use AI coding assistants like Claude, Codex, OpenAI‑compatible APIs, and local models via Ollama, covering plugin installation, provider setup, API key entry, and usage tips with screenshots.

AI Assistant · Claude · Codex
0 likes · 6 min read
James' Growth Diary
Apr 13, 2026 · Frontend Development

Local Inference & Edge AI: Why Front‑End AI Is the Next Battlefield

Edge AI runs AI models directly in browsers or devices, offering zero latency, zero API cost, and full privacy, and the article explains the three technical breakthroughs that make it possible, compares WebLLM, Transformers.js and Ollama, and provides a hybrid architecture with concrete engineering challenges and solutions that can cut total AI costs by 40‑55% for typical front‑end applications.

Ollama · Transformers.js · WebGPU
0 likes · 20 min read
Old Zhang's AI Learning
Apr 12, 2026 · Artificial Intelligence

Deploy the Open‑Source MiniMax‑M2.7 Model Locally: Step‑by‑Step Guide

MiniMax‑M2.7, the newly open‑sourced 230‑billion‑parameter MoE model, offers self‑evolution, professional software engineering and agent capabilities, and can be deployed locally using Ollama, vLLM, SGLang or Docker with 4‑8 H200 GPUs, while the article details hardware needs, performance gains and tool‑calling/Thinking features.

Deployment · GPU · LLM
0 likes · 11 min read
Machine Heart
Apr 10, 2026 · Artificial Intelligence

Run Gemma 4 with OpenClaw in Three Simple Steps – Official Google Guide

This article walks through Google’s official three‑step tutorial for connecting the Gemma 4 language model to OpenClaw using Ollama, details hardware requirements, discusses performance and security considerations, and evaluates the model’s capabilities compared to larger LLMs.

Gemma 4 · Mac Studio · Ollama
0 likes · 5 min read
Old Meng AI Explorer
Apr 2, 2026 · Artificial Intelligence

Slash Your AI Coding Costs: Connect Codex with Chinese Large Models in 10 Minutes

This guide shows how to avoid high OpenAI Codex fees by switching to domestic large language models (DeepSeek, GLM‑4.7, Qwen3.5 and others) through three practical integration methods, providing step‑by‑step commands, configuration files, performance benchmarks and cost‑saving calculations for individual developers and teams.

AI Coding · Codex integration · Cost Optimization
0 likes · 20 min read
Test Development Learning Exchange
Mar 24, 2026 · Artificial Intelligence

Build a Test‑Specific AI Agent to Auto‑Generate Pytest Cases and Analyze Allure Reports

This guide presents an end‑to‑end solution for creating a test‑focused AI agent that indexes project code and defect data, integrates a large language model via LangChain, generates compliant Pytest cases, parses Allure reports, and offers deployment tips for seamless PyCharm integration.

AI Agent · Allure · LangChain
0 likes · 13 min read
Advanced AI Application Practice
Mar 24, 2026 · Artificial Intelligence

Connecting OpenClaw to Ollama: Step‑by‑Step Guide and Common Pitfalls

This article explains why Ollama has become popular for local LLM deployment, outlines its core features, and provides a detailed, step‑by‑step tutorial for integrating OpenClaw with Ollama—including model selection, configuration, troubleshooting common errors, and advanced tips for customization and multi‑model switching.

Local-LLM · Model Deployment · Ollama
0 likes · 9 min read
Code Mala Tang
Feb 20, 2026 · Artificial Intelligence

How to Integrate Claude Code with Ollama for Local and Cloud LLM Workflows

This guide walks you through installing Claude Code and Ollama, pulling and configuring various open‑source models, setting environment variables, and running Claude Code with both local and cloud‑hosted models, while covering context length, performance considerations, and tool‑calling examples.

Claude Code · Environment Variables · LLM integration
0 likes · 14 min read
Old Zhang's AI Learning
Feb 12, 2026 · Artificial Intelligence

Testing the World's Most Powerful Open‑Source LLM: GLM‑5, Local Deployment & Free Ollama Cloud

The article evaluates GLM‑5, the claimed strongest open‑source large language model, comparing its benchmark scores to Claude Opus, Gemini and GPT, detailing its DeepSeek‑inspired architecture, quantized FP8 deployment requirements, and step‑by‑step usage of Ollama’s free cloud model with Agent, data‑analysis and document‑generation features.

AI benchmarking · GLM-5 · Ollama
0 likes · 7 min read
Old Zhang's AI Learning
Feb 3, 2026 · Artificial Intelligence

Why GLM-OCR Leads OCR Benchmarks: 0.9B Model Tops OmniDocBench

GLM-OCR, a 0.9B‑parameter multimodal OCR model from Zhipu, achieves the highest score (94.62) on OmniDocBench V1.5, offers lightweight deployment via vLLM, Ollama, API and SDK, and outperforms larger rivals like DeepSeek‑OCR and PaddleOCR in speed and accuracy.

Deployment · GLM-OCR · OCR
0 likes · 10 min read
SpringMeng
Jan 30, 2026 · Artificial Intelligence

Hands‑On Guide: Build AI Agent Chatbots on Windows with RagFlow

Programmer Xiao Meng walks through a complete Windows setup for AI‑powered customer service agents using RagFlow, covering prerequisites, Docker and Ollama installation, model download, container deployment, configuration of knowledge bases, and testing, based on five real‑world projects.

AI chatbot · Docker · Ollama
0 likes · 7 min read
AI Cyberspace
Jan 29, 2026 · Artificial Intelligence

Step‑by‑Step Guide to Efficient LLM Fine‑Tuning with LoRA, QLoRA, and Llama‑Factory

This tutorial explains the concepts, methods, and practical commands for fine‑tuning large language models using efficient techniques like LoRA and QLoRA, covering model selection, resource considerations, Docker deployment, dataset preparation, training configuration, evaluation metrics, model merging, and deployment with GGUF and Ollama.

GGUF · GPU memory optimization · LLM fine-tuning
0 likes · 27 min read
Old Zhang's AI Learning
Jan 25, 2026 · Artificial Intelligence

Ollama launch: One‑Command Tool Setup and New 5‑Hour Cloud Sessions

The article introduces Ollama's new "ollama launch" command, which lets users configure and start programming tools like Claude Code, OpenCode, Codex, and Droid with a single command, and explains quick‑start steps, recommended local and cloud models, and an extended five‑hour cloud coding session.

AI models · Model Selection · Ollama
0 likes · 6 min read
AI Insight Log
Jan 20, 2026 · Artificial Intelligence

Is GLM-4.7-Flash the New 30B‑Level LLM King? Open‑Source and Ollama‑Ready

GLM‑4.7‑Flash, a 30B‑parameter MoE LLM released as fully open‑source and free, delivers 30B‑class performance across six benchmarks, runs locally with a single Ollama command, and offers a faster cloud‑hosted version with modest token‑based pricing, though hardware costs still apply.

Anthropic API · GLM-4.7-Flash · Mixture of Experts
0 likes · 7 min read
AI Insight Log
Jan 19, 2026 · Artificial Intelligence

Run Claude Code for Free? Ollama Adds Anthropic API Compatibility

Ollama v0.14.0 now supports the Anthropic API, letting you run Claude Code locally with open‑source models like Qwen or Llama without an API key, network connection, or cost, and the article provides a step‑by‑step setup, SDK examples, and an objective assessment of the approach.

Anthropic API · Claude Code · Local-LLM
0 likes · 7 min read
Fun with Large Models
Jan 18, 2026 · Artificial Intelligence

Step‑by‑Step Guide to Deploying Large Language Models Locally with VLLM and Ollama

This article walks through two mainstream local deployment solutions (high‑performance vLLM for production Linux servers and lightweight Ollama for personal Windows machines), covering environment setup, model download, server launch, API testing, key configuration parameters, and the quantization technique that makes Ollama models compact.

GPU Optimization · Large Language Models · Model Quantization
0 likes · 18 min read
Woodpecker Software Testing
Jan 15, 2026 · Artificial Intelligence

Step-by-Step Guide to Building Your First AI Agent: Connecting Alibaba Cloud, OpenAI, Dashscope, DeepSeek, and Ollama

This article provides a detailed, hands‑on tutorial for creating an AI agent, covering registration and API key setup for Alibaba Cloud, OpenAI, Dashscope and DeepSeek, installing and using Ollama for local model deployment, configuring CherryStudio, and implementing function‑calling and MCP techniques with full code examples.

AI Agent · Alibaba Cloud · Dashscope
0 likes · 26 min read
Raymond Ops
Dec 16, 2025 · Artificial Intelligence

Master Multi‑GPU Load Balancing for OLLAMA: From Setup to Production

This guide walks you through configuring Ollama for multi‑GPU load balancing, covering hardware checks, CUDA and Docker setup, native and containerized deployment methods, core parameter tuning, advanced sharding, dynamic monitoring, troubleshooting, production best practices, and a real‑world RTX 4090 case study.

AI inference · CUDA · GPU
0 likes · 15 min read
JakartaEE China Community
Dec 16, 2025 · Artificial Intelligence

Build a Retrieval‑Augmented Generation (RAG) System with Langchain4j and Ollama 3

This guide walks through the importance of Retrieval‑Augmented Generation, outlines the core LangChain4j and Ollama 3 components, and provides a complete Java example (Maven setup, document ingestion, embedding creation, similarity search, prompt construction, and response generation) to demonstrate a functional RAG pipeline.

Embedding · LLM · LangChain4j
0 likes · 9 min read
Code Wrench
Dec 6, 2025 · Artificial Intelligence

Build a Local Go AI Agent with Ollama and DeepSeek – MVP Guide

This article walks you through creating a fully offline, extensible AI programming assistant in Go, using Ollama and DeepSeek‑R1, covering project layout, message formats, function calling, tool integration, a simple WebSocket UI, and future extension ideas.

AI Agent · Go · Local-LLM
0 likes · 10 min read
JakartaEE China Community
Nov 18, 2025 · Artificial Intelligence

How to Build a Retrieval‑Augmented Generation (RAG) System with Langchain4j and Ollama 3

This article explains why Retrieval‑Augmented Generation improves LLM accuracy, outlines the key LangChain4j and Ollama 3 components, and provides a step‑by‑step Java example (Maven setup, document ingestion, embedding, similarity search, prompt creation, and response generation) to demonstrate a functional RAG pipeline.

Embedding · LLM · LangChain4j
0 likes · 8 min read
Raymond Ops
Sep 23, 2025 · Artificial Intelligence

Install Ollama’s Local LLM on Windows and Power It with ShellGPT

This guide walks you through installing the Ollama local large‑language‑model runtime on Windows, deploying a Gemma2 model, then setting up ShellGPT on Linux to interact with the local LLM, covering configuration, basic commands, and advanced usage examples.

AI Assistant · Linux · Local-LLM
0 likes · 6 min read
Code Wrench
Sep 22, 2025 · Artificial Intelligence

Build a Private ChatGPT on Your Laptop with Ollama, DeepSeek‑R1 and Go MCP

This guide walks you through installing Ollama, pulling the open‑source DeepSeek‑R1:1.5B model, wrapping it with a Go‑based Model Context Protocol (MCP) server, creating a client example, and enhancing the experience with Open‑WebUI while offering performance‑tuning tips.

DeepSeek · Go · Local AI
0 likes · 9 min read
Dunmao Tech Hub
Sep 1, 2025 · Artificial Intelligence

Deploy DeepSeek‑r1 Locally with a One‑Click Ollama Script

This guide walks you through a Bash script that automatically checks for Ollama, installs it if missing, lets you choose a DeepSeek‑r1 model size, starts the Ollama service, and runs the selected model locally, complete with usage examples and a token‑cost note.

DeepSeek · Model Deployment · Ollama
0 likes · 7 min read
Raymond Ops
Aug 26, 2025 · Artificial Intelligence

How to Deploy DeepSeek R1 Locally: Versions, Hardware, and UI Tools

This guide explains DeepSeek R1’s model variants, hardware requirements, local installation steps using Ollama, LM Studio or Docker, and how to add visual interfaces like Open‑WebUI and Dify for a complete on‑premise AI solution.

DeepSeek · Dify · Hardware Requirements
0 likes · 14 min read
Mingyi World Elasticsearch
Aug 4, 2025 · Artificial Intelligence

Building Enterprise‑Grade Semantic Search with Ollama—No External APIs Required

This article walks through the complete design and implementation of a locally deployed, enterprise‑level semantic search system using Ollama for embedding generation and Easysearch for vector retrieval, covering problem analysis, architecture decisions, pipeline configuration, bulk indexing, and hybrid query execution.

Easysearch · Ollama · local deployment
0 likes · 12 min read
Eric Tech Circle
Aug 3, 2025 · Artificial Intelligence

How to Deploy Qwen3‑Coder Locally and Boost Front‑End Development

This article explains the key improvements of Qwen3‑Coder, walks through two local deployment methods (LM Studio and Ollama), showcases front‑end coding examples, compares performance and hardware requirements, and offers practical recommendations for developers seeking an on‑premise AI coding assistant.

AI code generation · LM Studio · Ollama
0 likes · 7 min read
Code Mala Tang
Jul 22, 2025 · Artificial Intelligence

Convert Any PDF to Clean Markdown with a Local LLM (Gemma 3)

Learn how to transform any PDF—including scanned documents—into well‑structured Markdown using a local LLM (Gemma 3 via Ollama), Python, PyMuPDF and Pillow, without cloud APIs or API keys, by converting pages to images, prompting the model, and saving the output.

Gemma · LLM · Ollama
0 likes · 12 min read
21CTO
Jul 22, 2025 · Artificial Intelligence

Run Powerful LLMs Locally on <8GB RAM: Top 10 Small Models & Tools

This article explains how advanced quantization and model optimization enable running strong large language models on laptops or desktops with less than 8 GB of RAM or VRAM, outlines key technical concepts, recommends local inference tools, and lists ten compact LLMs with usage commands.

LLM tools · Local-LLM · Ollama
0 likes · 10 min read
MaGe Linux Operations
Jul 21, 2025 · Artificial Intelligence

Master Multi‑GPU Load Balancing for OLLAMA: From Zero to Production

This guide walks you through configuring Ollama for multi‑GPU load balancing, covering hardware checks, CUDA setup, native and Docker deployment methods, detailed parameter tuning, advanced sharding strategies, troubleshooting, performance optimization, and production‑grade monitoring to maximize throughput and stability of large language models.

AI deployment · CUDA · Ollama
0 likes · 16 min read
Java Architecture Diary
Jun 5, 2025 · Artificial Intelligence

Unlock AI Reasoning: How Ollama’s New ‘Thinking’ Feature Works

Version 0.9.0 of Ollama introduces a ‘thinking’ control that lets users view and manage the AI model’s reasoning process, with detailed CLI commands, REST API usage, model support list, scripting options, and advanced Modelfile configurations for models like DeepSeek R1 and Qwen 3.

AI reasoning · CLI · DeepSeek
0 likes · 6 min read
Java Architecture Diary
May 19, 2025 · Artificial Intelligence

How Ollama 0.7 Unlocks Local Multimodal AI with One Command

Ollama 0.7 introduces a fully re‑engineered core that brings seamless multimodal model support, lists top visual models, showcases OCR and image analysis capabilities, explains technical breakthroughs, and provides a quick three‑step guide to deploy powerful local AI vision.

AI Engineering · AI models · Ollama
0 likes · 7 min read
Java Architecture Diary
Apr 29, 2025 · Artificial Intelligence

Why Qwen3 Is the New Powerhouse in Open‑Source AI Models

Qwen3 introduces a suite of open‑source models—from a 235B expert model to compact 0.6B versions—offering competitive performance against top proprietary models, multilingual support, flexible thinking modes, and low deployment requirements, with detailed usage instructions via Ollama and OpenRouter.

Ollama · Qwen3 · large language model
0 likes · 8 min read
Open Source Linux
Apr 14, 2025 · Artificial Intelligence

How to Deploy DeepSeek Locally: Step‑by‑Step Guide for Offline AI

This guide compares DeepSeek’s local and online versions, outlines hardware and privacy advantages of offline deployment, and provides a detailed step‑by‑step tutorial—including Ollama installation, model selection, command execution, and UI plugin setup—to help users run DeepSeek on their own machines.

AI model · DeepSeek · Ollama
0 likes · 6 min read
Ops Development & AI Practice
Apr 6, 2025 · Industry Insights

How VS Code’s New Copilot Agent and Custom LLM Support Redefine AI‑Assisted Development

The VS Code v1.99 update introduces a Copilot Agent mode that deepens project‑level understanding and adds custom LLM integration—including OpenAI, Azure, Gemini, Anthropic, OpenRouter, and locally‑run Ollama—offering developers greater flexibility, cost control, privacy, and strategic advantages in the evolving AI‑IDE landscape.

AI IDE · AI trends · Custom LLM
0 likes · 8 min read
Ops Development & AI Practice
Apr 6, 2025 · Artificial Intelligence

Mastering Ollama Modelfile: Build and Customize Your Own LLM

This guide explains how to retrieve, analyze, and modify an Ollama Modelfile—using commands like `ollama show --modelfile`, dissecting key directives such as FROM, TEMPLATE, LICENSE, PARAMETER, SYSTEM, and ADAPTER—and walks through step‑by‑step creation of a custom model.

AI model · LLM customization · LoRA
0 likes · 9 min read
Qborfy AI
Mar 27, 2025 · Artificial Intelligence

How to Deploy DeepSeek‑R1 Locally with Ollama and Dify: A Step‑by‑Step Guide

This article walks through the entire process of deploying the DeepSeek‑R1 large language model on a personal machine, covering hardware requirements, Ollama installation, model download, service startup, remote access configuration, and visual UI integration with Dify, complete with concrete commands and screenshots.

DeepSeek · Docker · LLM deployment
0 likes · 9 min read
Alibaba Cloud Native
Mar 27, 2025 · Cloud Native

Deploy the QwQ‑32B LLM on Alibaba Cloud Function Compute with CAP in Minutes

This guide walks you through deploying the open‑source QwQ‑32B model on Alibaba Cloud Function Compute using the Cloud Application Platform (CAP), covering architecture, required services, account setup, step‑by‑step deployment, cost considerations, model interaction via Open WebUI and Chatbox, scaling configuration, and resource cleanup.

CAP · Function Compute · Ollama
0 likes · 8 min read
AI Algorithm Path
Mar 24, 2025 · Artificial Intelligence

How to Use Pydantic for Structured LLM Output

The article explains why LLM responses can be inconsistent, introduces Pydantic as a way to define custom output schemas, and walks through concrete examples—both with OpenAI and Ollama models—showing how to build a LangChain pipeline that parses responses into structured data.

LLM · LangChain · Ollama
0 likes · 7 min read
MaGe Linux Operations
Mar 21, 2025 · Artificial Intelligence

Step‑by‑Step Guide to Install Ollama and ShellGPT for Local LLM Use

This tutorial walks you through installing Ollama on Windows, configuring and running a local large language model, then setting up ShellGPT on Linux to communicate with Ollama, including configuration files, command examples, and REPL usage.

AI Assistant · Local-LLM · Ollama
0 likes · 6 min read
Open Source Tech Hub
Mar 13, 2025 · Artificial Intelligence

Build a Private AI Knowledge Base with Webman AI, Redis‑Stack, and Ollama

This guide walks you through setting up a private AI knowledge base using Webman AI 5.4.0, deploying Redis‑Stack, installing the illuminate/redis component, adding Ollama with DeepSeek and other embedding models, configuring Redis, importing training data, running the training process, and configuring role prompts for accurate AI responses.

DeepSeek · Ollama · Private Knowledge Base
0 likes · 6 min read
Programmer DD
Mar 6, 2025 · Artificial Intelligence

Discover QwQ-32B: A 32B LLM Matching 671B DeepSeek‑R1 Performance

The QwQ-32B model, released by Alibaba Cloud, delivers DeepSeek‑R1‑level results with only 32 billion parameters, offers integrated agent capabilities, is open‑source under Apache 2.0, and can be quickly deployed locally via Ollama or integrated into Java applications using Spring AI.

AI inference · Model Deployment · Ollama
0 likes · 4 min read
Tencent Technical Engineering
Mar 5, 2025 · Information Security

Detecting Critical AI Infrastructure Vulnerabilities with AI-Infra-Guard

As open‑source large language model tools like Ollama, OpenWebUI and ComfyUI gain popularity, numerous security flaws such as unauthenticated APIs, CVE exploits, model theft and remote code execution have emerged, prompting the development of AI‑Infra‑Guard, a lightweight, cross‑platform scanner that identifies over 30 component vulnerabilities and offers both web UI and CLI modes for rapid risk assessment.

AI security · AI-Infra-Guard · CVE
0 likes · 13 min read
JD Tech Talk
Mar 4, 2025 · Artificial Intelligence

Building a Local Personal Knowledge Base with Ollama, DeepSeek‑R1, AnythingLLM and Integrating Continue into VSCode

This guide walks through setting up a local personal knowledge base using Ollama, DeepSeek‑R1, and AnythingLLM, and demonstrates how to integrate the Continue AI code assistant into VSCode, covering installation, configuration, and usage tips for efficient, secure development.

AI integration · AnythingLLM · DeepSeek
0 likes · 10 min read
Efficient Ops
Feb 25, 2025 · Artificial Intelligence

How to Deploy DeepSeek R1 Locally: A Step‑by‑Step Guide for AI Enthusiasts

This guide explains what DeepSeek R1 is, compares its full and distilled versions, details hardware requirements for Linux, Windows, and macOS, and provides step‑by‑step instructions for local deployment using Ollama, LM Studio, Docker, and visual interfaces like Open‑WebUI and Dify.

AI model · DeepSeek · Dify
0 likes · 9 min read
Rare Earth Juejin Tech Community
Feb 22, 2025 · Artificial Intelligence

Deploying DeepSeek Locally with Ollama, Building Personal and Organizational Knowledge Bases, and Integrating with Spring AI

This guide explains how to locally deploy the DeepSeek large‑language model using Ollama on Windows, macOS, and Linux, configure model storage and CORS, build personal and enterprise RAG knowledge bases with AnythingLLM and Open WebUI, and integrate the model into a Spring AI application via Docker and Docker‑Compose.

DeepSeek · Docker · Knowledge Base
0 likes · 16 min read
Data Thinking Notes
Feb 20, 2025 · Artificial Intelligence

How to Deploy DeepSeek R1 671B Model Locally with Ollama: A Step‑by‑Step Guide

This article provides a comprehensive tutorial on locally deploying the 671‑billion‑parameter DeepSeek R1 model using Ollama, covering model selection, hardware requirements, dynamic quantization, detailed installation steps, performance observations, and practical recommendations for consumer‑grade hardware.

AI model optimization · DeepSeek · Dynamic Quantization
0 likes · 14 min read
Top Architect
Feb 20, 2025 · Artificial Intelligence

Deploying DeepSeek R1 671B Model Locally with Ollama and Dynamic Quantization

This guide explains how to download, quantize, and run the full‑size 671‑billion‑parameter DeepSeek R1 model on local hardware using Ollama, covering model selection, hardware requirements, step‑by‑step deployment commands, optional web UI setup, performance observations, and practical recommendations.

DeepSeek · Dynamic Quantization · Ollama
0 likes · 16 min read