Tagged articles

RTX 4090

7 articles · Page 1 of 1

May 13, 2026 · Artificial Intelligence

Super‑Charging MiniCPM‑V 4.6 on One RTX 4090: 1B‑Parameter Multimodal Model Sets New Efficiency Bar

MiniCPM‑V 4.6, a 1.3B‑parameter multimodal LLM, outperforms rivals such as Qwen3.5‑0.8B and Gemma 4 on both accuracy and speed, thanks to early ViT token compression and 4×/16× visual token reduction. It delivers sub‑100 ms latency and throughput above 2.6k tokens/s on a single RTX 4090, and also runs offline on mobile devices.

MiniCPM-V · RTX 4090 · Token Compression

0 likes · 16 min read

Old Zhang's AI Learning

Apr 22, 2026 · Artificial Intelligence

Testing NVIDIA‑Accelerated Qwen3.6‑35B on Dual RTX 4090: Real‑World Performance

This article evaluates the Red Hat‑produced NVFP4‑quantized Qwen3.6‑35B model deployed with vLLM inside Docker on a dual‑RTX 4090 server, presenting accuracy gains, memory usage, initialization times, GPU compatibility notes, and practical deployment recommendations.

Docker · NVFP4 · Quantization

0 likes · 8 min read

Old Zhang's AI Learning

Mar 18, 2026 · Artificial Intelligence

Running Claude‑Opus‑4.6‑Distilled Qwen3.5 27B on a Single RTX 4090 with llama.cpp: 46 tokens/s Performance

The article details a hands‑on test of the Claude‑Opus‑4.6‑distilled Qwen3.5 27B model running on a single RTX 4090 via llama.cpp, reporting a steady generation speed of 46 tokens per second with a 64K context window. It gives a step‑by‑step Docker‑based setup, compares the model with GLM‑4.7‑Flash‑AWQ‑4bit, and discusses llama.cpp's limitations for multi‑GPU inference.

Claude Opus · Docker · LLM Inference

0 likes · 5 min read

Old Zhang's AI Learning

Mar 2, 2026 · Artificial Intelligence

Why the Qwen3.5 Series Makes Qwen3.5-27B the No‑Brainer Choice

The author reviews the Qwen3.5 model family, showing that the 27‑billion‑parameter dense Qwen3.5-27B offers the best balance of size, stability, low‑cost local deployment, and comprehensive capabilities, making it the default pick for most users.

AI benchmarking · Large Language Model · Quantization

0 likes · 6 min read

Design Hub

Dec 27, 2025 · Artificial Intelligence

Speed vs. Quality: Z-Image + Nunchaku Boosts Portrait Generation by 300%

Testing shows that adding the open‑source Nunchaku accelerator to the Z‑Image portrait model triples generation speed on an RTX 4090, at the cost of noticeably weaker facial detail and overall aesthetics. The article also walks through installation, model download, and workflow integration.

AI image generation · ComfyUI · Nunchaku

0 likes · 6 min read

Architects' Tech Alliance

Aug 26, 2024 · Game Development

Comprehensive GPU Benchmark of Black Myth: Wukong Across Multiple Configurations and Resolutions

An extensive GPU benchmark of the Unreal Engine‑powered game Black Myth: Wukong evaluates performance across 41 graphics cards, multiple resolutions, and ray‑tracing settings, revealing which GPUs deliver playable frame rates with and without DLSS/FSR/TSR and frame‑generation technologies.

Black Myth · DLSS · FSR

0 likes · 18 min read

IT Services Circle

Oct 29, 2022 · Fundamentals

Multiple RTX 4090 GPU Fires Reported Within Two Weeks of Launch: Causes and Mitigation

Within two weeks of the RTX 4090 launch, five user-reported cases of GPU self‑ignition were documented, all involving the 16‑pin power connector, prompting analysis of cable quality, connector design, load conditions, and recommendations for using ATX 3.0 power supplies with native 16‑pin cables.

12VHPWR · ATX 3.0 · GPU fire

0 likes · 6 min read

