Tagged articles

Qwen2.5

10 articles · Page 1 of 1
Machine Heart
Apr 27, 2026 · Artificial Intelligence

ACL 2026: Unveiling a Predictive Scaling Law for Reinforcement Learning Fine‑Tuning of Large Models

The paper presents a systematic empirical study that derives a power‑law scaling formula for reinforcement‑learning post‑training of large language models. It reports accurate inter‑ and intra‑model performance prediction, saturating learning efficiency, benefits from data reuse, and validity across architectures.

Data Reuse · Large Language Models · Llama 3
11 min read
Ubuntu
Jan 24, 2026 · Artificial Intelligence

Unlock Full‑Stack AI Coding on Ubuntu with Ollama and CC Switch

This step‑by‑step guide shows how to replace cloud‑based AI coding tools with a private, zero‑cost workflow on Ubuntu: install Ollama, configure it under systemd, add DeepSeek or Qwen2.5 models, install the Claude Code, Codex, and Gemini CLIs, and route them through CC Switch.

AI coding · CC Switch · Claude Code
7 min read
Fun with Large Models
Jun 12, 2025 · Artificial Intelligence

Implement GRPO to Give LLMs Reasoning Ability with Qwen2.5‑0.5B

This article explains the GRPO reinforcement‑learning algorithm and its core idea of having sampled responses compete within a group, with no separate evaluator model. It then provides a complete, step‑by‑step code walkthrough (environment setup, dataset preparation, reward‑function design, training configuration, and evaluation) using the Qwen2.5‑0.5B‑Instruct model on the GSM8K math dataset.

GRPO · GSM8K · Qwen2.5
23 min read
Software Engineering 3.0 Era
Feb 6, 2025 · Artificial Intelligence

Training an Inference Model Rivaling OpenAI o1 and DeepSeek R1 for Under $50 in 26 Minutes

Researchers from Stanford University and the University of Washington trained the s1 reasoning model in just 26 minutes using under $50 of cloud credits. They report performance comparable to OpenAI's o1 and DeepSeek's R1, achieved by building a curated 1,000‑sample dataset and applying a budget‑enforced test‑time scaling algorithm.

AI benchmarking · Qwen2.5 · budget enforcement
7 min read
Alibaba Cloud Native
Dec 26, 2024 · Cloud Computing

Deploy Qwen2.5 LLM on Alibaba Cloud Function Compute: A Step‑by‑Step Guide

This guide explains how to deploy the Qwen2.5 large language model on Alibaba Cloud Function Compute using Ollama and Open WebUI, covering model selection, resource configuration, deployment steps, interface setup, multilingual capabilities, and automatic scaling for high‑concurrency workloads.

AI model deployment · Cloud Computing · Function Compute
10 min read
NewBeeNLP
Dec 23, 2024 · Artificial Intelligence

What’s New in Qwen2.5? A Deep Dive into the Latest LLM Advances

The Qwen2.5 Technical Report introduces a new series of large language models with up to 72B parameters, pre‑trained on a corpus expanded to 18 trillion tokens and refined with advanced supervised fine‑tuning and reinforcement‑learning pipelines. It reports strong performance across comprehension, reasoning, coding, and long‑context tasks.

LLM · Large Language Model · Qwen2.5
5 min read