Baobao Algorithm Notes
Mar 5, 2025 · Artificial Intelligence
Why My 0.5B LLM’s Reasoning Collapsed During RLHF on Logic Puzzles
The author experiments with reinforcement learning from human feedback on a 0.5B Qwen instruct model using Logic‑RL and Open‑R1, discovers that reward mis‑design and curriculum learning cause the model to produce overly short or incorrect reasoning chains on knights‑and‑knaves puzzles, and analyses the underlying causes.
Artificial Intelligence · Large Language Model · Logic Reasoning
