GPT‑5.2 vs Gemini 3 Pro: Coding Tests, NeurIPS 2025 Paper Insights, and RAG Refactor
The article evaluates GPT‑5.2 and Gemini 3 Pro on real‑world coding tasks, analyzes trends across the nearly 6,000 papers presented at NeurIPS 2025, and demonstrates how to extract the tree‑building component of the open‑source RAPTOR RAG system and refactor it into an independent module.
Introduction
Working in the Cursor IDE, the author compares the newly released GPT‑5.2 (run with a large context window and high reasoning effort) against Google's Gemini 3 Pro on several practical coding scenarios, then turns to a data‑driven analysis of NeurIPS 2025 papers, and finally walks through a concrete RAG code‑refactoring workflow.
1. Real‑World Coding Tests ("Fireworks" Demo)
Both models were prompted to generate a fireworks animation. GPT‑5.2 produced visually impressive results but missed the final single‑spark effect, while Gemini 3 Pro generated a longer animated GIF that more closely followed the instruction.
2. NeurIPS 2025 Paper Analysis
NeurIPS 2025 featured nearly 6,000 papers. Using Cursor, the author extracted abstracts, authors, and categories, then aggregated keyword frequencies to identify hot topics. The resulting report, titled “NeurIPS 2025 Technical Analysis: From Generation to Inference, the Rise of Intelligent Agents,” highlights dominant themes such as large‑scale multimodal models and agent‑centric architectures.
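The keyword aggregation step can be sketched roughly as follows. This is a minimal illustration, not the author's actual script: the sample abstracts, the stopword list, and the function name document_frequencies are all assumptions made for the example. It counts each keyword once per abstract, so a single paper that repeats a term does not dominate the ranking.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the extracted NeurIPS 2025 abstracts.
SAMPLE_ABSTRACTS = [
    "We propose a multimodal agent for reasoning over images.",
    "An agent framework for tool use and reasoning.",
    "Diffusion models for multimodal generation.",
]

# Illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"we", "a", "an", "the", "for", "over", "and", "of", "propose"}


def document_frequencies(abstracts):
    """Return a Counter mapping keyword -> number of abstracts containing it."""
    counts = Counter()
    for text in abstracts:
        # Lowercase words, keeping internal hyphens (e.g. "agent-centric").
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
        # A set ensures each keyword is counted once per abstract.
        counts.update({t for t in tokens if t not in STOPWORDS})
    return counts


if __name__ == "__main__":
    freqs = document_frequencies(SAMPLE_ABSTRACTS)
    for keyword, n in freqs.most_common(5):
        print(f"{keyword}: {n}")
```

Counting per abstract rather than per occurrence is one design choice among several; raw term counts or TF-IDF weighting would be reasonable alternatives depending on what "hot topic" is meant to capture.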
3. RAG Code Refactor (RAPTOR Tree Builder)
The open‑source RAPTOR project provides a hierarchical RAG index with tree construction and retrieval logic. The author needed a standalone file containing only the tree‑building code, without external imports. By extracting the relevant classes (Node, Tree) and consolidating them into raptor_tree_builder.py, the core functionality was preserved while simplifying integration.
# Example placeholder for the extracted tree-builder logic
class Node:
    def __init__(self, value):
        self.value = value
        self.children = []

class Tree:
    def __init__(self, root):
        self.root = root
    # ... additional methods ...

The refactored file was tested with both PowerShell and python -c commands; the original python -c execution failed, but PowerShell succeeded after the long command was split into shorter segments.
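To make the idea of a standalone tree builder more concrete, here is a minimal sketch of bottom-up tree construction using only the standard library. It is not RAPTOR's implementation: the grouping here is by fixed-size neighbouring chunks rather than by clustering, and summarize_stub is a hypothetical placeholder that joins strings where a real system would call a summarization model. The names build_tree, summarize_stub, and depth are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    value: str
    children: List["Node"] = field(default_factory=list)


def summarize_stub(texts: List[str]) -> str:
    """Hypothetical stand-in for a summarizer: joins the child texts."""
    return " | ".join(texts)


def build_tree(
    chunks: List[str],
    group_size: int = 2,
    summarize: Callable[[List[str]], str] = summarize_stub,
) -> Node:
    """Build a tree bottom-up by repeatedly grouping adjacent nodes."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if group_size < 2:
        raise ValueError("group_size must be at least 2")

    # Leaf level: one node per text chunk.
    level = [Node(value=c) for c in chunks]

    # Each pass shrinks the level until a single root remains.
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), group_size):
            group = level[i:i + group_size]
            parent = Node(
                value=summarize([n.value for n in group]),
                children=group,
            )
            next_level.append(parent)
        level = next_level
    return level[0]


def depth(node: Node) -> int:
    """Number of levels from this node down to its deepest leaf."""
    return 1 + max((depth(c) for c in node.children), default=0)


if __name__ == "__main__":
    root = build_tree(["a", "b", "c", "d", "e"])
    print(root.value, depth(root))
```

Passing the summarizer in as a parameter keeps the file free of external imports, which matches the goal of a single self-contained module; a caller can supply a model-backed function later without touching the tree logic.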
Conclusion
GPT‑5.2 shows strong generation capabilities but still struggles with precise instruction following in some coding tasks. Gemini 3 Pro offers more reliable visual output. The NeurIPS 2025 analysis reveals a shift toward agent‑based multimodal systems, and the RAPTOR refactor demonstrates a practical method for isolating RAG components for easier reuse.
