How OASIS Achieves State‑of‑the‑Art Code Search with Just 5M Tokens
Fast.ai's Kwaipilot team unveiled OASIS, a 1.3B‑parameter code‑embedding model that, using only 5 million tokens, outperforms larger OpenAI embeddings across CodeSearchNet, CoSQA and AdvTest benchmarks, thanks to repository‑level program analysis, synthetic data generation, and a fused loss function.