Can Open‑Source LLMs Overtake Google and OpenAI in the AI Arms Race?

An analysis of a leaked Google internal document reveals how open‑source large language models, low‑cost fine‑tuning techniques like LoRA, and rapid community innovation are reshaping the AI competition, challenging the dominance of both Google and OpenAI and prompting a strategic rethink.

Programmer DD

May 6, 2023

We Have No Moat, and Neither Does OpenAI

The AI large‑model battle sparked by ChatGPT has been raging for months, with the industry closely watching the rivalry between OpenAI and Google.

Google laid the groundwork with the 2017 Transformer and dazzled the field with LaMDA in 2021, so many expected it to retain the throne. Instead, OpenAI surged ahead, leaving Google playing catch‑up.

Leaked Google Internal Insight

A recently leaked internal Google document argues that neither Google nor OpenAI is positioned to win this race: while the two were squabbling, a third faction, open source, has been quietly eating their lunch.

Open‑Source Momentum

In early March, the community obtained Meta's LLaMA model, which, despite lacking instruction tuning or RLHF, was quickly recognized for its potential. Within weeks, numerous variants emerged, adding instruction tuning, quantization, quality improvements, multimodality, and RLHF.

These open‑source models demonstrated that high‑quality LLMs can be built with modest resources: with about $100 and 13B parameters, the community is achieving results that Google struggles to match with $10 million and 540B parameters.

Problems the open‑source community has already solved or sidestepped include:

Running LLMs on phones (e.g., Pixel 6 at 5 tokens/second).

Scalable personal AI fine‑tuned on a laptop overnight.

Responsible release, which is sidestepped rather than solved: models circulate without usage restrictions.

Multimodal capabilities with fast training times (e.g., ScienceQA SOTA in 1 hour).

Why This Was Predictable

The resurgence of open‑source LLMs mirrors the earlier boom in open‑source image generation (the "Stable Diffusion moment"). Low‑rank adaptation (LoRA) dramatically reduced fine‑tuning costs, enabling rapid experimentation on consumer‑grade hardware.

LoRA represents each model update as a low‑rank factorization, which shrinks the update matrices by a factor of up to several thousand. That makes personalization cheap and fast, yet the technique remains underused even inside Google.
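To make the size reduction concrete, here is a minimal sketch of a low‑rank update in plain Python. The layer width (4096), the rank (8), and the function names are assumptions chosen for illustration, not figures from the leaked document. The frozen weight matrix W stays untouched and only two thin matrices, B and A, are trained.

```python
def lora_param_counts(d_out, d_in, rank):
    """Compare trainable parameters: full update vs. low-rank factors."""
    full = d_out * d_in                # a dense update has the same shape as W
    low_rank = rank * (d_out + d_in)   # B is d_out x rank, A is rank x d_in
    return full, low_rank


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, x, alpha=1.0):
    """Compute W x + (alpha / rank) * B (A x) without modifying W."""
    rank = len(A)
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * d for b, d in zip(base, delta)]


# Hypothetical layer size and rank, for illustration only.
full, low_rank = lora_param_counts(4096, 4096, 8)
reduction = full / low_rank
print(f"full update: {full} params, LoRA update: {low_rank} params")
print(f"reduction factor: {reduction}")
```

The reduction factor grows with layer width and shrinks as the rank increases, so the savings depend heavily on the configuration chosen.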

Training From Scratch Is Costly

Training massive models from scratch discards pre‑trained knowledge and iterative improvements, making it prohibitively expensive compared to the swift, community‑driven advances in open‑source.

Investing in full retraining should be reserved for cases where architectural changes truly prevent weight reuse; otherwise, incremental refinement preserves prior capabilities.

If We Iterate Small Models Faster, Large Models Lose Their Edge

A LoRA update costs about $100 to produce, and training typically takes less than a day, so almost anyone with an idea can create and distribute one. The cumulative effect of many cheap fine‑tunings quickly overcomes the initial size disadvantage, making large‑scale models less competitive.
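One way to picture how cheap updates accumulate is to treat each fine‑tune as a low‑rank delta added to shared base weights. The sketch below is a simplification under stated assumptions: the function names, the toy 2x2 matrices, and merging by plain summation are hypothetical choices for illustration, not a description of how any particular project combines its fine‑tunes.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def merge_adapters(W, adapters):
    """Return a copy of W plus the sum of B @ A over all (B, A) pairs."""
    merged = [row[:] for row in W]          # leave the base weights untouched
    for B, A in adapters:
        delta = matmul(B, A)                # rank-limited update, same shape as W
        for i, delta_row in enumerate(delta):
            for j, value in enumerate(delta_row):
                merged[i][j] += value
    return merged


# Toy base weights and two hypothetical rank-1 adapters (B is 2x1, A is 1x2).
base = [[1.0, 0.0], [0.0, 1.0]]
instruction_tune = ([[1.0], [0.0]], [[0.0, 2.0]])
domain_tune = ([[0.0], [1.0]], [[3.0, 0.0]])

merged = merge_adapters(base, [instruction_tune, domain_tune])
print(merged)
```

Each adapter stores only its thin factors, so distributing a new one is far cheaper than shipping a full copy of the weights. Whether independently trained updates combine well in practice is an empirical question this toy example does not answer.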

Data Quality Beats Data Quantity

Training on small, highly curated datasets yields strong results, which suggests that data scaling laws have some flexibility. Many of these high‑quality datasets are built with synthetic methods and are openly available, reducing reliance on massive proprietary data.

Direct Competition with Open‑Source Is a Losing Proposition

If a free, high‑quality alternative exists, users are unlikely to pay for Google’s products. Open‑source innovation proceeds faster because it isn’t constrained by licensing or corporate control.

We Need Them More Than They Need Us

Keeping technology secret is a tenuous strategy: talent moves between companies and takes its knowledge along, and external research outpaces internal efforts. Embracing open‑source collaboration can mitigate this talent drain.

Personal vs. Corporate Licensing

Much of this innovation builds on leaked model weights. Individuals are far less constrained by licenses than corporations are, which lets them experiment freely and accelerates adoption and improvement.

Having an Ecosystem: Let Open‑Source Serve Us

Meta benefits from the free labor of the open‑source community, integrating innovations into its products. Google can similarly leverage open‑source ecosystems to maintain thought leadership.

Conclusion: OpenAI’s Position

Calls for Google to open up can feel unfair while OpenAI keeps its own models closed. But the advantage of proprietary models diminishes as researchers move on and take their knowledge with them, and OpenAI faces the same pressure from open source that Google does. Unless OpenAI changes its stance, open‑source alternatives are poised to surpass it.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.

Written by

Programmer DD

A tinkering programmer and author of "Spring Cloud Microservices in Action"
