Why Databricks’ Open‑Source DBRX LLM Is Outpacing GPT‑3.5 and Llama 2

Databricks has unveiled DBRX, an open‑source large language model whose mixture‑of‑experts (MoE) architecture delivers faster, more cost‑effective inference. According to the company, it beats leading open‑source and proprietary models such as Llama 2, Mixtral‑8x7B, and GPT‑3.5 on multiple benchmarks.

21CTO
Introduction: Databricks, the data‑lake and analytics platform, has launched an open‑source foundation large language model. It hopes enterprises will adopt its tools to keep up with the large‑model wave, which in effect puts Databricks in the business of selling shovels during a gold rush.

On March 27, data and AI company Databricks announced that its Mosaic Research team had released DBRX, an open‑source general‑purpose LLM.

Databricks co‑founder and CEO Ali Ghodsi said the company is excited about DBRX for three main reasons: it beats other open‑source models on standard industry benchmarks, it outperforms GPT‑3.5 on most of those tests, and its mixture‑of‑experts (MoE) architecture makes it very fast in tokens generated per second and cost‑effective to serve.

Customers can access DBRX via API, pre‑train their own DBRX‑style models from scratch, or continue training from selected checkpoints using the same tools and techniques.

The company, built around Apache Spark, published benchmarks claiming DBRX surpasses open‑source competitors in language understanding, programming, mathematics, and logic.

The development team also claims DBRX beats OpenAI’s proprietary GPT‑3.5 on the same metrics.

DBRX was developed by Databricks’ Mosaic AI team, which grew out of MosaicML, the startup Databricks acquired for $1.3 billion. The model was trained on Nvidia DGX Cloud.

Databricks says DBRX’s efficiency comes from its MoE architecture, which activates only a subset of the model’s 132 billion parameters (about 36 billion for any given input). Because less of the model runs per token, it generates tokens faster and costs less to serve.
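
To make that arithmetic concrete, here is a minimal Python sketch of top-k expert routing, the general idea behind a mixture-of-experts layer. It is a toy under stated assumptions, not Databricks' implementation: the expert count (16), the number of experts selected per token (4), the scalar "experts", and the random router scores are all illustrative choices that do not come from this article. Only the 132 billion and 36 billion figures are taken from the text above.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    Weights are a softmax over the selected experts only, so they sum to 1.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, router_logits, k):
    """Run only the k selected experts and mix their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


# Illustrative assumptions, not figures from the article.
NUM_EXPERTS = 16
TOP_K = 4

# Toy experts: expert i simply scales its input by (i + 1).
experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]

# In a real model the router scores are computed from the token itself;
# here they are random stand-ins with a fixed seed for repeatability.
rng = random.Random(0)
router_logits = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]

routed = route_top_k(router_logits, TOP_K)
output = moe_forward(2.0, experts, router_logits, TOP_K)

# Parameter figures quoted in the article, in billions.
TOTAL_PARAMS_B = 132
ACTIVE_PARAMS_B = 36
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

print(f"experts used for this token: {[i for i, _ in routed]}")
print(f"share of parameters active per token: {active_fraction:.1%}")
```

Because only the selected experts run for each token, per-token compute tracks the active parameters (roughly 27% of the total here) rather than the full parameter count, which is the basis for the speed and cost claims.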

Joel Minnick, vice president of marketing, described the model’s responses as “almost real‑time,” in contrast to typical chatbots that leave users waiting noticeably for answers.

DBRX is freely available on GitHub and Hugging Face:

GitHub: https://github.com/databricks/dbrx
Hugging Face: https://huggingface.co/collections/databricks/dbrx-6601c0852a0cdd3c59f71962

Databricks encourages customers to use the model as a foundation for their own large models, promising continued improvements to chatbots and internal Q&A systems while showcasing how DBRX was built with Databricks’ proprietary tools.

The data for DBRX was prepared with Apache Spark and Databricks notebooks, governed with Unity Catalog, and the experiments were tracked with MLflow.

According to Minnick, enterprises are delaying large‑model investments over concerns about third‑party ownership and governance: handing data to a vendor without owning the resulting model weights undermines end‑to‑end control.

“We are building an extremely efficient model that enterprises can embed into their applications for specific use cases,” he added.

Amalgam Insights CEO Hyoun Park highlighted DBRX’s significance, noting that Databricks has demonstrated a step‑by‑step construction process that other companies can follow and then fine‑tune for their own needs.

“In end‑to‑end model tuning, testing, and operationalization, reproducibility, visibility, and model ownership are crucial,” Park said.

Park also pointed out that Databricks has built over 50,000 custom models for customers, making the announcement particularly noteworthy from an enterprise‑IT perspective.

DBRX is released amid Databricks’ evolving competition with Microsoft, its long‑term strategic partner for Azure Databricks, which integrates tightly with Microsoft’s cloud services.

Microsoft, a Databricks partner since 2017, has since entered the lakehouse market itself with its Fabric environment, which can mirror data from Azure Cosmos DB and Azure SQL DB without moving the data. Through its $10 billion OpenAI partnership, it also promises enterprise‑grade large models.
