Databricks spends $10M developing new DBRX generative AI model, but it can’t beat GPT-4

If you wanted to raise the profile of your major tech company and had $10 million to spend, how would you spend it? On a Super Bowl ad? An F1 sponsorship?

You could spend it training a generative AI model. While not marketing in the traditional sense, generative models are attention grabbers, and that attention increasingly funnels toward a vendor’s bread-and-butter products and services.

See Databricks’ DBRX, a new generative AI model announced today in the vein of OpenAI’s GPT series and Google’s Gemini. The base (DBRX Base) and fine-tuned (DBRX Instruct) versions of DBRX are available for research and commercial use on GitHub and the AI development platform Hugging Face, and can be run and tuned on public, custom or otherwise proprietary data.
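If the Hugging Face release follows the platform’s usual conventions, loading DBRX Instruct could look roughly like the sketch below. This is a minimal, hypothetical example: the repo ID “databricks/dbrx-instruct,” the trust_remote_code flag and the chat-template call are assumptions based on standard transformers usage rather than details confirmed in this article, and downloading the weights may first require accepting Databricks’ usage terms.

```python
# Hypothetical sketch of loading DBRX Instruct via Hugging Face transformers.
# The repo ID and flags are assumptions, not confirmed by Databricks.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "databricks/dbrx-instruct"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    trust_remote_code=True,  # may be required on older transformers versions
    device_map="auto",       # shard across available GPUs (requires accelerate)
    torch_dtype="auto",      # keep the checkpoint's native precision
)

# Build a chat-formatted prompt and generate a short completion.
messages = [{"role": "user", "content": "Summarize what Databricks does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Keep in mind the hardware caveat discussed below: running a model of this size in a standard configuration takes several data-center-class GPUs, so the sketch is illustrative rather than laptop-ready.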

“DBRX is trained to be very useful and can provide information on a variety of topics,” Naveen Rao, vice president of generative AI at Databricks, told TechCrunch. “DBRX is optimized and tuned for English language use, but is capable of conversation and translation into multiple languages, such as French, Spanish and German.”

Databricks describes DBRX as “open source,” in the same vein as “open source” models like Meta’s Llama 2 and AI startup Mistral’s models. (Whether these models actually meet the definition of open source is the subject of heated debate.)

Databricks says it spent roughly $10 million and eight months training DBRX, which it claims (quoting from a press release) “outperform[s] all existing open source models on standard benchmarks.”

But – and here’s the marketing conundrum – unless you’re a Databricks customer, using DBRX is very difficult.

That’s because, to run DBRX in a standard configuration, you need a server or PC with at least four Nvidia H100 GPUs. A single H100 costs thousands of dollars, quite possibly more. That might be pocket change for the average enterprise, but for many developers and solo entrepreneurs, it’s well out of reach.

There’s fine print to boot. Databricks says that companies with more than 700 million active users will face “certain restrictions,” comparable to Meta’s for Llama 2, and that all users will have to agree to terms ensuring that they use DBRX “responsibly.” (Databricks hadn’t proactively revealed the specifics of those terms as of publication.)

Databricks pitches its Mosaic AI Foundation Model product as the managed solution to these obstacles; in addition to running DBRX and other models, it provides a training stack for fine-tuning DBRX on custom data. Customers can privately host DBRX using Databricks’ Model Serving offering, Rao suggested, or Databricks can work with them to deploy DBRX on hardware of their choosing.

Rao added:

We are committed to making the Databricks platform the best choice for custom model building, so ultimately the benefit to Databricks is more users on our platform. DBRX is a demonstration of our best-in-class pre-training and tuning platform, which customers can use to build their own models from scratch. It’s an easy way for customers to get started with the Databricks Mosaic AI generative AI tools. And DBRX is highly capable out of the box; it can be tuned for excellent performance on specific tasks at better economics than large, closed models.

Databricks claims that DBRX runs up to 2x faster than Llama 2, thanks in part to its mixture-of-experts (MoE) architecture. MoE, an approach DBRX shares with Mistral’s newer models and Google’s recently announced Gemini 1.5 Pro, essentially breaks data processing tasks down into subtasks and then delegates those subtasks to smaller, specialized “expert” models.

Most MoE models use eight experts. DBRX has 16, which Databricks says improves quality.
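For readers unfamiliar with the architecture, here is a toy PyTorch sketch of the routing idea: a small “router” network scores every expert for each token, and only the top-scoring few actually run, so most of the network’s parameters sit idle on any given token. The dimensions, top-2 gating and expert design here are illustrative assumptions, not DBRX’s actual internals, which the article doesn’t detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    """Toy mixture-of-experts layer: route each token to its top-k experts
    and mix their outputs by the (renormalized) routing weights."""

    def __init__(self, d_model=64, d_hidden=256, n_experts=16, k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x):                       # x: (tokens, d_model)
        scores = self.router(x)                 # (tokens, n_experts)
        weights, idx = scores.topk(self.k, -1)  # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)    # renormalize over those k
        out = torch.zeros_like(x)
        for slot in range(self.k):              # run only the selected experts
            for e in idx[:, slot].unique():
                mask = idx[:, slot] == e        # tokens routed to expert e
                out[mask] += weights[mask, slot, None] * self.experts[e](x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```

The payoff is that compute per token scales with k rather than with the total number of experts, which is how an MoE model can hold many parameters while staying fast at inference.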

However, quality is relative.

While Databricks claims that DBRX outperforms Llama 2 and Mistral’s models on certain language understanding, programming, math and logic benchmarks, in most areas outside of niche use cases like database programming language generation, DBRX falls short of OpenAI’s GPT-4, arguably the leading generative AI model.

Rao acknowledged that DBRX has other limitations, namely that it, like all other generative AI models, can fall victim to “hallucinated” answers to queries, despite Databricks’ work on security testing and red teaming. Because the model is only trained to associate words or phrases with certain concepts, if those associations aren’t completely accurate, its responses won’t always be accurate either.

Additionally, unlike some of the latest flagship generative AI models, including Gemini, DBRX is not multi-modal. (It can only process and generate text, not images.) And we don’t know exactly what data sources were used to train it; Rao revealed only that no Databricks customer data was used when training DBRX.

“We trained DBRX using a large amount of data from different sources,” he added. “We used open data sets that the community knows, loves and uses every day.”

I asked Rao whether any of the DBRX training data was copyrighted or licensed, or showed obvious signs of bias (such as racial bias). He didn’t answer directly, saying only: “We’ve been careful about the data we use, and conducted red-teaming exercises to improve the model’s weak points.” Generative AI models have a tendency to regurgitate training data, a major concern for commercial users of models trained on unlicensed, copyrighted or very clearly biased data. In the worst-case scenario, users could land in ethical and legal trouble for unwittingly incorporating IP-infringing or biased work from models into their projects.

Some companies that train and publish generative AI models offer policies that cover legal fees arising from possible infringement. Databricks doesn’t have one yet — Rao said the company is “exploring possible scenarios.”

Given DBRX’s shortcomings in these and other areas, the model seems like a tough sell to anyone but current or would-be Databricks customers. The company’s rivals in generative AI, including OpenAI, offer equally compelling technology at very competitive prices. And plenty of generative AI models come closer to the commonly understood definition of open source than DBRX.

Rao promised that Databricks will continue to refine DBRX and release new versions as the company’s Mosaic Labs R&D team (the team behind DBRX) researches new approaches to generative AI.

“DBRX is driving the open source modeling field forward and challenging future models to be built more efficiently,” he said. “We will release variants as we apply techniques to improve output quality in terms of reliability, safety and bias… We view the open model as a platform on which customers can build custom capabilities using our tools.”

Judging by how DBRX currently stacks up against its peers, there’s a long way to go.
