Google open source tools support AI model development

In a typical year, Cloud Next (one of Google’s two major annual developer conferences, the other being I/O) almost exclusively features managed, closed-source products and services gated behind APIs. But this year, whether to cultivate developer goodwill or advance its ecosystem ambitions (or both), Google has launched a number of open source tools primarily designed to support generative AI projects and infrastructure.

The first is MaxDiffusion, which Google actually quietly released in February. It is a collection of reference implementations of various diffusion models (such as the image generator Stable Diffusion) that run on XLA devices. “XLA” stands for Accelerated Linear Algebra, an admittedly awkward acronym for a technology that optimizes and speeds up specific types of AI workloads, including fine-tuning and serving.

Google’s own tensor processing unit (TPU) is an XLA device, as are recent Nvidia GPUs.

In addition to MaxDiffusion, Google also launched JetStream, a new engine for running generative AI models, specifically text generation models (so not Stable Diffusion). Currently limited to TPUs, with GPU compatibility expected in the future, JetStream offers up to 3x higher “price/performance” for models like Google’s own Gemma 7B and Meta’s Llama 2, Google claims.

“As customers move AI workloads into production, the need for cost-effective, high-performance inference stacks continues to grow,” Mark Lohmeyer, general manager of cloud computing and machine learning infrastructure at Google, wrote in a blog post shared with TechCrunch. “JetStream helps meet this need… and includes optimizations for popular open models such as Llama 2 and Gemma.”

Now, a “3x” improvement is quite a claim, and it’s unclear how Google arrived at that number. Which generation of TPU was used? Compared to which baseline engine? And how is “performance” defined here, anyway?

I’ve asked Google all of these questions and will update this article if I receive a response.
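Until Google answers, one plausible reading of “price/performance” is throughput per unit of cost, for example tokens generated per dollar. The sketch below works through that arithmetic under that assumption. The `ServingSetup` class and every number in it are hypothetical and are not Google’s figures; the point is only that very different setups can yield the same “3x” headline depending on the baseline and the metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServingSetup:
    """One hypothetical inference configuration. All values below are made up."""
    name: str
    tokens_per_second: float  # sustained text generation throughput
    dollars_per_hour: float   # hourly price of the accelerator(s)

    def tokens_per_dollar(self) -> float:
        # Tokens generated in one hour divided by the price of that hour.
        return self.tokens_per_second * 3600 / self.dollars_per_hour


def price_performance_ratio(candidate: ServingSetup, baseline: ServingSetup) -> float:
    """How many times more tokens per dollar the candidate delivers."""
    return candidate.tokens_per_dollar() / baseline.tokens_per_dollar()


# Hypothetical baseline and two hypothetical candidates.
baseline = ServingSetup("baseline engine", tokens_per_second=1000.0, dollars_per_hour=4.0)
faster = ServingSetup("3x throughput, same price", tokens_per_second=3000.0, dollars_per_hour=4.0)
cheaper = ServingSetup("1.5x throughput, half price", tokens_per_second=1500.0, dollars_per_hour=2.0)

for setup in (faster, cheaper):
    print(f"{setup.name}: {price_performance_ratio(setup, baseline):.1f}x")
```

Both hypothetical candidates come out at 3.0x against this baseline, even though only one of them is actually three times faster. That is why the TPU generation, the baseline engine and the definition of “performance” all matter when reading the claim.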

Second to last on Google’s list of open source contributions are new additions to MaxText, Google’s collection of text generation AI models targeting TPUs and Nvidia GPUs in the cloud. MaxText now includes Gemma 7B, OpenAI’s GPT-3 (the predecessor to GPT-4), Llama 2, and models from AI startup Mistral, all of which Google says can be customized and fine-tuned to developers’ needs.

“We’ve made a lot of optimizations [to the models’ performance] and work closely with Nvidia to optimize performance on large GPU clusters,” Lohmeyer said. “These improvements maximize GPU and TPU utilization, enabling greater energy efficiency and cost optimization.”

Finally, Google partnered with AI startup Hugging Face to create Optimum TPU, which provides tools to bring certain AI workloads to TPUs. Google says its goal is to lower the barrier to entry for getting generative AI models, specifically text generation models, onto TPU hardware.

But for now, Optimum TPU is pretty rudimentary. The only model it works with is Gemma 7B, and it does not yet support training generative models on TPUs, only running them.

A promising improvement from Google.
