
Etched is developing an AI chip that runs just one type of model

As generative AI touches more and more industries, the companies producing chips to run these models are benefiting enormously. Nvidia in particular wields huge influence, commanding an estimated 70% to 95% of the AI chip market. Cloud providers from Meta to Microsoft are spending billions of dollars on Nvidia GPUs for fear of falling behind in generative AI.

For understandable reasons, generative AI vendors are not happy with the status quo. Much of their success hinges on the whims of the dominant chipmakers. So, along with opportunistic venture capital firms, they are looking for promising startups to challenge the AI chip giants.

Etched is one of many alternative chip companies vying for a spot, but it’s also one of the most interesting. Etched is just two years old, founded by a pair of Harvard dropouts, Gavin Uberti (formerly of OctoML and Xnor.ai) and Chris Zhu, who, along with Robert Wachen and former Cypress Semiconductor CTO Mark Ross, sought to create a chip that could do one thing: run AI models.

This is not unusual. Many startups and tech giants have developed or are developing chips specifically for running AI models, also known as inference chips. Meta has MTIA, Amazon has Graviton and Inferentia, and so on. But Etched’s chips are unique in that they run only a single type of model: transformers.

The transformer, proposed by a team of Google researchers in 2017, has become the dominant generative AI model architecture.

Transformers are the basis of OpenAI’s video generation model Sora. They are at the heart of text generation models like Anthropic’s Claude and Google’s Gemini. They also power art generators like the latest version of Stable Diffusion.

“In 2022, we made a bet that transformers would take over the world,” Etched CEO Uberti told TechCrunch in an interview. “We’ve reached a point in the evolution of AI where specialized chips that outperform general-purpose GPUs are inevitable, and technology decision-makers around the world know it.”

Etched’s chip, called Sohu, is an ASIC (application-specific integrated circuit), a chip tailored for a specific application, in this case running transformers. Uberti claims that Sohu, manufactured on TSMC’s 4nm process, can deliver better inference performance than GPUs and other general-purpose AI chips while consuming less energy.

Uberti said: “When running text, image and video transformers, Sohu is an order of magnitude faster than Nvidia’s next-generation Blackwell GB200 GPU, and costs less. One Sohu server can replace 160 H100 GPUs… For enterprise leaders who need specialized chips, Sohu will be a more economical, efficient and environmentally friendly choice.”

How does Sohu achieve all this? There are several ways, but the most obvious and intuitive is a streamlined inference hardware-and-software pipeline. Because Sohu does not run non-transformer models, the Etched team could strip out hardware components that are irrelevant to transformers and trim the software overhead traditionally needed to deploy and run non-transformers.

A chart from Etched comparing the performance of hardware running Meta’s Llama 70B open model.
Image Source: Etched

Etched comes at a turning point in the race for generative AI infrastructure. In addition to cost issues, the GPUs and other hardware components needed to run models at scale today consume a lot of power.

Goldman Sachs predicts that AI will drive a 160% increase in data center electricity demand by 2030, contributing to a sharp rise in greenhouse gas emissions. Meanwhile, researchers at the University of California, Riverside, estimate that global AI use could cause data centers to withdraw 1.1 trillion to 1.7 trillion gallons of fresh water by 2027, straining local resources. (Many data centers use water to cool servers.)

Uberti optimistically (or hyperbolically, depending on how you interpret it) pitches Sohu as the solution to the industry’s consumption problem.

“Simply put, our future customers can’t afford not to switch to Sohu,” Uberti said. “Companies are willing to bet on Etched because speed and cost are critical to the AI products they’re trying to build.”

But assuming Etched can achieve its goal of bringing Sohu to the mass market in the coming months, can it succeed with so many competitors nipping at its heels?

While Etched currently lacks direct competition, AI chip startup Perceive recently previewed a processor with transformer hardware acceleration. Groq has also made a significant investment in transformer-specific optimizations for its ASIC.

Competition aside, what if transformers fall out of favor someday? In that case, Uberti says, Etched would do the obvious thing: design a new chip. That makes sense, but it is a pretty drastic fallback given how long it has taken to bring Sohu to fruition.

None of these concerns stopped investors from pouring huge amounts of money into Etched.

Today, Etched announced that it has closed a $120 million Series A round co-led by Primary Venture Partners and Positive Sum Ventures. The round brings Etched’s total funding to $125.36 million and drew participation from heavyweight angel investors including Peter Thiel (Uberti, Zhu, and Wachen are all Thiel Fellowship alumni), GitHub CEO Thomas Dohmke, Cruise (and Bot Company) co-founder Kyle Vogt, and Quora co-founder Charlie Cheever.

Those investors presumably believe Etched has a decent chance of successfully scaling its server sales business. Perhaps it does: Uberti claims that unnamed customers have preordered “tens of millions of dollars” in hardware so far. He said the upcoming Sohu Developer Cloud, which will let customers preview Sohu through an online interactive playground, should drive more sales.

Still, it is too early to tell whether that will be enough to propel Etched and its 35-person team toward the future its co-founders envision. The AI chip space can be unforgiving at the best of times: see the high-profile near-failures of AI chip startups like Mythic and Graphcore, and the sharp drop in AI chip funding in 2023.

However, Uberti’s sales pitch is strong: “Video generation, audio-to-audio models, robotics, and other future AI use cases will only be possible with a faster chip like Sohu. The entire future of AI technology will depend on whether the infrastructure can scale.”
