Elon Musk’s xAI has unveiled its first-generation multimodal model, which can understand documents, translate code, and handle real-world situations.

The tool, called Grok-1.5V, is said to have “powerful text capabilities” and will soon be available to early testers and existing Grok users.

The update comes a week after the public release of Grok-1, which concluded its pre-training phase in October 2023.

“Grok-1.5 features improved inference capabilities and a context length of 128,000 tokens,” the company said in a blog post on the xAI website.

This long context understanding is a new feature that will increase Grok’s memory capacity to 16 times the previous context length. This means it will be able to leverage information from longer documents as well as more complex prompts.
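As a quick sanity check on that claim, the arithmetic can be sketched in a few lines of Python. This is only an illustration based on the two figures quoted above (128,000 tokens and a 16-fold increase); the function name is hypothetical, and the result is the previous context length those rounded figures imply, not an official specification.

```python
# Figures quoted in the article; the derived value is an inference, not an official spec.
NEW_CONTEXT_TOKENS = 128_000
SCALE_FACTOR = 16


def implied_previous_context(new_tokens: int, factor: int) -> int:
    """Return the earlier context length implied by an N-fold increase."""
    return new_tokens // factor


previous_tokens = implied_previous_context(NEW_CONTEXT_TOKENS, SCALE_FACTOR)
print(f"Implied previous context length: about {previous_tokens:,} tokens")
```

On the quoted figures, the earlier model would have handled roughly 8,000 tokens at a time.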

The model will still work by following instructions, but will now be able to understand documents, scientific diagrams, charts, screenshots and photos. It can also convert charts into Python code.

Grok-1.5V can understand the real world

“In order to develop useful real-world AI assistants, it is crucial to improve the model’s understanding of the physical world. To achieve this goal, we are introducing a new benchmark: RealWorldQA,” said the team behind Grok-1.5V.

This benchmark will be used to evaluate the real-world spatial understanding capabilities of multimodal models. The team provided examples including asking Grok which direction a car can turn and which object is the largest in a tiled photo.

The initial version of the benchmark includes more than 700 photos, each with a question and an easily verifiable answer.

Looking to the future, the team describes the need to upgrade multimodal models: “Improving our multimodal understanding and generation capabilities are important steps toward building beneficial AGIs capable of understanding the universe.

“In the coming months, we expect to make significant improvements to both capabilities in various modes including image, audio and video.”

Featured Image: Via Ideographies

