November 22, 2024
Mistral unveils Pixtral 12B, a multimodal AI model that can process both text and images

Mistral unveils Pixtral 12B, a multimodal AI model that can process both text and images

Mistral AIa Paris-based artificial intelligence startup, today unveiled its latest advanced AI model that can process both images and text.

The new model, called Pixtral 12B, uses approximately 12 billion parameters and is the first model of its kind capable of vision encoding, making it possible to “see” images alongside text.

The new model is based on Mistral’s Nemo 12Ban AI model previously released by the company that can understand text, with the addition of a 400 million-parameter vision adapter. The adapter allows users to add images via URLs or base64 encode them within the input text.

Many other AI large language models have also added multimodal capabilities that allow users to input images, such as Anthropic PBC’s Claude family, OpenAI’s GPT-4o, and Google LLC’s Gemini. The addition of image reasoning capabilities to Pixtral 12B should give it the ability to similarly answer questions about images, provide captions, count objects, and more.

The company has released the parameters and code via a torrent link on GitHub and the AI ​​distribution platform Cuddling faceThe company has encouraged developers to download and use it.

Now that the model is available for download, developers can refine and train the model for their own purposes. The company offers some of its models open-source under the Apache 2.0 license without restrictions. For others, Mistral offers a dev license that is free for development, but requires a paid license for commercial applications, but not for research purposes. The company did not clarify which license Pixtral 12B will be under.

Sophia Yang, Mistral’s head of developer relations, said in a post on XThat the model will soon be available for testing on the chatbot and application programming interface platforms of Mistral, Le Chat and Le Platforme.

Image: Pixabay

Your support is important to us and helps us keep our content FREE.

With one click below you can support our mission to provide free, in-depth and relevant content.

Join our community on YouTube

Join the community of over 15,000 #CubeAlumni experts, including Amazon.com CEO Andy Jassy, ​​Dell Technologies Founder and CEO Michael Dell, Intel CEO Pat Gelsinger, and many more leaders and experts.

“TheCUBE is an important partner to the industry. You are truly part of our events and we really appreciate you coming and I know people appreciate the content you create” – Andy Jassy

THANK YOU

Leave a Reply

Your email address will not be published. Required fields are marked *