Google Introduces PaliGemma, A New Visual Language Model
Written by Kay Ewbank   
Monday, 20 May 2024

Last week's Google I/O saw the introduction of PaliGemma, an open vision-language model (VLM), together with some details of what's coming in Gemma 2. 

Gemma is Google's lightweight open models that have been built from the same research and technology used to create Google's Gemini models. The existing models in Gemma are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants.



PaliGemma is described as a powerful open VLM inspired by PaLI-3. The language is built on open components, including the SigLIP vision model and the Gemma language model, and is designed a wide range of vision-language tasks, including image and short video captioning, visual question answering, understanding text in images, object detection, and object segmentation.

HuggingSpace Face Running Paligemma

Google is providing both pre-trained and fine-tuned checkpoints at multiple resolutions, as well as checkpoints specifically tuned to a mixture of tasks for immediate exploration.

PaliGemma is being released for various platforms and resources, with free options including Kaggle and Colab notebooks. Academic researchers seeking to push the boundaries of vision-language research can also apply for Google Cloud credits to support their work. The language joins CodeGemma and RecurrentGemma which were released earlier in the year.

Google also shared some information about Gemma 2, a 27B parameter instance that outperforms models twice its size and runs on a single TPUv5e. Gemma 2 will be available in new sizes and is based on a new architecture. Google says Gemma 2 will deliver performance comparable to Llama 3 70B at less than half the size. The updated design will mean it can fit on less than half the compute of comparable models, with the 27B model optimized to run on NVIDIA's GPUs.

Google also says Gemma 2 will provide developers with "robust tuning capabilities across a diverse ecosystem of platforms and tools".

Google PaliGemma is available now. 


More Information

Google Gemma

Related Articles

Google Releases Gemma Open Models

Gemini 1.5 Pro Now Available

Google Rebrands Bard With Subscription

Google Adds Gemini To Bard

Google Adds Code Generation To Bard

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.


WWII Cipher Machine As Used On D-Day

To celebrate the 80th anniversary of D-Day there's a new addition to the line up of cryptographic machines on Martin Gillow's VirtualColossus website - a 3D simulation of the Hagelin M-209 cipher [ ... ]

Microsoft And GitHub Announce Copilot Extensions At Build 2024

Microsoft's Build conference is underway with lots of announcements about Copilot, Microsoft's AI companion software.

More News

C book



or email your comment to:

Last Updated ( Monday, 20 May 2024 )