Google Introduces PaliGemma, A New Visual Language Model
Written by Kay Ewbank   
Monday, 20 May 2024

Last week's Google I/O saw the introduction of PaliGemma, an open vision-language model (VLM), together with some details of what's coming in Gemma 2. 

Gemma is Google's lightweight open models that have been built from the same research and technology used to create Google's Gemini models. The existing models in Gemma are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants.

 

gemma

PaliGemma is described as a powerful open VLM inspired by PaLI-3. The language is built on open components, including the SigLIP vision model and the Gemma language model, and is designed a wide range of vision-language tasks, including image and short video captioning, visual question answering, understanding text in images, object detection, and object segmentation.

HuggingSpace Face Running Paligemma

Google is providing both pre-trained and fine-tuned checkpoints at multiple resolutions, as well as checkpoints specifically tuned to a mixture of tasks for immediate exploration.

PaliGemma is being released for various platforms and resources, with free options including Kaggle and Colab notebooks. Academic researchers seeking to push the boundaries of vision-language research can also apply for Google Cloud credits to support their work. The language joins CodeGemma and RecurrentGemma which were released earlier in the year.

Google also shared some information about Gemma 2, a 27B parameter instance that outperforms models twice its size and runs on a single TPUv5e. Gemma 2 will be available in new sizes and is based on a new architecture. Google says Gemma 2 will deliver performance comparable to Llama 3 70B at less than half the size. The updated design will mean it can fit on less than half the compute of comparable models, with the 27B model optimized to run on NVIDIA's GPUs.

Google also says Gemma 2 will provide developers with "robust tuning capabilities across a diverse ecosystem of platforms and tools".

Google PaliGemma is available now. 

gemma

More Information

Google Gemma

Related Articles

Google Releases Gemma Open Models

Gemini 1.5 Pro Now Available

Google Rebrands Bard With Subscription

Google Adds Gemini To Bard

Google Adds Code Generation To Bard

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.

Banner


David Padua Recognized With HPC Award
27/09/2024

David A. Padua is the recipient of the 2024 ACM-IEEE CS Ken Kennedy Award which recognizes achievements in parallel and high performance computing (HPC). Padua is cited for innovative and us [ ... ]



Microsoft Releases Azure AI Inference SDK For .NET
26/09/2024

Microsoft has introduced an Azure AI Inference SDK for .NET. The SDK can be used to access and use AI models from the Azure AI Model Catalog for inference tasks like chat in .NET applications.


More News

kotlin book

 

Comments




or email your comment to: comments@i-programmer.info

Last Updated ( Monday, 20 May 2024 )