I wrote up some notes on Gemma 3, Google's new openly licensed (not open source) vision LLM, released this morning. https://simonwillison.net/2025/Mar/12/gemma-3/
It seems to be a capable vision model - I got this result running gemma3:27b via Ollama on my Mac
@simon have you had any luck invoking function calls? It seems like the chat template doesn't include any special tokens for tool inputs and outputs
@redscroll structured data extraction with schemas via Ollama seemed to work well, I've not tried the function calling style of tools yet
@simon by structured data extraction are you referring to constrained decoding like using grammars? Or just asking it to respond in a specified json schema via the prompt?
@redscroll Ollama has grammar support baked in that works against any model, I'm using that
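For anyone wanting to try this: Ollama's chat API accepts a JSON schema in the request's `format` field, and the server constrains the model's decoding to match it. A minimal sketch of the request shape — the model name, schema fields, and prompt here are illustrative assumptions, not from the thread:

```python
import json

# Illustrative JSON schema for the structured output we want back.
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "summary": {"type": "string"},
    },
    "required": ["title", "summary"],
}

# Request body for Ollama's /api/chat endpoint. Passing a schema as
# "format" tells Ollama to constrain decoding so the reply matches it.
payload = {
    "model": "gemma3:27b",
    "messages": [
        {"role": "user", "content": "Describe this page as JSON."}
    ],
    "format": schema,
    "stream": False,
}

# POST this to a running Ollama server, e.g.:
#   curl http://localhost:11434/api/chat -d @payload.json
body = json.dumps(payload)
```

Because the constraint is applied at decode time, this works against any model Ollama can run, which is what makes it usable with Gemma 3 even without tool-call tokens in the chat template.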