Google’s AI Mode search now answers questions about images with multimodal input


Google has expanded its AI Mode search functionality to include image-based queries, leveraging Google Lens and a custom Gemini LLM. This update allows users to upload images and receive contextual answers, marking a significant step in multimodal search technology.

Google’s AI Mode Embraces Multimodal Search

Google has announced a major update to its AI Mode search, enabling users to ask questions about images they upload. The feature, announced on Google’s official blog, combines the power of Google Lens with a custom Gemini large language model (LLM) to interpret visual content and provide relevant answers.

According to Google’s VP of Search, this development represents “a leap forward in making search more intuitive and accessible.” The company highlighted that the technology is particularly useful for identifying objects, translating text within images, and providing context about landmarks or artwork.

How It Works

The new functionality works similarly to existing image search tools but with enhanced AI capabilities. Users can now upload an image and ask specific questions about its content. For example, one might upload a photo of a plant and ask “What species is this?” or share a picture of a historical monument with the query “When was this built?”
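The underlying pattern of pairing an image with a natural-language question can be sketched with Google's public `google-generativeai` Python SDK. This is only an illustration of the multimodal-prompt pattern, not AI Mode's actual pipeline, which is not publicly exposed; the model name, image filename, and `GOOGLE_API_KEY` environment variable here are assumptions for the example.

```python
import os

def build_multimodal_prompt(image, question):
    """Pair an image with a natural-language question, e.g. a plant
    photo plus 'What species is this?' -- the same pattern AI Mode
    applies to uploaded images."""
    return [image, question]

# Guarded live call: runs only when an API key is configured.
# Everything below uses real google-generativeai SDK calls, but the
# model choice and filename are illustrative assumptions.
if os.environ.get("GOOGLE_API_KEY"):
    import google.generativeai as genai
    from PIL import Image

    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-flash")
    img = Image.open("plant.jpg")  # hypothetical uploaded photo
    response = model.generate_content(
        build_multimodal_prompt(img, "What species is this?")
    )
    print(response.text)
```

The key point is that the image and the question travel together as one prompt, letting the model ground its answer in the visual content rather than in text alone.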

Industry analysts note that this update positions Google ahead of competitors in the multimodal AI space. A tech analyst from Forrester commented, “Google’s integration of visual and linguistic AI models sets a new standard for search engines in 2025.”

Future Implications

The update suggests Google’s continued focus on AI-driven search innovations. With multimodal input becoming standard, experts predict further integration of voice, image, and text search capabilities across Google’s ecosystem. The company has hinted at upcoming features that will allow for even more complex multimodal queries in future updates.

