Apple's New AI Model Gives Siri the Ability to 'See' Your iPhone Screen

A new research paper from Apple, in collaboration with Columbia University, details significant progress in a specialized artificial intelligence model called Ferret-UI. This technology is designed to understand and interpret what is displayed on a smartphone screen, a development with direct implications for a smarter, more capable Siri. The model processes images of mobile interfaces, learning to identify icons, text, buttons, and layout structures with high accuracy. This moves beyond simple voice commands, potentially allowing Siri to perform actions within apps by seeing the screen just as a user would.

For example, a command like "use this photo as my new wallpaper" or "find the concert tickets in my email" could become a seamless, visual interaction. Industry analysts see this as a key part of Apple's strategy for iOS 18, expected to be previewed at the company's Worldwide Developers Conference next month. Unlike AI services that rely heavily on cloud servers, Apple's approach emphasizes on-device processing. This method prioritizes user privacy and reduces response time, aligning with the company's established stance on data security.

While competitors like Google and Microsoft are also developing multimodal AI, Apple's integration of such a model directly into the iPhone's operating system could offer a uniquely fluid experience. The research indicates Ferret-UI outperforms other models in specific tests related to screen understanding. If successfully integrated, this could transform Siri from a voice-activated tool into a true visual assistant, capable of navigating apps and executing complex, multi-step tasks based on what it sees on the display.

Source: Webpronews
