Google I/O Conference Grand Opening on May 21
At the Google I/O conference on May 21, CEO Sundar Pichai announced several significant upgrades to the company’s AI offerings. Google introduced the Ironwood TPU, which is 10 times faster than the previous generation; Google Beam, which focuses on immersive 3D video calls; and an agent mode for the Gemini App that can directly assist with bookings, property viewings, and itinerary planning. Together, these showcase Google’s ambition to create a “universal AI assistant” integrated into everyday life.
Ironwood TPU Makes a Strong Debut, 10 Times Faster than Its Predecessor
Pichai first introduced Google’s seventh-generation TPU, “Ironwood,” which offers:
- 10 times the performance of the previous generation.
- A full TPU pod delivering 42.5 exaFLOPS, or 42.5 million trillion operations per second.
- Availability to Google Cloud users by the end of the year.
AI-Driven 3D Video Device, Google Beam Launched
Google Beam is an AI-driven 3D video device that:
- Uses an array of 6 lenses.
- Synthesizes 3D light field images from the captured footage.
- Aims to make remote video calls feel like face-to-face conversations.
- Will ship first as devices co-developed with HP, reaching initial users this year.
Real-Time Translation and Screen Sharing Now Available, Major Upgrade for Gemini
Gemini Live, the AI model Google is actively developing, gains significant new features:
- Real-time voice translation: Currently supports English and Spanish, with additional languages to be rolled out.
- Screen sharing and visual analysis: Can analyze the scene in front of the user in real time, for example correcting a user who mistakes a streetlight for a building or their own shadow for a person.
- Available for Android and iOS users starting May 21.
Project Mariner Multitasking AI Agent Coming to Developers via the Gemini API
Pichai also announced that Google will soon open up its multitasking agent, Project Mariner, which can:
- Handle 10 tasks simultaneously.
- Learn and replicate task workflows.
- Be made available to developers via the Gemini API.
Gemini App’s New Agent Mode Can Help Find Properties and Plan Itineraries
The Gemini App, Google’s flagship AI application, has received substantial upgrades:
- Introduces an AI agent mode that automatically searches platforms like Zillow for property listings and arranges viewing appointments.
- Can also make phone calls and book trips on the user’s behalf.
- Supports MCP (Model Context Protocol) to integrate other services. MCP acts as a bridge connecting Gemini to websites, apps, and service systems, so that Gemini evolves from a purely conversational assistant into a “doing agent.”
Gmail Begins Integrating Gemini to Auto-Reply for Users
Gmail, one of the most widely used everyday tools, has also begun integrating Gemini. The new reply feature:
- Draws on the user’s past writing style, documents, and calendar.
- Automatically generates replies.
- Will be available to subscribers this summer.
Major Upgrades for Gemini Flash and 2.5 Pro, Plus an AI Coding Assistant
The new version of the Gemini Flash model is faster and more capable than before. Related updates include:
- Launch of the 2.5 Pro “Deep Think” mode, capable of handling complex math problems and lengthy tasks.
- Officially launching in June.
- Supports 24 languages, can shift tone naturally, and includes bilingual modes, all integrated into the Gemini API.
- Developers can feed code screenshots to 2.5 Pro, and the AI coding assistant Jules can help modify the code, with a public beta available starting May 21.
New AI Models for Music and Video Released, AI Video Creation Platform Project Flow Launched
New models include:
- Imagen 4: A next-generation AI image generation model that renders text and typography more accurately and generates images 10 times faster.
- Veo 3: A new video generation model that can add narration and ambient sound to its output.
- Lyria 2: An AI music generation model capable of producing high-quality music.
- Project Flow: A new AI video creation platform that lets users generate or upload characters and scenes, then have AI create visuals from text commands.
Comprehensive Integration with Chrome, Wear, and TV for Enhanced Search AI
Search AI has undergone a complete evolution, transforming the “AI Mode” into a true assistant:
- AI Mode: Capable of answering complex questions using charts, tables, and summary reports.
- Search Live: Enables interactive searching similar to video calls.
- Try-On Feature: Upload photos to simulate and compare clothing fit.
- One-Click Checkout: Price alerts, adding to the cart, and automatic checkout are all handled seamlessly.
- Gemini in Chrome: Can directly read page content for answers.
- Deep Research + Canvas: Allows you to upload reports and convert them into web pages, podcasts, or quizzes with one click.
- Integration of Gemini Live with Keep, Maps, and Calendar is underway.
Gemini Enters the XR Field, Collaborating with Samsung to Create AI Glasses and Headsets
Google is also collaborating with Samsung on the XR headset Project Moohan, which is expected to launch this year, and on XR smart glasses developed in partnership with Warby Parker. The devices will support voice, visual search, translation, navigation, and real-time response functions.
Ultra Subscription Plan and Global Expansion
Google AI Pro / Ultra: Pro offers higher usage limits, while Ultra adds early access to new features, along with YouTube Premium and large cloud storage. Features like 2.5 Pro Deep Think, Veo 3, and Flow will reach Ultra subscribers first.
In his closing remarks, Pichai said that Gemini is evolving from a multimodal model into an “AI world model,” and that Google’s vision is to create a true “universal AI agent” that helps people with writing, problem-solving, video editing, outfit selection, auditioning, and even finding a coffee shop on foot, fully integrated into daily life.