Google is having a big year, having already renamed its AI chatbot from Bard to Gemini and released a series of new AI models. At this year's Google I/O developer conference, the company made a raft of further announcements about artificial intelligence and how it is being integrated into its applications and services.
As expected, artificial intelligence was the central theme of the event, with the technology woven into almost every Google product, from Search, largely unchanged for decades, to Android 15 and, of course, Gemini itself. Here's a summary of all the big announcements made at the conference.
Gemini
It wouldn't be a Google developer conference if the company didn't launch a new large language model (LLM), and this year's newest model is Gemini 1.5 Flash. Its appeal is that it is the fastest Gemini model available via the API and is more cost-efficient than Gemini 1.5 Pro while still being powerful.
Gemini 1.5 Flash is available in public preview starting today in Google's AI Studio and Vertex AI. Gemini 1.5 Pro, launched in February, has also been improved across several areas, including translation, reasoning, and coding.
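Since 1.5 Flash is served through the same public Gemini API, a minimal sketch of what a request looks like may help. The endpoint path and JSON schema below follow Google's public Generative Language API (`v1beta`, `generateContent`); the API key and prompt are placeholders, and this only builds the request body rather than sending it:

```python
import json

API_KEY = "YOUR_API_KEY"  # placeholder; a real AI Studio key is required to send
MODEL = "gemini-1.5-flash"

# Endpoint shape used by the public Generative Language API (v1beta).
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent?key={API_KEY}"
)

def build_request(prompt: str) -> dict:
    """Build the JSON body for a generateContent call."""
    return {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}

body = build_request("Summarize the Gemini announcements from Google I/O.")
print(json.dumps(body))
```

Sending it is then a single POST of `body` to `ENDPOINT`; check the current API reference before relying on the exact schema, as it may have evolved since launch.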
According to Google, the new version has achieved significant improvements on many benchmarks, including MMLU, MathVista, ChartQA, DocVQA, InfographicVQA, and others. In addition, Gemini 1.5 Pro with a 1-million-token context window will be available to Gemini Advanced customers. This matters because customers can get AI help on very large inputs, such as 1,500-page PDFs. Gemini Nano, Google's model designed to run on smartphones, has been expanded to handle images in addition to text.
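A back-of-envelope estimate shows why a 1-million-token window roughly covers a 1,500-page PDF. The per-page word count and tokens-per-word ratio below are rough, hypothetical averages, not Google's figures:

```python
# Rough capacity check for the 1M-token context window.
# Assumed averages (hypothetical): ~500 words per page, ~1.3 tokens per word.
WORDS_PER_PAGE = 500
TOKENS_PER_WORD = 1.3
CONTEXT_WINDOW = 1_000_000

pages = 1_500
estimated_tokens = int(pages * WORDS_PER_PAGE * TOKENS_PER_WORD)

print(estimated_tokens)                     # 975000
print(estimated_tokens <= CONTEXT_WINDOW)   # True
```

In other words, a dense 1,500-page document sits just under the window under these assumptions, which is presumably why Google uses that figure as its headline example.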
According to Google, starting with Pixel, apps using Gemini Nano with multimodal capabilities will be able to understand sight, sound, and spoken language. Gemma, Google's family of open models, is also getting a major update with the launch of Gemma 2 in June. The next-generation Gemma, optimized for TPUs and GPUs, will launch with 27 billion parameters. Finally, PaliGemma, Google's first vision-language open model, has been added to the Gemma family.
Google Search
If you have used the Search Generative Experience (SGE) through Search Labs, you will be familiar with AI Overviews, which populate the top of search results with AI-generated summaries that give concise answers to a user's query. Now access to the feature is no longer limited to Search Labs: starting today, it is available to everyone in the United States. The feature is powered by a new Gemini model customized for Google Search.
According to Google, the feature has been used billions of times since its introduction in Search Labs, and people who use AI Overviews search more often and are more satisfied with their results. AI Overviews are designed to appear in search results only when they add value beyond standard results. Another big change in Search is AI-organized results pages, which use artificial intelligence to group results under unique, AI-generated headlines that better match users' search needs.
According to Google, AI-organized results will begin rolling out for English-language searches in the US where people are looking for inspiration, starting with dining and recipes, followed by movies, music, books, hotels, shopping, and more. Google is also launching new search capabilities that will debut first in Search Labs. For example, users will be able to adjust their AI Overviews, with options to break the information down further or simplify the language. Users can also search with video, taking visual search to the next level; this feature is available to Search Labs users in English in the US. Finally, starting today, Gemini-powered planning for meals and trips is available in Search Labs (English, US).
Text-to-video generator
Google is no stranger to text-to-video AI models; it published a research paper on its Lumiere model just this January. Now the company has launched its most capable model yet, Veo, which can generate high-quality 1080p video clips longer than a minute. Google says Veo has an advanced understanding of natural language, so the videos better match what the user envisions. It also understands cinematic terms like "timelapse" to produce different styles of video, giving users more control over the final product.
According to Google, Veo builds on years of work in video generation, including Lumiere and other notable models such as Imagen-Video and VideoPoet. The model is not yet generally available; it is offered as a private preview in VideoFX for select creators, and the public is invited to join a waitlist.
Imagen 3
Google also launched Imagen 3, its next-generation text-to-image model. Like Veo, Imagen 3 has enhanced natural-language capabilities, letting it better understand user prompts and the intent behind them.
The model tackles one of the biggest challenges for AI image generators, rendering text, and Google says Imagen 3 is its best model yet for producing text in images. Imagen 3 is not yet generally available; it is offered in private preview in ImageFX for select creators. The model will come to Vertex AI, and people can register to join the waitlist.
SynthID updates
In the era of generative AI we live in, companies face growing pressure to label AI-generated content. To keep pace, Google is expanding SynthID, its technology for watermarking AI-generated images, to two new formats: text and video. In addition, Veo, Google's new text-to-video model, embeds SynthID watermarks in all videos it creates.
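Google has not detailed SynthID's text-watermarking algorithm here, but the general idea behind statistical text watermarks can be sketched: bias generation toward a pseudorandom "green" subset of the vocabulary seeded by the previous token, then detect by counting how often tokens land in their green list. Everything below (the toy vocabulary, the green-list fraction, the function names) is illustrative, not SynthID itself:

```python
import hashlib
import random

GREEN_FRACTION = 0.5  # fraction of the vocabulary favored at each step

def green_list(prev_token: str, vocab: list[str]) -> set[str]:
    """Pseudorandomly partition the vocabulary, seeded by the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16) % (2 ** 32)
    rng = random.Random(seed)
    shuffled = sorted(vocab)
    rng.shuffle(shuffled)
    return set(shuffled[: int(len(shuffled) * GREEN_FRACTION)])

def generate_watermarked(first: str, vocab: list[str], n: int) -> list[str]:
    """Toy 'generator' that always emits a token from the current green list."""
    out = [first]
    for _ in range(n):
        out.append(min(green_list(out[-1], vocab)))
    return out

def detection_score(tokens: list[str], vocab: list[str]) -> float:
    """Fraction of tokens that fall in their predecessor's green list."""
    hits = sum(tokens[i] in green_list(tokens[i - 1], vocab)
               for i in range(1, len(tokens)))
    return hits / (len(tokens) - 1)

vocab = ["the", "a", "cat", "dog", "sat", "ran", "on", "mat", "rug", "fast"]
marked = generate_watermarked("the", vocab, 12)
print(detection_score(marked, vocab))  # 1.0 (every emitted token is green-listed)
```

Detection is statistical: ordinary text scores near GREEN_FRACTION on average, while watermarked text scores near 1.0, so no visible marker is needed. Google's production scheme differs in its details, but it shares this detect-by-statistics design.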
Ask Photos
If you've ever spent hours scrolling through your photo library to find the image you're looking for, Google has a clever solution. With Gemini, Google Photos can use the context of your library to surface the photos you're after.
In Google's example, a user wanted to see their daughter's progress as a swimmer over time, so they asked Google Photos, and it automatically compiled the highlights for them. The feature, called "Ask Photos," will roll out this summer, with more capabilities to come.
Gemini Advanced upgrades (with Gemini Live)
In February this year, Google launched a premium subscription plan for its Gemini Advanced chatbot, giving users benefits such as access to its most capable models and longer conversations. Now Google is upgrading the offering with new experiences.
The first, as mentioned above, is access to Gemini 1.5 Pro, which gives users a larger context window of 1 million tokens, which Google says is the largest of any consumer chatbot on the market.
The larger window lets users upload bigger inputs, such as documents up to 1,500 pages or 100 emails. Soon, it will be able to handle an hour of video or codebases of more than 30,000 lines. Moving on, one of the highlights of the entire event is Gemini Live, a new mobile experience that lets users have free-flowing voice conversations with Gemini, choose from a variety of natural-sounding voices, and interrupt mid-conversation.
Later this year, users will also be able to use the camera in Live to give Gemini context about the world around them during these conversations. Gemini Live draws on visual-recognition capabilities from Project Astra, Google DeepMind's effort to build the future of AI assistants. For example, in the Astra demo, a user points the camera out a window and asks Gemini which neighborhood they are in based on what it sees.
Gemini Live is Google's answer to the new Voice Mode OpenAI announced for ChatGPT at its spring update event, which lets users hold full voice conversations with ChatGPT, interrupt it mid-conversation, and change how the chatbot speaks.
AI updates in Android
Some of today's announcements unsurprisingly concern Google's mobile platform, Android. First, Circle to Search, which lets users search by circling images, videos, and text on their phone screen, can now help students with their homework (that is, with the problems they circle). Google says the feature works across subjects from math to physics and will eventually handle complex problems involving symbolic formulas and diagrams.
Gemini can also replace Google Assistant as the default AI assistant on Android phones, accessible by long-pressing the power button. Finally, Gemini is integrating with a wide range of apps and services, with multimodal support on the way. Gemini Nano's capabilities also power Android's TalkBack feature, providing descriptive image captions for blind and low-vision users.
Google Workspace Gemini Update
With all the updates to Gemini, Google Workspace gets its own AI improvements too. First, the Gemini side panel in Gmail, Docs, Drive, Slides, and Sheets is being upgraded to Gemini 1.5 Pro. This matters because, as mentioned above, Gemini 1.5 Pro gives users a longer context window and more advanced reasoning, which they can now use alongside some of the most popular Google Workspace apps. The experience is available now to Workspace Labs and Gemini for Workspace Alpha users, and Gemini for Workspace add-on and Google One AI Premium subscribers will get it on desktop next month.
Gmail for mobile is getting three useful new features: summarization, Gmail Q&A, and contextual replies. Summarization does what its name suggests, summarizing email threads using Gemini; it will be available to users starting this month. Gmail Q&A lets users ask Gemini questions about the content of their emails within the Gmail mobile app. For example, in the demo, a user asked Gemini to compare roofing-repair bids by price and availability.
Gemini pulled the relevant information from different emails in the inbox and presented it to the user. Contextual Smart Reply is a smarter auto-reply feature that uses the context of Gemini conversations and email threads to draft responses. Gmail Q&A and Contextual Smart Reply will roll out to Labs users in July. Finally, the Help me write feature in Gmail and Docs now supports Spanish and Portuguese, coming to desktop in the coming weeks.