Sat. May 18th, 2024



It hasn’t been long since OpenAI showed off Sora, which impressed and frightened many people with its ability to make (somewhat) realistic video clips out of text prompts. AI image generation has been polished a lot over the past few months, so as you might expect, the natural next step is video. Google is now coming out with its own video generation methods, with new AI models under the Imagen 2 umbrella promising big things as well.



Google introduced Imagen 2, a family of models within its Vertex AI platform. The company had come under fire for the image generation model within Gemini, which was a bit of a dumpster fire and was removed. While Gemini isn’t including Imagen 2 (at least not straight away), Imagen 2 does come with a series of improvements that make it all-in-all better for generating images or even video.

Enhancements to Imagen 2 include inpainting and outpainting features, allowing for image manipulation such as removal of unwanted elements or addition of new components. The most significant update, however, is the introduction of “text-to-live images,” enabling the creation of short videos from text inputs.


However, you should keep in mind that this is not Sora. Compared to existing video generation tools, Imagen 2’s capabilities might fall short in terms of resolution and customization options, and we’ll have to see how well it does in real-life usage. It’s a bit of a technicality, but what Imagen 2 generates are “live images,” which are short, 4-second clips. It’s still a start, however, and it could serve as a foundation for an actual text-to-video model within the coming months or years.


To address concerns regarding deepfakes, Google uses its SynthID technology to apply cryptographic watermarks to live images, aiming for authenticity and safety. Despite Google’s emphasis on safety measures, questions remain about how effective its approach is and how transparent it is about its training data sources. For one, the absence of an opt-out mechanism for creators whose work may be included in the training data might raise eyebrows. Additionally, Google’s generative AI indemnification policy does not cover text-to-live images, leaving customers vulnerable to potential copyright claims.

We’ll have to wait and see whether Google makes this publicly accessible in any way. We might hear more once Google I/O rolls around.

Source: TechCrunch




By John P.
