Google DeepMind's Genie 2 can generate interactive 3D worlds

<p>World models, AI systems capable of generating a simulated environment in real time, represent one of the more impressive applications of machine learning. The field has seen rapid progress over the past year, and to that end, Google DeepMind announced <a href="https://deepmind.google/discover/blog/genie-2-a-large-scale-foundation-world-model/">Genie 2</a> on Wednesday. Where its predecessor was limited to generating 2D worlds, the new model can create 3D ones and sustain them for significantly longer.</p>
<p>Genie 2 isn’t a game engine; instead, it’s a diffusion model that generates images as the player (either a human being or another AI agent) moves through the world the software is simulating. As it generates frames, Genie 2 infers properties of the environment, allowing it to model water, smoke and physics effects, though some of those interactions can be very gamey. Nor is the model limited to rendering scenes from a third-person perspective; it can also handle first-person and isometric viewpoints. All it needs to start is a single image prompt, provided either by Google’s own <a href="https://www.engadget.com/ai/googles-generative-ai-video-model-is-available-in-private-preview-160055983.html">Imagen 3 model</a> or a picture of something from the real world.</p>
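<p>To make that action-conditioned generation loop concrete, here is a minimal Python sketch of how a system like this could be driven: a single starting image seeds the state, and each player action prompts the model to produce the next frame. Every name here (the frame size, the generate_next_frame stub and its toy "denoising" loop) is a hypothetical illustration, not DeepMind's actual architecture or API.</p>

<pre><code>
# Toy sketch of an action-conditioned, frame-by-frame world model loop.
# This is NOT Genie 2: the real system is a large learned diffusion model;
# the "denoising" below is a stand-in so the loop structure is runnable.
import numpy as np

FRAME_SHAPE = (64, 64, 3)  # hypothetical low-resolution RGB frame


def generate_next_frame(prev_frame: np.ndarray, action: str, steps: int = 4) -> np.ndarray:
    """Stand-in for a diffusion sampler: start from noise and iteratively
    pull the frame toward the conditioning signal (previous frame + action)."""
    rng = np.random.default_rng(abs(hash(action)) % (2**32))
    frame = rng.random(FRAME_SHAPE)  # start from pure noise
    for _ in range(steps):
        # a real model would run a learned denoising network here,
        # conditioned on the previous frame and the player's action
        frame = 0.5 * frame + 0.5 * prev_frame
    return frame


# A single image prompt seeds the world; player actions drive each new frame.
world_state = np.zeros(FRAME_SHAPE)  # stand-in for the prompt image
for action in ["move_forward", "turn_left", "jump"]:
    world_state = generate_next_frame(world_state, action)
    print(action, round(float(world_state.mean()), 4))
</code></pre>

<p>The point of the sketch is the control flow, not the model: generation is autoregressive, so each frame depends on everything generated so far, which is also why keeping a long world consistent is the hard part.</p>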
<blockquote class="twitter-tweet"><p lang="en" dir="ltr">Introducing Genie 2: our AI model that can create an endless variety of playable 3D worlds - all from a single image.