This page contains supplementary demos for our paper “Unrolling Virtual Worlds for Immersive Experiences” (paper, poster), presented at the NeurIPS 2023 Workshop on Machine Learning for Creativity and Design. We explore the generation of navigable 3D worlds by combining modern neural networks with specific geometric transformations to produce interactive, locally coherent spaces.
First, we use a fine-tuned Stable Diffusion v1.5 model to generate panoramas in equirectangular projection, similar to BlockadeLabs and latentlabs360. We then apply a 3D transformation to this projection to produce the image that would be seen from another point in the same space. Finally, we “heal” the resulting distortions by running the image through the model again, using a technique similar to in-painting: the network tends to “reconstruct” its inputs, increasing their likelihood by pulling noisy or distorted regions back to the nearest suitable point on the manifold of natural images (see the paper for details). By repeating these three steps, we construct interactive, locally coherent worlds with graphs of available locations, reminiscent of the classic game Myst or the Google Street View interface; the sketches below illustrate the middle two steps. To render these interactive spaces, we use the open-source library Pannellum.
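To make the geometric step concrete, here is a minimal NumPy sketch of one possible reprojection. It assumes every scene point lies on a sphere of fixed radius around the original camera (a crude stand-in for true scene depth; the transform used in the paper may differ), casts a ray per output pixel from the shifted camera, and samples the source panorama where the ray hits that sphere. The function and parameter names are our own.

```python
import numpy as np
from PIL import Image

def reproject_equirectangular(pano, shift, radius=1.0):
    """Warp an equirectangular panorama to the view from a camera
    translated by `shift`, under the fixed-radius sphere assumption."""
    src = np.asarray(pano, dtype=np.float32)
    h, w = src.shape[:2]

    # Pixel grid of the *output* panorama -> longitude/latitude.
    u = (np.arange(w) + 0.5) / w
    v = (np.arange(h) + 0.5) / h
    lon = u * 2.0 * np.pi - np.pi          # [-pi, pi)
    lat = np.pi / 2.0 - v * np.pi          # [pi/2, -pi/2)
    lon, lat = np.meshgrid(lon, lat)

    # Unit ray directions from the new (shifted) camera position.
    dx = np.cos(lat) * np.sin(lon)
    dy = np.sin(lat)
    dz = np.cos(lat) * np.cos(lon)

    # Intersect ray p = shift + s*d with the sphere |p| = radius.
    t = np.asarray(shift, dtype=np.float32)
    td = t[0] * dx + t[1] * dy + t[2] * dz
    s = -td + np.sqrt(np.maximum(td**2 - t.dot(t) + radius**2, 0.0))
    px, py, pz = t[0] + s * dx, t[1] + s * dy, t[2] + s * dz

    # Direction seen from the original camera -> source pixel coords.
    src_lon = np.arctan2(px, pz)
    src_lat = np.arcsin(np.clip(py / radius, -1.0, 1.0))
    sx = ((src_lon + np.pi) / (2.0 * np.pi) * w).astype(int) % w
    sy = ((np.pi / 2.0 - src_lat) / np.pi * h).astype(int).clip(0, h - 1)

    return Image.fromarray(src[sy, sx].astype(np.uint8))

# The shift magnitude relative to `radius` controls the step size
# between neighboring locations in the graph.
warped = reproject_equirectangular(Image.open("pano.png"),
                                   shift=(0.0, 0.0, 0.25))
warped.save("warped_pano.png")
```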
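The “healing” pass can then be approximated with an off-the-shelf image-to-image diffusion pipeline: moderate noise is added to the warped panorama and the model denoises it back toward the natural-image manifold, keeping the overall layout while repairing stretched or smeared regions. Note that this is a stand-in, not the paper’s exact procedure, which is closer to in-painting and may mask only the distorted areas; the checkpoint id below is a generic base model, not the panorama-tuned weights from the paper, and the prompt is just one of the demo prompts.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Placeholder: any SD v1.5-family checkpoint; the paper uses
# panorama-fine-tuned weights instead.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

warped = Image.open("warped_pano.png")  # output of the warp step above

healed = pipe(
    prompt="360 equirectangular panorama, a dark, mysterious cave "
           "filled with bright, glowing green crystals",
    image=warped,
    strength=0.45,        # moderate noise: keep layout, fix artifacts
    guidance_scale=7.5,
).images[0]
healed.save("healed_pano.png")
```

Lower `strength` values preserve more of the warped geometry, while higher values repair artifacts more aggressively at the cost of drift between neighboring locations in the graph.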
Interactive Demo Spaces
- a maze of twisty little passages, all alike
- a dark, mysterious cave filled with bright, glowing green crystals and large, looming statues
- an outdoor desert landscape, with large sand dunes and jagged rocks
Some screencasts
© altsoph, 2023