Odyssey, a startup founded by self-driving pioneers Oliver Cameron and Jeff Hawke, has developed an AI model that lets users “interact” with streaming video.
Available on the web in an “early demo,” the model generates and streams video frames every 40 milliseconds. Using basic controls, viewers can explore areas within a video, much as they would in a 3D-rendered video game.
“Given the current state of the world, an incoming action, and a history of states and actions, the model attempts to predict the next state of the world,” Odyssey explains in a blog post. “Powering this is a new world model, demonstrating capabilities like generating pixels that feel realistic, maintaining spatial consistency, learning actions from video, and outputting coherent video streams for 5 minutes or more.”
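The loop Odyssey describes (current state plus an incoming action plus history, yielding a predicted next state) can be illustrated with a toy sketch. This is not Odyssey's model or code: the state here is just a grid position rather than a video frame, and `predict_next_state`, `rollout`, and `MOVES` are hypothetical names chosen for illustration. It only shows the autoregressive shape of the idea, in which each predicted state is fed back in as the next input.

```python
from typing import List, Tuple

# Hypothetical toy types: a "state" is a 2-D grid position, not a video frame.
State = Tuple[int, int]
Action = str  # one of "forward", "back", "left", "right"

MOVES = {"forward": (0, 1), "back": (0, -1), "left": (-1, 0), "right": (1, 0)}


def predict_next_state(
    state: State, action: Action, history: List[Tuple[State, Action]]
) -> State:
    """Stand-in for a learned model, using fixed toy dynamics.

    A real world model would condition on `history`; this toy accepts it
    only to mirror the interface described in the quote.
    """
    dx, dy = MOVES[action]
    return (state[0] + dx, state[1] + dy)


def rollout(initial: State, actions: List[Action]) -> List[State]:
    """Autoregressive loop: each predicted state becomes the next input."""
    state = initial
    history: List[Tuple[State, Action]] = []
    states = [state]
    for action in actions:
        next_state = predict_next_state(state, action, history)
        history.append((state, action))  # record what was seen and done
        state = next_state
        states.append(state)
    return states


if __name__ == "__main__":
    print(rollout((0, 0), ["forward", "forward", "right"]))
```

In a real system the prediction step would be a neural network emitting pixels, and small errors would compound as predictions are fed back in, which is one plausible reason long, stable streams are hard to produce.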
A number of startups and big tech companies are chasing world models, including DeepMind, influential AI researcher Fei-Fei Li's World Labs, Microsoft, and Decart. They believe that world models could one day be used to create interactive media, such as games and movies, and to run realistic simulations, such as training environments for robots.
But creatives have mixed feelings about the technology. A recent Wired investigation found that game studios such as Activision Blizzard, which has laid off scores of workers, are using AI to cut corners and combat attrition. And a 2024 study commissioned by the Animation Guild, a union representing Hollywood animators and cartoonists, estimated that more than 100,000 U.S.-based film, television, and animation jobs will be disrupted by AI in the coming months.
For its part, Odyssey is pledging to collaborate with creative professionals – not to replace them.
“Interactive video … opens the door to entirely new forms of entertainment, where stories can be generated and explored on demand, free from the constraints and costs of traditional production,” the company writes in its blog post. “Over time, we believe everything that is video today – entertainment, ads, education, training, travel, and more – will evolve into interactive video, all powered by Odyssey.”
Odyssey's demo is a bit rough around the edges, which the company acknowledges in its post. The environments the model generates are blurry and distorted, and they are unstable in the sense that their layouts don't always stay the same. Walk forward in one direction for a while or turn around, and the surroundings may suddenly look different.
But the company is promising to improve the model rapidly. It can currently stream video at up to 30 frames per second from clusters of Nvidia H100 GPUs, at a cost of $1 to $2 per “user-hour.”
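For a sense of scale, the figures quoted in this article (a frame every 40 milliseconds, 30 frames per second, and $1 to $2 per user-hour) can be turned into a rough per-frame cost. The sketch below is back-of-the-envelope arithmetic only. The function names are hypothetical, and it assumes the quoted cost is spread evenly across frames, which Odyssey has not stated.

```python
# Figures as quoted in the article.
FRAME_INTERVAL_MS = 40.0
QUOTED_FPS = 30.0
COST_PER_USER_HOUR_USD = (1.0, 2.0)  # low and high ends of the quoted range


def fps_from_interval(interval_ms: float) -> float:
    """Frame rate implied by a fixed interval between frames."""
    return 1000.0 / interval_ms


def frames_per_hour(fps: float) -> float:
    """Number of frames streamed in one hour at a constant frame rate."""
    return fps * 3600.0


def cost_per_thousand_frames(cost_per_hour: float, fps: float) -> float:
    """Assumes cost is spread evenly over frames (an assumption, not a stated fact)."""
    return cost_per_hour / frames_per_hour(fps) * 1000.0


if __name__ == "__main__":
    print(f"40 ms per frame implies {fps_from_interval(FRAME_INTERVAL_MS):.0f} fps")
    print(f"Frames per hour at 30 fps: {frames_per_hour(QUOTED_FPS):.0f}")
    for cost in COST_PER_USER_HOUR_USD:
        per_k = cost_per_thousand_frames(cost, QUOTED_FPS)
        print(f"${cost:.2f}/user-hour is about ${per_k:.4f} per 1,000 frames")
```

Note that a 40-millisecond interval works out to 25 frames per second, a little under the 30 frames per second also quoted. The article does not explain how the two figures relate.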
“Looking ahead, we're researching richer world representations that capture dynamics far more faithfully, while increasing temporal stability and persistent state,” Odyssey writes in its post. “In parallel, we're expanding the action space from motion to world interaction, learning open actions from large-scale video.”
Odyssey is taking a different approach than many AI labs in the world modeling space. It designed a 360-degree, backpack-mounted camera system to capture real-world landscapes, which Odyssey thinks can serve as a basis for higher-quality models than those trained solely on publicly available data.
To date, Odyssey has raised $27 million from investors including EQT Ventures, GV, and Air Street Capital. Ed Catmull, one of the co-founders of Pixar and a former president of Walt Disney Animation Studios, sits on the startup's board of directors.
Last December, Odyssey said it was working on software that allows creators to load scenes generated by its models into tools such as Unreal Engine, Blender, and Adobe After Effects so that they can be edited by hand.