SINGAPORE, Nov. 13, 2024 /PRNewswire/ -- Shengshu Technology has officially released Vidu 1.5, a major upgrade to its multimodal generative models. The release marks the global debut of long-context capabilities, similar to those in Large Language Models, within a visual model, and showcases a new emergent capability. Vidu 1.5 introduces the world’s first Multiple-Entity Consistency capability, which seamlessly integrates people, objects, and environments to create stunning video effects, something that could not be achieved until now.
With Vidu’s Multiple-Entity Consistency feature, unrelated images, whether of characters, objects, or environments, can be integrated into a single video featuring all three elements. Moreover, video generated by Vidu 1.5 maintains visual consistency even with complex inputs that involve multiple subjects or environments, which has not been possible until today. For example, if a video generated by Vidu 1.5 features multiple entities, their unique attributes remain distinct throughout the footage rather than blending together or becoming inconsistent midway.
Multiple-Entity Consistency works by letting users upload multiple images: for instance, a photo of a person, an image of a rose-detailed outfit, and a shot of a moped. Vidu 1.5 combines these images into a single continuous, fluid video featuring the person dressed in a rose-accented shirt and riding the moped. Alternatively, given two images of the same individual alongside a photo of the pyramids and a prompt like “a man looking up at the pyramids,” Vidu 1.5 seamlessly combines these elements into a cohesive video that aligns with your expectations.
Taking it a step further, Vidu 1.5 also introduces Multiple Angle Consistency, a feature that lets you generate videos either by using any input images as references or by uploading three photos of a single subject. The model ensures visual continuity and fills in missing details, providing a seamless 360-degree view of the subject. The result is a cohesive video that accurately presents the subject from any angle, enhancing realism and dynamic storytelling. This also applies to facial movements, making for more natural continuity between subtle facial expressions.
Beyond presenting stronger character emotions, Vidu 1.5 also introduces Advanced Control. This enables greater precision over camera movements, angles, and cinematic techniques without sacrificing visual consistency, so the footage avoids unwanted transitions or skipping between frames. Advanced cinematography options, including adjustable dynamic ranges, offer a higher degree of control over speed and scale, enabling complex shots such as zooming, panning, and tilting, or a combination of these.
With Vidu 1.5, we’ve significantly enhanced detail generation and clarity at up to 1080p, bringing visuals to life like never before. For instance, you can now clearly see the intricate patterns on a cake or the vivid texture of a sizzling steak amid flames and oozing red juices. This level of detail transforms the storytelling experience, making every frame richer and more immersive.
Vidu 1.5 expands its appeal to animators and anime creators with Expanded Animation Styles, including Japanese fantasy and hyper-realistic styles. This upgrade caters to both casual creators of short-form content and professional filmmakers, delivering polished, production-ready videos with superior clarity and precision compared to other generative AI platforms.
The breakthrough in Vidu 1.5’s controls lies in major advancements Shengshu Technology has made in semantic understanding. The update offers more nuanced language comprehension, interpreting text prompts with impressive precision and enabling video outputs that better reflect complex storytelling and scene requirements. Last but not least, Vidu 1.5 generates a 4-second clip in just 25 seconds, showcasing Vidu’s ongoing advancements in the speed and accuracy of video generation.
“The future of content creation is here, and it is powered by the limitless possibilities of AI,” said Jiayu Tang, CEO and co-founder of Shengshu Technology. “Together, we can ignite a global wave of inspiration, reshaping industries, and democratizing creativity. At the core of this transformation lies the ability for anyone to engage in high-quality content production, unlocking new opportunities and breaking down traditional limitations.”
To learn more about Vidu 1.5, please visit https://www.vidu.studio/
About Shengshu Technology
Founded in March 2023, Shengshu Technology is a leading innovator in deep generative models, specializing in advanced diffusion probabilistic techniques. In 2022, the team behind Shengshu introduced U-ViT, the world’s first technical framework to explore the fusion of Diffusion Models and Transformer architectures for a wide assortment of multimodal generation tasks. Shengshu Technology made the leap from research to commercialization with the launch of its flagship product, Vidu, in July 2024. This AI video generator enables creators to bring their visions to life, whether for art design, game development, film post-production, or social content creation. Our mission is to develop the world’s most advanced multimodal generative models, seamlessly integrating text, images, videos, and 3D content. We are at the forefront of applying generative AI to art, design, gaming, filmmaking, and social media, with the goal of enhancing human creativity and productivity.