I put Vidu 1.5 to the test — a new major player in the AI video space


Vidu 1.5

(Image credit: Vidu 1.5/Future AI)

Vidu is an AI video platform out of China hoping to take on not only leading players such as Runway and Kling but also OpenAI's powerful, yet-to-be-released Sora.

Developed by Shengshu, this is the first AI video tool to add 'multi-entity consistency'. This feature lets you splice together unrelated images to create a single, cohesive new video. It comes after a recent study found that AI video models mimic physics from their training images rather than understanding how it actually works.

For example, you could upload a photo of yourself and a random car and the model could place you behind the wheel and make the car move. Another example, given by Vidu, is to add different clothing to a character by using a second image of a coat or shirt.

What I like most about Vidu 1.5 is the degree of control it gives me as a creator when putting together an AI video. I can customize the degree of motion, resolution, duration and more. I need to do more testing, but it is likely to end up on my list of best AI video generators.

What can you do with Vidu 1.5?


Vidu 1.5 is the latest model from Shengshu and, in addition to the multi-entity mode, it has the usual text-to-video and image-to-video modes found on other platforms. You can set a video to generate as photorealistic or as an illustration, and the motion isn't bad.


Being able to generate clips in 1080p is also a big step up from the usual 720p limit of other platforms, although its text-to-video mode isn't as good as those of Runway, Kling or MiniMax.

“The future of content creation is here, and it is powered by the limitless possibilities of AI,” said Jiayu Tang, CEO and co-founder of Shengshu Technology. "At the core of this transformation lies the ability for anyone to engage in high-quality content production, unlocking new opportunities and breaking down traditional limitations."


Multi-entity consistency is one of the most innovative additions to AI video I've seen in a while. I tried it out, and not only does it let you steer the visuals of a video, it can also improve the overall motion, especially if you use it to provide different perspectives.

In one example, I gave it three images of a skateboarder, and the extra perspectives helped generate more fluid motion as the board moved across the steps.

In another test, I was able to give it a photograph of me and a picture of someone busking and it was able to create a fairly accurate facsimile of me playing guitar — from a single image!

Final thoughts


Part of what made the video of me playing guitar work was another feature called ‘Advanced Character Control’. According to Vidu this offers more precision over the way a camera moves, cinematic techniques used in the output and general motion in the video.

Finally, you can set the level of motion. According to Vidu, this allows the model to be more authentic in its output. You can set it to auto, low, medium or high to create a more dynamic result.

Overall, I'm impressed with Vidu 1.5. It has some work to do to catch up with the cutting edge in terms of visual realism and motion, but it is very close.

Multi-entity consistency is a significant enough feature on its own to draw attention to Vidu, and I suspect it's something other models will attempt to mimic in the near future.

More from Tom's Guide

  • 5 Best AI video generators — tested and compared
  • AI glossary: all the key terms explained including LLM, models, tokens and chatbots
  • Meet Mochi-1 — the latest free and open-source AI video model

Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?
