Genmo, the San Francisco-based open-source AI video lab, has just announced a new add-on for Mochi 1, its state-of-the-art video generation model.
The new fine-tuning tool lets users customize the model's video output by training it on a modest number of additional video clips of their own. The ability to fine-tune video output like this is not new, but this is the first time we have seen it released as an open-source video product.
The tuning is done using LoRA (low-rank adaptation), a well-established technique that has long been used to fine-tune image models to produce a desired output.
With low-rank adapters like this, users can take a general-purpose model and customize it to taste, for example by teaching it a specific product logo so that it appears consistently in generated videos.
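To illustrate the general idea (this is a minimal sketch of how a LoRA adapter works in PyTorch, not Genmo's actual code; all names, ranks and dimensions below are hypothetical), the trick is to freeze the original weights and train only two small low-rank matrices whose product is added to the layer's output:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer and adds a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # original weights stay frozen

        # Low-rank factors: a down-projection followed by an up-projection
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)   # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen path plus a small trainable correction learned from new clips
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Hypothetical usage: wrap a layer from a pretrained model, then train only
# the adapter parameters on the user's handful of video clips.
layer = nn.Linear(1024, 1024)
adapted = LoRALinear(layer, rank=8)
trainable = sum(p.numel() for p in adapted.parameters() if p.requires_grad)
print(trainable)  # far fewer parameters than retraining the full layer
```

Because only the tiny adapter matrices are updated, a handful of clips and a single GPU can be enough to steer a large pretrained model without retraining it from scratch.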
What is Mochi 1?
Mochi 1 caused quite a stir when it launched thanks to the superb quality of its video output, so this latest development is a significant milestone in the race towards versatile, cinema-quality video generation. It is available from the Genmo website.
As with many recent AI announcements, the Mochi 1 fine-tuner is less of a mass-market product and more of a research experiment. This early iteration is designed to run on a single graphics card, but that card needs to be an expensive top-end processor with at least 60GB of VRAM, which immediately puts it out of the reach of ordinary mortals.
The launch demo suggests that you’ll need no more than a dozen video clips to fine-tune the model to your needs, which is a pretty impressive feat for video. But interested parties will also need to have a good deal of familiarity with coding and command line interfaces in order to get the system working. Not for the faint of heart, therefore.
The value of open-source
Open-source video seems to be the flavor du jour, judging by the number of announcements coming down the pipe. The Allegro text-to-video model dropped this week, another open-source video technology that holds promise.
It generates six seconds of 720p video from a text prompt, but the key feature is that it does so within 9GB of VRAM, which sounds like an excellent use of space.
Again, there’s no fancy wrapper to make it easy for end users at the moment, but hopefully one will come soon.
In the meantime, I’m just gonna sit here with my jumbo box of hot buttered popcorn and keep looking towards the door for the arrival of Sora. Whatever happened to Sora? Anybody know? Any guesses? Sam? Any one?