Why text-to-video may be the next ‘big’ AI thing

By Krishna Rao On Apr 2, 2023

When it comes to generative AI, one thing is dominating the headlines: ChatGPT. However, there is a lot more to the world of generative AI than ChatGPT-like language models. Text-to-image is already becoming part of mainstream conversation, but brewing in the background is generative AI capable of converting text to video.
What is text-to-video AI?
Simply put, you can generate AI-powered videos from nothing but your words. Yes, it is exactly what it sounds like: key in the text and the AI model will generate a video based on it. US-based startup Runway showcased its Gen-2 model, which is able to do that, with a caveat or two.
Is this a ‘new’ thing?
Not really. It is very much like Dall-E, developed by the creators of ChatGPT, and works using generative AI models. The results are captivating enough that it could certainly catch the fancy of many across the world.
Is ‘Big Tech’ not involved in text-to-video?
They very much are. Back in September 2022, Meta showcased a rather obviously named tool, Make-A-Video. With just a few words or lines of text, Make-A-Video creates videos using generative AI, but those videos did not have any sound. Here’s what Meta CEO Mark Zuckerberg said about it: “It’s much harder to generate video than photos because beyond correctly generating each pixel, the system also has to predict how they’ll change over time.”
Just a week later, and on cue, Google announced a similar model. Google’s generative AI model is called Imagen Video. “Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models,” is how Google described it.
Google also showcased another model called Phenaki, which is aimed at creating long-form videos on the basis of text inputs.
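To make Google's description of Imagen Video a little more concrete, here is a minimal Python sketch of what "a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models" could look like as a pipeline. It only tracks the shape of the video (frame count and resolution) as it passes through each stage, not actual pixels. The class and function names, the stage order, the scale factors and the starting size are all hypothetical choices made for illustration; they are not taken from Google's model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoShape:
    """Size of a video clip: number of frames and per-frame resolution."""
    frames: int
    height: int
    width: int


def spatial_sr(shape: VideoShape, factor: int) -> VideoShape:
    # Stand-in for a spatial super-resolution stage: more pixels per frame.
    return VideoShape(shape.frames, shape.height * factor, shape.width * factor)


def temporal_sr(shape: VideoShape, factor: int) -> VideoShape:
    # Stand-in for a temporal super-resolution stage: more frames, same resolution.
    return VideoShape(shape.frames * factor, shape.height, shape.width)


def run_cascade(base: VideoShape, stages: list) -> list:
    """Apply each (kind, factor) stage in order and record every intermediate shape."""
    shapes = [base]
    for kind, factor in stages:
        if kind == "spatial":
            shapes.append(spatial_sr(shapes[-1], factor))
        elif kind == "temporal":
            shapes.append(temporal_sr(shapes[-1], factor))
        else:
            raise ValueError(f"unknown stage kind: {kind}")
    return shapes


if __name__ == "__main__":
    # Hypothetical output size of the base text-to-video model.
    base = VideoShape(frames=16, height=24, width=40)
    # Hypothetical interleaving of temporal and spatial stages.
    stages = [("temporal", 2), ("spatial", 2), ("temporal", 2), ("spatial", 2)]
    for shape in run_cascade(base, stages):
        print(shape)
```

The point of the sketch is the division of labour: the base model only has to produce a short, small clip from the text prompt, and each later stage adds either frames or resolution, never both at once.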
What are the challenges with text-to-video AI?
Multifold. From operational to ethical, the challenges are many. Perhaps that is one of the reasons why only demos of text-to-video generative AI models have emerged so far. For starters, generating a video from text might sound ridiculously easy and equally fascinating, but imagine making a video with just words. One would have to be incredibly precise with the prompts, or the model could generate the video equivalent of gibberish.
Then come the ethical challenges. AI-generated videos could be the next weapon in the misinformation arsenal. Deepfakes could become an even bigger problem than they are today.
Given the fast-paced developments in the field of AI, it could be only a matter of time before text-to-video gets out of exploration mode and becomes mainstream.
